Index
A
- activation function / A logistic regressor
- active learning
- about / Using less data – active learning
- labeling budgets, using / Using labeling budgets efficiently
- leverage machines, for human labeling / Leveraging machines for human labeling
- pseudo labeling, for unlabeled data / Pseudo labeling for unlabeled data
- generative models, using / Using generative models
- activity_regularizer / Regularization in Keras
- Adam (adaptive moment estimation) optimizer / The Adam optimizer
- advantage actor-critic (A2C) model
- about / Advantage actor-critic models
- pendulum, balancing / Learning to balance
- trading / Learning to trade
- aggregate global feature statistics / Aggregate global feature statistics
- Amazon Mechanical Turk (MTurk) / Fine-tuning the NER
- Anaconda
- reference / Running notebooks locally
- anti-discrimination law
- disparate impact / Legal perspectives
- approaches, beyond image classification
- about / Computer vision beyond classification
- facial recognition / Facial recognition
- bounding box prediction / Bounding box prediction
- asynchronous advantage actor-critic (A3C) / Advantage actor-critic models
- attention mechanism
- about / Attention
- auto-sklearn
- reference / Learning how to learn
- autocorrelation / Autocorrelation
- autoencoders
- about / Understanding autoencoders
- for MNIST / Autoencoder for MNIST
- for credit cards / Autoencoder for credit cards
- Automatic Differentiation Variational Inference (AVI) / From probabilistic programming to deep probabilistic programming
- autoregression / ARIMA
- Autoregressive Integrated Moving Average (ARIMA)
- about / ARIMA
- Auto-WEKA
- reference / Learning how to learn
- AWS deep learning AMI
- reference / Using the AWS deep learning AMI
- using / Using the AWS deep learning AMI
B
- backtesting
- about / A note on backtesting
- biases / A note on backtesting
- bag-of-words classification / Bag-of-words
- batchnorm
- about / Batchnorm
- Bayesian deep learning / Bayesian deep learning
- Bayesian inference
- about / An intuitive guide to Bayesian inference
- flat prior / Flat prior
- <50% prior / <50% prior
- prior / Prior and posterior
- posterior / Prior and posterior
- Markov Chain Monte Carlo / Markov Chain Monte Carlo
- stochastic volatility example / Metropolis-Hastings MCMC
- probabilistic programming, migrating to deep probabilistic programming / From probabilistic programming to deep probabilistic programming
- behavioral economics / Understanding the brain through RL
- Bellman equation
- about / Markov processes and the Bellman equation – A more formal introduction to RL, The Bellman equation in economics
- biases, backtesting
- look-ahead bias / A note on backtesting
- survivorship bias / A note on backtesting
- psychological tolerance bias / A note on backtesting
- overfitting / A note on backtesting
- bias_regularizer / Regularization in Keras
- bounding box prediction
- about / Bounding box prediction
- YOLO approach / Bounding box prediction
- building blocks, ConvNets in Keras
- Conv2D / Conv2D
- padding / Padding
- input shape / Input shape
- MaxPooling2D / MaxPooling2D
- flatten operation / Flatten
- dense layers / Dense
C
- catastrophes / Catastrophes are caused by multiple failures
- Catch
- about / Catch – a quick guide to reinforcement learning
- playing / Training to play Catch
- categorical data / Preparing the data for the Keras library
- causal learning
- about / Causal learning
- causal models, obtaining / Obtaining causal models
- instrument variables / Instrument variables
- nonlinear causal models / Non-linear causal models
- complex system failure
- unfairness approach / Unfairness as complex system failure
- complex systems
- disadvantages / Complex systems are intrinsically hazardous systems
- executing, in degraded mode / Complex systems run in degraded mode
- computers / Our journey in this book
- confusion matrix
- used, for evaluating heuristic model / Evaluating with a confusion matrix
- Conv1D / Conv1D
- Conv2D
- about / Conv2D
- kernel size / Kernel size
- stride size / Stride size
- padding / Padding
- input shape / Input shape
- ReLU activation / ReLU activation
- ConvNet
- building blocks, in Keras / The building blocks of ConvNets in Keras
- training, on MNIST / Training MNIST
- MNIST model / The model
- MNIST dataset, loading / Loading the data
- MNIST dataset, compiling / Compiling and training
- MNIST dataset, training / Compiling and training
- ConvNets
- about / Convolutional Neural Networks
- filters, on MNIST / Filters on MNIST
- second filter, adding / Adding a second filter
- convolve operation
- using / Examining the sample time series
- count vector / Bag-of-words
- covariance stationarity
- about / Different kinds of stationarity
- CUDA
- reference / Installing TensorFlow
- Cython documentation
- reference / Speeding up your code with Cython
D
- data
- preparing / Preparing the data
- data, Seq2Seq models
- about / The data
- characters, encoding / Encoding characters
- data debugging
- about / Debugging data
- task eligibility, checking / How to find out whether your data is up to the task
- rules / How to find out whether your data is up to the task
- enough data situations / What to do if you don't have enough data
- unit testing / Unit testing data
- privacy, maintaining / Keeping data private and complying with regulations
- best practices / Keeping data private and complying with regulations
- preparation, for training / Preparing the data for training
- inputs, comparing to predictions / Understanding which inputs led to which predictions
- data preparation
- characters, sanitizing / Sanitizing characters
- lemmatization / Lemmatization
- target, preparing / Preparing the target
- train, preparing / Preparing the training and test sets
- test set, preparing / Preparing the training and test sets
- dataset / The data
- Dataset API
- reference / Optimizing your pipeline
- data trap / The feature engineering approach
- deeper network
- creating / A deeper network
- deep learning
- shortcoming / Learning how to learn
- deep neural networks / All models are wrong
- deployment
- about / Deployment
- product launch / Launching fast
- metrics, monitoring / Understanding and monitoring metrics
- data origin / Understanding where your data comes from
- dilated and causal convolution / Dilated and causal convolution
- discrete Fourier transform (DFT) / Fast Fourier transformations
- disparate sample size / Sources of unfairness in machine learning
- dropout
- about / Dropout
- dummy variable / One-hot encoding
E
- end-to-end (E2E) modeling
- about / E2E modeling
- end-to-end models / Heuristic, feature-based, and E2E models
- entity embeddings
- about / Entity embeddings
- categories, tokenizing / Tokenizing categories
- input models, creating / Creating input models
- model, training / Training the model
- evolutionary strategies (ES) / Evolutionary strategies and genetic algorithms
F
- 2010 Flash Crash use case / VAEs for time series
- fair models
- developing, checklist / A checklist for developing fair models, Is the data biased?
- false negatives (FN) / Observational fairness
- false positives (FP) / Observational fairness
- Fast Fourier transformations / Fast Fourier transformations
- feature-based models / Heuristic, feature-based, and E2E models
- feature engineering approach
- about / The feature engineering approach
- fraudsters / A feature from intuition – fraudsters don't sleep
- fraudulent transfer destination / Expert insight – transfer, then cash out
- fraudulent cash outs / Expert insight – transfer, then cash out
- balance errors / Statistical quirks – errors in balances
- feature scaling, ways
- standardization / Preparing the data for training
- Min-Max rescaling / Preparing the data for training
- mean normalization / Preparing the data for training
- unit length scaling, applying / Preparing the data for training
- filters
- applying, on color images / Filters on color images
- forecasting, with neural nets
- about / Forecasting with neural networks
- data preparation / Data preparation
- data preparation, weekdays / Weekdays
- forward pass / A forward pass
- four-fifths rule / Legal perspectives
- fraud detection
- SGAN, using / SGANs for fraud detection
- frontiers, RL
- about / Frontiers of RL, Understanding the brain through RL
- multi-agent RL / Multi-agent RL
- function approximators / Approximating functions
- functions
- approximating / Approximating functions
G
- GANs
- about / GANs
- training process / GANs
- MNIST GAN / A MNIST GAN
- latent vectors / Understanding GAN latent vectors
- training tricks / GAN training tricks
- General Data Protection Regulation (GDPR) / Keeping data private and complying with regulations
- generative models
- using / Using generative models
- genetic algorithms / Evolutionary strategies and genetic algorithms
- global features
- about / Visualization and preparation in pandas
- issues / Aggregate global feature statistics
- Global Vectors (GloVe) / Loading pretrained word vectors
- Google Cloud AutoML
- reference / Learning how to learn
- graphics processing units (GPUs) / Using the right hardware for your problem, Setting up your workspace
- Gym
- reference / Learning to balance
H
- H2O AutoML
- reference / Learning how to learn
- heuristic model
- about / Heuristic, feature-based, and E2E models, The heuristic approach
- used, for making predictions / Making predictions using the heuristic model
- F1 score / The F1 score
- evaluating, with confusion matrix / Evaluating with a confusion matrix
- hyperparameters / Gradient descent
- hyperas
- used, for searching hyperparameter / Hyperparameter search with Hyperas
- reference / Hyperparameter search with Hyperas
- installation adjustments / Hyperparameter search with Hyperas
- Hyperopt
- reference / Learning how to learn
- hyperparameter
- searching, with hyperas / Hyperparameter search with Hyperas
I
- image datasets
- working with / Working with big image datasets
- instrumental variables two-stage least squares (IV2SLS) / Instrument variables
- integrated / ARIMA
J
- joblib
- reference / Flat prior
K
- Kaggle
- reference / Using Kaggle kernels, The data, An introductory guide to spaCy
- Kaggle Kernel demoing marbles
- reference / Unit testing data
- Kalman filters / Kalman filters
- Keras
- about / A brief introduction to Keras
- importing / Importing Keras
- two-layer model / A two-layer model in Keras
- and TensorFlow / Keras and TensorFlow
- used, for creating predictive models / Creating predictive models with Keras
- building blocks, ConvNets / The building blocks of ConvNets in Keras
- documentation, reference / Augmentation with ImageDataGenerator
- Keras functional API
- about / A quick tour of the Keras functional API
- Keras library
- data, preparing / Preparing the data for the Keras library
- nominal data / Preparing the data for the Keras library
- ordinal data / Preparing the data for the Keras library
- numerical data / Preparing the data for the Keras library
- one-hot encoding / One-hot encoding
- entity embeddings / Entity embeddings
- kernel_regularizer / Regularization in Keras
- Kullback-Leibler (KL) divergence / Visualizing latent spaces with t-SNE
L
- Latent Dirichlet Allocation (LDA) / Topic modeling
- latent spaces
- visualizing, with t-SNE / Visualizing latent spaces with t-SNE
- learning rate / Parameter updates
- linear step / A logistic regressor
- Local Interpretable Model-Agnostic Explanations (LIME) / Understanding which inputs led to which predictions
- logistic regressor
- about / A logistic regressor
- Python version / Python version of our logistic regressor
- LSTM
- about / LSTM
- carry / The carry
M
- machine learning / What is machine learning?
- marbles
- reference / Unit testing data
- Markov Chains
- reference / Markov processes and the Bellman equation – A more formal introduction to RL
- Markov processes / Markov processes and the Bellman equation – A more formal introduction to RL
- matrix multiplication (matmul) / Tensors and the computational graph
- mean absolute percentage error (MAPE) / Establishing a training and testing regime
- mean stationarity
- about / Different kinds of stationarity
- median forecasting / Median forecasting
- ML
- unfairness, sources / Sources of unfairness in machine learning
- ML software stack
- about / The machine learning software stack
- Keras / The machine learning software stack
- NumPy / The machine learning software stack
- Pandas / The machine learning software stack
- Scikit-learn / The machine learning software stack
- Matplotlib / The machine learning software stack
- Jupyter / The machine learning software stack
- MNIST
- filters / Filters on MNIST
- MNIST Autoencoder VAE
- reference / Autoencoder for MNIST
- model debugging
- about / Debugging your model
- hyperas, used for searching hyperparameter / Hyperparameter search with Hyperas
- learning rate, searching / Efficient learning rate search
- learning rate, scheduling / Learning rate scheduling
- TensorBoard, used for training monitoring / Monitoring training with TensorBoard
- vanishing gradient problem / Exploding and vanishing gradients
- exploding gradient problem / Exploding and vanishing gradients
- model loss
- measuring / Measuring model loss
- gradient descent / Gradient descent
- backpropagation / Backpropagation
- parameter updates / Parameter updates
- 1-layer neural network, training / Putting it all together
- model parameters
- optimizing / Optimizing model parameters
- models
- training, for maintaining fairness measures / Training to be fair
- interpreting, for ensuring fairness / Interpreting models to ensure fairness
- inspecting, for unfairness / Unfairness as complex system failure, Complex systems run in degraded mode, Accident-free operation requires experience with failure
- modularity trade-off / The modularity tradeoff
- momentum / Momentum
- moving average / ARIMA
N
- named entity recognition (NER)
- about / Named entity recognition
- fine-tuning / Fine-tuning the NER
- neural nets
- used, for forecasting / Forecasting with neural networks
- uncertainty / Bayesian deep learning
- neural networks (NNs) / Approximating functions
- No-U-Turn Sampler (NUTS) / Metropolis-Hastings MCMC
- nonlinear causal models / Non-linear causal models
- nonlinear step / A logistic regressor
- notebook, executing
- TensorFlow, installing / Installing TensorFlow
- Keras, installing / Installing Keras
O
- observational fairness
- about / Observational fairness
- off-the-shelf AutoML solutions
- TPOT / Learning how to learn
- auto-sklearn / Learning how to learn
- Auto-WEKA / Learning how to learn
- H2O AutoML / Learning how to learn
- Google Cloud AutoML / Learning how to learn
- one-hot encoding / One-hot encoding
- overfitting / Creating a test set
P
- pandas
- visualization / Visualization and preparation in pandas
- preparation / Visualization and preparation in pandas
- reference / Named entity recognition
- part-of-speech (POS) tagging
- about / Part-of-speech (POS) tagging
- performance tips, machine learning applications
- about / Performance tips
- right hardware, using / Using the right hardware for your problem
- distributed training, using with TF estimators / Making use of distributed training with TF estimators
- CuDNNLSTM, using / Using optimized layers such as CuDNNLSTM
- pipeline, optimizing / Optimizing your pipeline
- Cython, used for speeding up code / Speeding up your code with Cython
- cache frequent requests / Caching frequent requests
- predictive models, creating
- training data, oversampling / Oversampling the training data
- predictive models, creating with Keras
- about / Creating predictive models with Keras
- target, extracting / Extracting the target
- test set, creating / Creating a test set
- building / Building the model
- simple baseline, creating / Creating a simple baseline
- complex models, building / Building more complex models
- preexisting social biases / Sources of unfairness in machine learning
- pretrained models
- working with / Working with pretrained models
- VGG-16, modifying / Modifying VGG-16
- random image augmentation / Random image augmentation
- random image augmentation, with ImageDataGenerator / Augmentation with ImageDataGenerator
- pretrained word vectors
- loading / Loading pretrained word vectors
- principal component analysis (PCA) / Understanding autoencoders
- Python
- regex module, using / Using Python's regex module
Q
- Q-function / Catch – a quick guide to reinforcement learning
- Q-learning
- about / Catch – a quick guide to reinforcement learning
- used, for conversion of RL into supervised learning / Q-learning turns RL into supervised learning
- exploration / Training to play Catch
- Q-learning model
- defining / Defining the Q-learning model
- qualitative rationale / The feature engineering approach
R
- recurrent dropout / Recurrent dropout
- recurrent neural networks (RNN) / Simple RNN
- regex module
- using, in Python / Using Python's regex module
- using, in pandas / Regex in pandas
- using / When to use regexes and when not to
- Region-based Convolutional Neural Network (R-CNN) / Bounding box prediction
- regular expressions
- about / Regular expressions
- regularization
- about / Regularization
- L2 regularization / L2 regularization
- L1 regularization / L1 regularization
- in Keras / Regularization in Keras
- reinforcement learning
- about / Reinforcement learning
- effectiveness of data / The unreasonable effectiveness of data
- machine learning models / All models are wrong
- conversion, to supervised learning with Q-learning / Q-learning turns RL into supervised learning
- reinforcement learning (RL)
- about / Catch – a quick guide to reinforcement learning, Markov processes and the Bellman equation – A more formal introduction to RL
- frontiers / Frontiers of RL
- reward functions, designing
- manual reward shaping / Careful, manual reward shaping
- inverse reinforcement learning (IRL) / Inverse reinforcement learning
- human preferences, learning / Learning from human preferences
- robust RL, creating / Robust RL
- RL engineering
- best practices / Designing good reward functions
- reward functions, designing / Designing good reward functions
- rule-based matching
- about / Rule-based matching
- custom functions, adding to matchers / Adding custom functions to matchers
- matchers, adding to pipeline / Adding the matcher to the pipeline
- combining, with learning based systems / Combining rule-based and learning-based systems
S
- sampling biases / Sources of unfairness in machine learning
- semi-supervised generative adversarial network (SGAN)
- about / Using generative models
- used, for fraud detection / SGANs for fraud detection
- reference / SGANs for fraud detection
- semi-supervised learning / Using less data – active learning
- Seq2Seq models
- about / Seq2seq models
- architecture overview / Seq2seq architecture overview
- data / The data
- inference models, creating / Creating inference models
- translations, creating / Making translations, Exercises
- SHAP (SHapley Additive exPlanation) / Interpreting models to ensure fairness
- simple model / What to do if you don't have enough data
- simple RNN / Simple RNN
- spaCy
- about / An introductory guide to spaCy, Document similarity with word embeddings
- Doc instance / An introductory guide to spaCy
- Vocab class / An introductory guide to spaCy
- Spearmint
- reference / Learning how to learn
- stationarity
- types / Different kinds of stationarity
- significance / Why stationarity matters
- stationarity issues
- avoiding / When to ignore stationarity issues
- Stochastic Gradient Descent (SGD) / Compiling the model
- stochastic volatility
- reference / Metropolis-Hastings MCMC
- supervised learning / Supervised learning, Using less data – active learning
- Synthetic Minority Over-sampling Technique (SMOTE) / Oversampling the training data
- systematic error / Sources of unfairness in machine learning
T
- t-SNE algorithm
- used, for visualizing latent spaces / Visualizing latent spaces with t-SNE
- Tatoeba project / The data
- tensors
- and computational graph / Tensors and the computational graph
- Term Frequency-Inverse Document Frequency (TF-IDF) / TF-IDF
- test set / Creating a test set
- text classification task
- about / A text classification task
- time series
- examining / Examining the sample time series
- time series models
- using, with word vectors / Time series models with word vectors
- time series stationary
- making / Making a time series stationary
- topic modeling / Topic modeling
- TPOT
- reference / Learning how to learn
- training and testing regime
- establishing / Establishing a training and testing regime
- transfer learning / What to do if you don't have enough data
- tree-based methods
- about / A brief primer on tree-based methods
- decision tree / A simple decision tree
- random forest / A random forest
- XGBoost / XGBoost
- Tree of Parzen Estimators (TPE) algorithm / Hyperparameter search with Hyperas
- true negatives (TN) / Observational fairness
- true payoff probability (TPP) / An intuitive guide to Bayesian inference
- true positives (TP) / Observational fairness
- two-layer model, Keras
- about / A two-layer model in Keras
- layer, stacking / Stacking layers
- model, compiling / Compiling the model
- model, training / Training the model
U
- unsupervised learning / Unsupervised learning, Using less data – active learning
V
- vanishing gradient problem / Monitoring training with TensorBoard
- variance stationarity
- about / Different kinds of stationarity
- variational autoencoders (VAEs)
- about / Variational autoencoders
- MNIST example / MNIST example
- Lambda layer, using / Using the Lambda layer
- Kullback-Leibler divergence / Kullback–Leibler divergence
- custom loss, creating / Creating a custom loss
- using, for data generation / Using a VAE to generate data
- used, for end-to-end fraud detection / VAEs for an end-to-end fraud detection system
- using, in time series / VAEs for time series
W
- word embeddings
- about / Word embeddings
- document similarity / Document similarity with word embeddings
- word vectors
- used, for preprocessing / Preprocessing for training with word vectors
- workspace
- setting up / Setting up your workspace, Using Kaggle kernels
- notebooks, local execution / Running notebooks locally
X
- XGBoost (Extreme Gradient Boosting)
- reference / A brief primer on tree-based methods, XGBoost
- about / XGBoost
Y
- You Only Look Once (YOLO) / Bounding box prediction