You're reading from Machine Learning for Algorithmic Trading Predictive models to extract signals from market and alternative data for systematic trading strategies with Python

Product type Paperback

Published in Jul 2020

Publisher Packt

ISBN-13 9781839217715

Length 820 pages

Edition 2nd Edition

Languages

Python

Tools

TensorFlow

Concepts

Machine Learning

Author (1):

Stefan Jansen

Table of Contents (27) Chapters

Preface

1. Machine Learning for Trading – From Idea to Execution

2. Market and Fundamental Data – Sources and Techniques

3. Alternative Data for Finance – Categories and Use Cases

4. Financial Feature Engineering – How to Research Alpha Factors

5. Portfolio Optimization and Performance Evaluation

6. The Machine Learning Process

7. Linear Models – From Risk Factors to Return Forecasts

8. The ML4T Workflow – From Model to Strategy Backtesting

9. Time-Series Models for Volatility Forecasts and Statistical Arbitrage

10. Bayesian ML – Dynamic Sharpe Ratios and Pairs Trading

11. Random Forests – A Long-Short Strategy for Japanese Stocks

12. Boosting Your Trading Strategy

13. Data-Driven Risk Factors and Asset Allocation with Unsupervised Learning

14. Text Data for Trading – Sentiment Analysis

15. Topic Modeling – Summarizing Financial News

16. Word Embeddings for Earnings Calls and SEC Filings

17. Deep Learning for Trading

18. CNNs for Financial Time Series and Satellite Images

19. RNNs for Multivariate Time Series and Sentiment Analysis

20. Autoencoders for Conditional Risk Factors and Asset Pricing

21. Generative Adversarial Networks for Synthetic Time-Series Data

22. Deep Reinforcement Learning – Building a Trading Agent

23. Conclusions and Next Steps

24. References

25. Index

Appendix: Alpha Factor Library

Q-learning – finding an optimal policy on the go

Q-learning was an early RL breakthrough, developed by Chris Watkins in his 1989 PhD thesis (http://www.cs.rhul.ac.uk/~chrisw/new_thesis.pdf). It introduces incremental dynamic programming to learn to control an MDP without knowing or modeling the transition and reward matrices that we used for value and policy iteration in the previous section. A convergence proof followed three years later (Watkins and Dayan 1992).

Q-learning directly optimizes the action-value function q to approximate q*. The learning proceeds "off-policy," that is, the algorithm does not need to select actions based on the policy implied by the value function alone. However, convergence requires that all state-action pairs continue to be updated throughout the training process. A straightforward way to ensure this is through an ε-greedy policy.
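To make the off-policy update and the role of exploration concrete, the following is a minimal sketch of tabular Q-learning, not the book's implementation. The five-state corridor environment, the `step` helper, and all hyperparameter values (`alpha`, `gamma`, `epsilon`, the number of episodes) are assumptions chosen purely for illustration. The agent behaves according to an ε-greedy policy, while the update target bootstraps from the best action in the next state.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, epsilon, rng):
    """Explore uniformly with probability epsilon; otherwise act greedily on q."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties at random so untrained states are not biased toward action 0.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=42):
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Behavior policy: epsilon-greedy keeps all state-action pairs updated.
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action,
            # regardless of which action the behavior policy takes next.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy implied by the learned action values for the non-terminal states.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this toy setting, the learned greedy policy should move right in every non-terminal state. Note that the agent never uses the transition or reward model directly; it only observes sampled outcomes of `step`.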