Chapter 1, Analyzing Insurance Severity Claims, shows how to develop a predictive model for analyzing insurance severity claims using some widely used regression techniques. We will demonstrate how to deploy this model in a production-ready environment.
Chapter 2, Analyzing and Predicting Telecommunication Churn, uses the Orange Telecoms Churn dataset, consisting of cleaned customer activity and churn labels specifying whether customers canceled their subscription or not, to develop a real-life predictive model.
Chapter 3, High-Frequency Bitcoin Price Prediction from Historical and Live Data, shows how to develop a real-life project that collects historical and live data. We predict the Bitcoin price for the upcoming weeks and months. In addition, we demonstrate how to generate a simple signal for online trading in Bitcoin. Finally, this chapter wraps up the whole application as a web app using the Scala Play Framework.
Chapter 4, Population-Scale Clustering and Ethnicity Prediction, uses genomic variation data from the 1000 Genomes Project to apply the K-means clustering approach to scalable genomic data analysis. The aim is to cluster genotypic variants at the population scale. Finally, we train deep neural network and random forest models to predict ethnicity.
Chapter 5, Topic Modeling in NLP – A Better Insight into Large-Scale Texts, shows how to develop a topic modeling application by utilizing the Spark-based LDA algorithm and Stanford NLP to handle large-scale raw texts.
Chapter 6, Developing Model-Based Movie Recommendation Engines, shows how to develop a scalable movie recommendation engine using singular value decomposition (SVD), alternating least squares (ALS), and matrix factorization. The MovieLens dataset will be used for this end-to-end project.
Chapter 7, Options Trading using Q-Learning and the Scala Play Framework, applies the Q-learning reinforcement learning algorithm to real-life IBM stock datasets and designs a machine learning system driven by criticisms and rewards. The goal is to develop a real-life options trading application. The chapter wraps up the whole application as a web app using the Scala Play Framework.
Chapter 8, Clients Subscription Assessment for Bank Telemarketing using Deep Neural Networks, is an end-to-end project that shows how to solve a real-life problem called client subscription assessment. An H2O deep neural network will be trained using a bank telemarketing dataset. Finally, the chapter evaluates the performance of this predictive model.
Chapter 9, Fraud Analytics using Autoencoders and Anomaly Detection, uses autoencoders and anomaly detection techniques for fraud analytics. The dataset used is a fraud detection dataset collected and analyzed during a research collaboration between Worldline and the Machine Learning Group of ULB (Université Libre de Bruxelles).
Chapter 10, Human Activity Recognition using Recurrent Neural Networks, includes another end-to-end project that shows how to use a recurrent neural network (RNN) variant called long short-term memory (LSTM) for human activity recognition using a smartphone sensor dataset.
Chapter 11, Image Classification using Convolutional Neural Networks, demonstrates how to develop predictive analytics applications, such as image classification, using convolutional neural networks on a real image dataset from Yelp.