You're reading from Scala Machine Learning Projects Build real-world machine learning and deep learning projects with Scala

Product type Paperback

Published in Jan 2018

Publisher Packt

ISBN-13 9781788479042

Length 470 pages

Edition 1st Edition

Languages

Scala

Tools

Apache Spark

Concepts

Deep Learning

Author (1):

Md. Rezaul Karim

Table of Contents (13) Chapters

Preface

1. Analyzing Insurance Severity Claims

2. Analyzing and Predicting Telecommunication Churn

3. High Frequency Bitcoin Price Prediction from Historical and Live Data

4. Population-Scale Clustering and Ethnicity Prediction

5. Topic Modeling - A Better Insight into Large-Scale Texts

6. Developing Model-based Movie Recommendation Engines

7. Options Trading Using Q-learning and Scala Play Framework

8. Clients Subscription Assessment for Bank Telemarketing using Deep Neural Networks

9. Fraud Analytics Using Autoencoders and Anomaly Detection

10. Human Activity Recognition using Recurrent Neural Networks

11. Image Classification using Convolutional Neural Networks

12. Other Books You May Enjoy

Hyperparameter tuning and feature selection

You can often improve accuracy by tuning hyperparameters such as the number of hidden layers, the number of neurons in each hidden layer, the number of epochs, and the activation function. The current implementation of the H2O-based deep learning model supports the following activation functions:

ExpRectifier
ExpRectifierWithDropout
Maxout
MaxoutWithDropout
Rectifier
RectifierWithDropout
Tanh
TanhWithDropout

I have tried only Tanh for this project; the other activation functions are well worth trying.
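To make the tuning idea concrete, here is a minimal plain-Scala sketch that enumerates candidate configurations over the activation functions listed above. It does not call H2O: the `HyperparameterGrid` object, the `HyperParams` case class, the candidate layer sizes and epoch counts, and the `evaluate` function are all assumptions introduced for illustration. In practice, `evaluate` would train a model with the given settings and return a validation metric.

```scala
// Plain-Scala sketch with no H2O dependency. `evaluate` is a hypothetical
// stand-in for "train a model with these settings and return its accuracy".
object HyperparameterGrid {
  // Activation function names as listed in the text.
  val activations: Seq[String] = Seq(
    "ExpRectifier", "ExpRectifierWithDropout",
    "Maxout", "MaxoutWithDropout",
    "Rectifier", "RectifierWithDropout",
    "Tanh", "TanhWithDropout")

  // One candidate configuration: hidden-layer sizes, epochs, activation.
  final case class HyperParams(hidden: Seq[Int], epochs: Int, activation: String)

  // Illustrative candidate values (assumptions, not taken from the book).
  val hiddenLayouts: Seq[Seq[Int]] = Seq(Seq(32, 32), Seq(64, 64, 64))
  val epochChoices: Seq[Int] = Seq(100, 500)

  // Cartesian product of all candidate values.
  val grid: Seq[HyperParams] =
    for {
      hidden     <- hiddenLayouts
      epochs     <- epochChoices
      activation <- activations
    } yield HyperParams(hidden, epochs, activation)

  // Returns the configuration with the highest score under `evaluate`.
  def bestConfig(evaluate: HyperParams => Double): HyperParams =
    grid.maxBy(evaluate)
}
```

An exhaustive grid grows multiplicatively with each hyperparameter, so with expensive training runs it may be more practical to sample a subset of `grid` instead of scoring every entry.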

One of the biggest advantages of H2O-based deep learning algorithms is that we can obtain the relative variable/feature importance. In previous chapters, we saw that the random forest algorithm in Spark can also compute variable importance. So, if your model does not perform well, it is worth dropping the less important features and training again.
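The drop-and-retrain step can be sketched in plain Scala as follows. This is only an illustration under stated assumptions: the importance scores are taken to be already extracted from a trained model and scaled so that the most important feature has a relative importance of 1.0, and the `FeatureSelection` object, the `Importance` case class, and the threshold value are hypothetical names and choices, not part of H2O or Spark.

```scala
// Plain-Scala sketch: selects the features to keep before retraining.
// The importance values are assumed to come from an already trained model.
object FeatureSelection {
  // A feature name paired with its relative importance (assumed to be in [0, 1]).
  final case class Importance(feature: String, relative: Double)

  // Keeps features whose relative importance is at least `threshold`,
  // ordered from most to least important.
  def selectFeatures(importances: Seq[Importance], threshold: Double): Seq[String] =
    importances
      .filter(_.relative >= threshold)
      .sortBy(-_.relative)
      .map(_.feature)
}
```

For example, with made-up scores `Seq(Importance("a", 1.0), Importance("b", 0.05), Importance("c", 0.4))` and a threshold of `0.1`, `selectFeatures` returns `Seq("a", "c")`; the model would then be retrained on those columns only. The right threshold depends on the dataset, so it is worth comparing validation metrics before and after dropping features.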

Let's see an example...