Java Deep Learning Cookbook: Train neural networks for classification, NLP, and reinforcement learning using Deeplearning4j

Product type Paperback
Published in Nov 2019
Publisher Packt
ISBN-13 9781788995207
Length 304 pages
Edition 1st Edition
Author: Rahul Raj
Table of Contents (14 chapters)

Preface
1. Introduction to Deep Learning in Java
2. Data Extraction, Transformation, and Loading
3. Building Deep Neural Networks for Binary Classification
4. Building Convolutional Neural Networks
5. Implementing Natural Language Processing
6. Constructing an LSTM Network for Time Series
7. Constructing an LSTM Neural Network for Sequence Classification
8. Performing Anomaly Detection on Unsupervised Data
9. Using RL4J for Reinforcement Learning
10. Developing Applications in a Distributed Environment
11. Applying Transfer Learning to Network Models
12. Benchmarking and Neural Network Optimization
13. Other Books You May Enjoy

Combating overfitting problems

Overfitting is a major challenge that machine learning developers face. It becomes more pronounced when the neural network architecture is complex relative to the amount of training data available. While discussing overfitting, we are not ignoring underfitting: both are failures of generalization, so we treat them together in this recipe. Let's discuss how we can combat overfitting problems.

Possible reasons for overfitting include, but are not limited to, the following:

  • Too many feature variables compared to the number of data records
  • A complex neural network model

Overfitting reduces the generalization power of the network: when it happens, the network fits the noise in the training data instead of the underlying signal. In this recipe, we will walk through the key steps to prevent overfitting problems.

How to do it...

  1. Use KFoldIterator to perform k-fold cross-validation-based resampling:
     KFoldIterator kFoldIterator = new KFoldIterator(k, dataSet);
  2. Construct a simpler neural network architecture.
  3. Use enough training data to train the neural network.

How it works...

In step 1, k is the number of folds, which you choose, and dataSet is the dataset object that represents your training data. We perform k-fold cross-validation to obtain a more reliable estimate of model performance than a single train/test split provides.

Complex neural network architectures tend to memorize patterns in the training data. As a result, the network will have a hard time generalizing to unseen data. For example, it is better and more efficient to have a few hidden layers than hundreds of hidden layers. That's the relevance of step 2.

A fairly large training dataset encourages the network to learn better, and a batch-wise evaluation of test data helps to improve the generalization power of the network. That's the relevance of step 3. Although DL4J offers multiple types of data iterator and various ways to specify the batch size in an iterator, the following is one of the more conventional constructor signatures for RecordReaderDataSetIterator:

public RecordReaderDataSetIterator(RecordReader recordReader,
                                   WritableConverter converter,
                                   int batchSize,
                                   int labelIndexFrom,
                                   int labelIndexTo,
                                   int numPossibleLabels,
                                   int maxNumBatches,
                                   boolean regression)

There's more...

When you perform k-fold cross-validation, the data is divided into k subsets (folds). In each round, one fold is held out for testing and the remaining k-1 folds are used for training. We repeat this k times, so that each fold serves as the test set exactly once. Effectively, every record is used for training at some point, as opposed to permanently setting some of the data aside for testing.
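To make the fold mechanics concrete, here is a minimal sketch in plain Java, independent of DL4J, of how record indices could be split for k-fold cross-validation. The class name KFoldSketch and its methods are hypothetical and exist only for illustration. The contiguous, unshuffled split is also an assumption; in practice, DL4J's KFoldIterator takes care of the partitioning for you.

```java
/** Illustrative k-fold index splitter in plain Java (hypothetical helper, not part of DL4J). */
final class KFoldSketch {

    /** Index of the first record in the given fold; the first (n % k) folds get one extra record. */
    static int foldStart(int n, int k, int fold) {
        return fold * (n / k) + Math.min(fold, n % k);
    }

    /** Record indices held out for testing in the given 0-based fold. */
    static int[] testIndices(int n, int k, int fold) {
        int start = foldStart(n, k, fold);
        int end = foldStart(n, k, fold + 1);
        int[] out = new int[end - start];
        for (int i = start; i < end; i++) {
            out[i - start] = i;
        }
        return out;
    }

    /** Record indices used for training in the given fold: everything outside the test fold. */
    static int[] trainIndices(int n, int k, int fold) {
        int start = foldStart(n, k, fold);
        int end = foldStart(n, k, fold + 1);
        int[] out = new int[n - (end - start)];
        int pos = 0;
        for (int i = 0; i < n; i++) {
            if (i < start || i >= end) {
                out[pos++] = i;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int n = 10; // number of records (assumed for the demo)
        int k = 3;  // number of folds (assumed for the demo)
        for (int fold = 0; fold < k; fold++) {
            System.out.println("fold " + fold
                    + " test=" + java.util.Arrays.toString(testIndices(n, k, fold))
                    + " train=" + java.util.Arrays.toString(trainIndices(n, k, fold)));
        }
    }
}
```

With 10 records and k = 3, the three folds hold out 4, 3, and 3 records respectively, and every record is held out exactly once across the three rounds.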

This also helps with underfitting, since no data is permanently withheld from training. However, note that we perform the evaluation only k times.

When you perform batch training, the entire dataset is divided into batches according to the batch size. If your dataset has 1,000 records and the batch size is 8, then you have 125 training batches.
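The batch arithmetic can be sketched in a few lines of plain Java. The class BatchCountSketch and its method numBatches are hypothetical names used for illustration only. The sketch assumes that a final, partially filled batch still counts as a batch, which is why it uses ceiling division.

```java
/** Illustrative batch-count arithmetic (hypothetical helper, not part of DL4J). */
final class BatchCountSketch {

    /** Number of batches needed to cover numRecords, counting a final partial batch. */
    static int numBatches(int numRecords, int batchSize) {
        if (numRecords < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("numRecords must be >= 0 and batchSize must be > 0");
        }
        // Ceiling division using integer arithmetic only.
        return (numRecords + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        // The example from the text: 1,000 records with a batch size of 8.
        System.out.println(numBatches(1000, 8));
    }
}
```

For 1,000 records and a batch size of 8, the division is exact and yields 125 batches; with 1,001 records, the extra record would require a 126th batch.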

You need to note the training-to-testing ratio as well. According to that ratio, every batch will be divided into a training set and a testing set, and the evaluation will be performed accordingly. For 8-fold cross-validation, you evaluate the model 8 times, but for a batch size of 8 on 1,000 records, you perform 125 model evaluations.

Note the more rigorous mode of evaluation here: it helps to improve the generalization power, but it also increases the chances of underfitting.