Preventing overfitting in neural networks
A neural network is powerful because, with the right architecture (the right number of hidden layers and hidden nodes), it can derive hierarchical features from data. It offers a great deal of flexibility and can fit complex datasets. However, this advantage becomes a weakness if we do not have enough control over the learning process. Specifically, the network may overfit: it becomes very good at fitting the training set but fails to generalize to unseen data. Hence, preventing overfitting is essential to the success of a neural network model.
There are three main ways to impose restrictions on our neural networks: L1/L2 regularization, dropout, and early stopping. We practiced the first method in Chapter 4, Predicting Online Ad Click-Through with Logistic Regression, and will discuss the other two in this section.
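As a quick reminder of the first method, L1/L2 regularization adds a penalty on the magnitude of the weights to the loss function, discouraging overly large weights. The following is a minimal sketch, assuming TensorFlow/Keras is available; the layer size and penalty strength are illustrative values, not recommendations:

import tensorflow as tf

# A hidden layer whose weights contribute an L2 penalty term to the loss;
# the coefficient 0.01 is an illustrative value
regularized_layer = tf.keras.layers.Dense(
    64,
    activation='relu',
    kernel_regularizer=tf.keras.regularizers.l2(0.01)
)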
Dropout
Dropout means ignoring a certain set of hidden nodes during the learning phase...
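To make this concrete, here is a minimal sketch of a feed-forward classifier with dropout, assuming TensorFlow/Keras; the architecture and the dropout rate of 0.5 are illustrative assumptions, not the book's exact model:

import tensorflow as tf

# A simple feed-forward classifier with a dropout layer after each hidden layer;
# during training, each hidden node's output is dropped with probability 0.5,
# while at inference time all nodes are kept
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

Because a different random subset of nodes is ignored in each training iteration, no single node can rely too heavily on any other, which reduces co-adaptation and helps the network generalize.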