Validating Hyperparameters
A neural network has many hyperparameters in addition to parameters such as weights and biases. Hyperparameters include the number of neurons in each layer, the batch size, the learning rate used to update the parameters, and the weight decay coefficient. Unlike weights and biases, they are not learned during training and must be set beforehand. Inappropriate hyperparameter values degrade the model's performance. Choosing good values is therefore important, but it usually requires a lot of trial and error. This section describes how to search for hyperparameter values as efficiently as possible.
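To make the distinction between hyperparameters and parameters concrete, here is a minimal sketch using NumPy. The names hyperparams, init_params, and sgd_step are hypothetical, and the values (50 hidden neurons, a learning rate of 0.01, and so on) are illustrative assumptions rather than recommended settings.

```python
import numpy as np

# Hyperparameters: fixed by hand before training (illustrative values only).
hyperparams = {
    "hidden_size": 50,      # number of neurons in the hidden layer
    "batch_size": 100,      # samples per mini-batch (unused in this sketch)
    "learning_rate": 0.01,  # step size for parameter updates
    "weight_decay": 1e-4,   # strength of the L2 penalty on the weights
}


def init_params(input_size, output_size, hidden_size, seed=0):
    """Create the learnable parameters (weights and biases) of a two-layer net."""
    rng = np.random.default_rng(seed)
    return {
        "W1": 0.01 * rng.standard_normal((input_size, hidden_size)),
        "b1": np.zeros(hidden_size),
        "W2": 0.01 * rng.standard_normal((hidden_size, output_size)),
        "b2": np.zeros(output_size),
    }


def sgd_step(params, grads, learning_rate, weight_decay):
    """One SGD update in place; weight decay is applied to weights, not biases."""
    for key in params:
        decay = weight_decay * params[key] if key.startswith("W") else 0.0
        params[key] -= learning_rate * (grads[key] + decay)


# The hidden_size hyperparameter determines the shapes of the parameters.
params = init_params(784, 10, hyperparams["hidden_size"])
```

The hyperparameters shape the network and control the update rule, while the arrays in params are what training actually changes.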
Validation Data
In the dataset we've used so far, the training data and test data are separate. The training data is used to train the network, while the test data is used to evaluate generalization performance. Comparing the results on the two lets you determine whether the network fits only the training data too closely (that is, whether overfitting occurs) and how well it generalizes to unseen data.
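The comparison between training and test performance can be sketched as follows. The helper names accuracy and generalization_gap are hypothetical, and the label arrays are made-up toy values rather than results from a real network.

```python
import numpy as np


def accuracy(predicted_labels, true_labels):
    """Fraction of samples whose predicted label matches the true label."""
    predicted_labels = np.asarray(predicted_labels)
    true_labels = np.asarray(true_labels)
    return float(np.mean(predicted_labels == true_labels))


def generalization_gap(train_acc, test_acc):
    """Training accuracy minus test accuracy; a large gap suggests overfitting."""
    return train_acc - test_acc


# Toy labels (assumed for illustration): the model reproduces every training label...
train_true = [0, 1, 2, 1, 0, 2, 1, 0]
train_pred = [0, 1, 2, 1, 0, 2, 1, 0]

# ...but gets several test labels wrong.
test_true = [0, 2, 2, 1, 1, 2, 0, 0]
test_pred = [0, 1, 2, 1, 0, 2, 1, 0]

train_acc = accuracy(train_pred, train_true)
test_acc = accuracy(test_pred, test_true)
gap = generalization_gap(train_acc, test_acc)

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
print(f"gap:            {gap:.3f}")
```

A training accuracy near 1.0 combined with a much lower test accuracy is the typical signature of overfitting. How large a gap counts as a problem depends on the task.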
We will use various...