When creating a neural network model for a given machine learning task, one crucial design decision is the configuration of the network architecture. In the case of the Multilayer Perceptron, the number of nodes in the input and output layers is determined by the characteristics of the problem at hand. The remaining choices therefore concern the hidden layers: how many layers to use, and how many nodes to place in each. Some rules of thumb exist for making these decisions, but in many cases identifying the best configuration turns into a cumbersome trial-and-error process.
One way to handle the network architecture parameters is to treat them as hyperparameters of the model: they must be fixed before training begins, and they affect the outcome of training...
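As a concrete illustration, here is a minimal sketch in Keras of how the hidden-layer configuration can be exposed as a hyperparameter: the input and output sizes are fixed by the problem, while the number of hidden layers and the nodes per layer are passed in as a tunable argument. The function name build_mlp, the candidate architectures, and the input/output sizes are illustrative assumptions, not taken from the original text.

```python
from tensorflow import keras

def build_mlp(n_inputs, n_outputs, hidden_units):
    """Build an MLP whose hidden architecture is a hyperparameter.

    `hidden_units` is a list of layer widths, e.g. [64, 32] means
    two hidden layers with 64 and 32 nodes respectively. The input
    and output layer sizes are dictated by the problem itself.
    """
    model = keras.Sequential()
    model.add(keras.layers.Input(shape=(n_inputs,)))
    for units in hidden_units:
        model.add(keras.layers.Dense(units, activation="relu"))
    model.add(keras.layers.Dense(n_outputs, activation="softmax"))
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    return model

# Each candidate architecture is just a different hyperparameter setting;
# the trial-and-error search compares models built from these settings.
candidate_architectures = [[32], [64, 32], [128, 64, 32]]
models = [build_mlp(n_inputs=10, n_outputs=3, hidden_units=h)
          for h in candidate_architectures]
```

Framing the architecture this way turns the trial-and-error process into an ordinary hyperparameter search: each candidate list of layer widths is one point in the search space, to be evaluated by training the corresponding model and comparing validation performance.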