Training an artificial neural network
Now that we have seen an NN in action and have gained a basic understanding of how it works by reading through the code, let’s dig a little deeper into some of the concepts, such as the loss computation and the backpropagation algorithm, which we implemented to learn the model parameters.
Computing the loss function
As mentioned previously, we used an MSE loss (as in Adaline) to train the multilayer NN because it makes the derivation of the gradients a bit easier to follow. In later chapters, we will discuss other loss functions, such as the multi-category cross-entropy loss (a generalization of the binary logistic regression loss), which is a more common choice for training NN classifiers.
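To make the loss computation concrete, here is a minimal NumPy sketch of an MSE loss over one-hot encoded targets. The helper names (`int_to_onehot`, `mse_loss`) and the example probabilities are illustrative, not taken from the implementation in the previous section:

```python
import numpy as np

def int_to_onehot(y, num_labels):
    # Convert integer class labels into a (n_examples x num_labels) one-hot matrix
    ary = np.zeros((y.shape[0], num_labels))
    for i, label in enumerate(y):
        ary[i, label] = 1.0
    return ary

def mse_loss(targets_onehot, probas):
    # Mean squared error between one-hot targets and predicted class probabilities
    return np.mean((targets_onehot - probas) ** 2)

# Three examples, three classes; rows of `probas` are hypothetical network outputs
y = np.array([0, 2, 1])
probas = np.array([[0.90, 0.05, 0.05],
                   [0.10, 0.20, 0.70],
                   [0.20, 0.60, 0.20]])

targets_onehot = int_to_onehot(y, num_labels=3)
loss = mse_loss(targets_onehot, probas)
```

Note that the mean is taken over all elements of the output matrix, so the loss value is comparable across mini-batches of different sizes.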
In the previous section, we implemented an MLP for multiclass classification that returns an output vector of t elements, which we need to compare to the t×1-dimensional target vector in its one-hot encoded representation. If we predict the...