The goal of neural network training is to adjust the hidden- and output-layer parameters so that the network best predicts new data based on training samples. Backpropagation, often simply called backprop, ensures that information about the performance of the current parameter values, gleaned from evaluating the cost function on one or several samples, flows back to the parameters and facilitates optimal updates.
Backpropagation refers to the computation of the gradient of the cost function with respect to the internal parameters that we wish to update. The gradient is useful because it indicates the direction of parameter change that causes the maximal increase in the cost function. Hence, adjusting the parameters in the direction of the negative gradient should produce an optimal cost reduction for the observed samples, as we saw in Chapter 6.
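As a concrete illustration, the following is a minimal sketch, assuming NumPy, of backprop for a one-hidden-layer network with a sigmoid hidden activation and a squared-error cost on a single sample; the variable names (W1, b1, W2, b2, lr) are hypothetical placeholders rather than notation from this chapter. The forward pass evaluates the cost, the backward pass applies the chain rule layer by layer to obtain the gradient of the cost with respect to each parameter, and the final lines take one step in the direction of the negative gradient.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny network with one hidden layer; all names here are illustrative placeholders.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)   # hidden-layer parameters
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)   # output-layer parameters

x = np.array([0.5, -1.0])   # one training sample
y = np.array([1.0])         # its target value
lr = 0.1                    # learning rate (step size)

# Forward pass: compute the prediction and the cost for this sample.
h_pre = W1 @ x + b1
h = sigmoid(h_pre)
y_hat = W2 @ h + b2
cost = 0.5 * np.sum((y_hat - y) ** 2)

# Backward pass (backprop): the chain rule applied layer by layer yields
# the gradient of the cost with respect to each parameter.
d_y_hat = y_hat - y                  # dC/d(y_hat) for the squared-error cost
dW2 = np.outer(d_y_hat, h)           # dC/dW2
db2 = d_y_hat                        # dC/db2
d_h = W2.T @ d_y_hat                 # gradient flowing back to the hidden layer
d_h_pre = d_h * h * (1 - h)          # through the sigmoid nonlinearity
dW1 = np.outer(d_h_pre, x)           # dC/dW1
db1 = d_h_pre                        # dC/db1

# Gradient-descent step: move each parameter opposite to its gradient.
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
```

Repeating the forward pass after this update would show a slightly lower cost on the same sample, which is exactly the behavior the negative-gradient step is meant to produce.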