Refining the training process

Network architectures are not the only things to have improved over the years. The way that networks are trained has also evolved, making convergence faster and more reliable. In this section, we will tackle some of the shortcomings of the gradient descent algorithm we covered in Chapter 1, Computer Vision and Neural Networks, as well as some ways to avoid overfitting.

Modern network optimizers
Optimizing multidimensional functions, such as neural networks, is a complex task. The gradient descent solution we presented in the first chapter is elegant, but it suffers from several limitations that we will highlight in the following section. Thankfully, researchers have been developing...
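As a reminder before we discuss its limitations, the vanilla gradient descent update from Chapter 1 can be sketched in a few lines. The following is a minimal illustration on a toy one-dimensional loss (the function, variable names, and hyperparameter values are chosen purely for this example):

```python
# Minimal sketch of vanilla gradient descent, minimizing the toy
# quadratic loss f(w) = (w - 3)^2, whose minimum lies at w = 3.

def gradient(w):
    """Analytical gradient of f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

w = 0.0    # initial parameter value (illustrative)
lr = 0.1   # learning rate, i.e., the step size (illustrative)

for step in range(100):
    # Core update rule: move against the gradient direction.
    w -= lr * gradient(w)

print(round(w, 4))  # -> 3.0, close to the true minimum
```

Even on this trivial example, the behavior hinges entirely on the learning rate: too small and convergence crawls, too large and the updates overshoot and diverge. This sensitivity is one of the shortcomings that the modern optimizers discussed next are designed to address.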