Summary
This chapter covered several techniques that are essential for training neural networks: how to update parameters, how to set initial weight values, batch normalization, and dropout. These techniques are used throughout modern, state-of-the-art deep learning. In this chapter, we learned the following:
- Four well-known methods for updating parameters: SGD, Momentum, AdaGrad, and Adam (sketched after this list).
- How to set initial weight values, which is essential if training is to proceed correctly.
- The Xavier initializer and the He initializer, which are effective choices of initial weight values (see the sketch below).
- Batch normalization accelerates training and makes training less sensitive to the initial weight values (a forward-pass sketch follows this list).
- Weight decay and dropout are regularization techniques used to reduce overfitting (dropout is sketched below).
- To search for good hyperparameters, gradually narrowing down the range in which appropriate values lie is an efficient approach (see the final sketch below).
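
The four update rules differ only in how they turn a gradient into a parameter step. Below is a minimal sketch, assuming parameters and gradients are dicts of NumPy arrays; the class names and the `update(params, grads)` interface are illustrative, not from a particular library. Adam, which roughly combines Momentum's velocity with AdaGrad-style per-parameter scaling, is omitted for brevity.

```python
import numpy as np

class SGD:
    def __init__(self, lr=0.01):
        self.lr = lr

    def update(self, params, grads):
        # Plain gradient descent: step directly down the gradient.
        for key in params:
            params[key] -= self.lr * grads[key]

class Momentum:
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr, self.momentum = lr, momentum
        self.v = None

    def update(self, params, grads):
        if self.v is None:
            self.v = {k: np.zeros_like(p) for k, p in params.items()}
        for key in params:
            # Accumulate a velocity term that smooths the descent direction.
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]

class AdaGrad:
    def __init__(self, lr=0.01):
        self.lr = lr
        self.h = None

    def update(self, params, grads):
        if self.h is None:
            self.h = {k: np.zeros_like(p) for k, p in params.items()}
        for key in params:
            # Per-parameter step decay: frequently updated parameters
            # take smaller effective steps over time.
            self.h[key] += grads[key] ** 2
            params[key] -= self.lr * grads[key] / (np.sqrt(self.h[key]) + 1e-7)
```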
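Both initializers draw weights from a Gaussian scaled by the layer's fan-in. A minimal sketch, assuming fully connected layers and fan-in-based scaling (function names are illustrative):

```python
import numpy as np

def xavier_init(n_in, n_out):
    # Xavier: standard deviation 1/sqrt(n_in); suits sigmoid/tanh activations.
    return np.random.randn(n_in, n_out) / np.sqrt(n_in)

def he_init(n_in, n_out):
    # He: standard deviation sqrt(2/n_in); suits ReLU activations.
    return np.random.randn(n_in, n_out) * np.sqrt(2.0 / n_in)
```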
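Batch normalization standardizes each mini-batch's activations, then rescales them with learned parameters. A minimal sketch of the training-time forward pass, where `gamma` and `beta` are the learned scale and shift (names illustrative; a full layer would also track running statistics for inference):

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-7):
    mu = x.mean(axis=0)                     # per-feature mini-batch mean
    var = x.var(axis=0)                     # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalize: zero mean, unit variance
    return gamma * x_hat + beta             # learned rescale and shift
```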
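Weight decay penalizes large weights by adding a term proportional to each weight to its gradient; dropout randomly disables neurons during training. A minimal dropout sketch, assuming the variant that scales activations at test time rather than inverted dropout (names are illustrative):

```python
import numpy as np

class Dropout:
    def __init__(self, dropout_ratio=0.5):
        self.dropout_ratio = dropout_ratio
        self.mask = None

    def forward(self, x, train_flg=True):
        if train_flg:
            # Randomly silence a fraction of neurons during training.
            self.mask = np.random.rand(*x.shape) > self.dropout_ratio
            return x * self.mask
        # At test time, keep all neurons but scale activations down
        # to match the expected training-time magnitude.
        return x * (1.0 - self.dropout_ratio)
```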
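One way to implement the narrowing strategy is random search on a log scale: sample widely, keep the best trials, then repeat the search over the tighter range they span. A minimal sketch; `evaluate` is a hypothetical placeholder standing in for a short training run that returns validation accuracy:

```python
import numpy as np

def sample_lr(low_exp=-6, high_exp=-2):
    # Draw the learning rate log-uniformly from 10**low_exp to 10**high_exp.
    return 10 ** np.random.uniform(low_exp, high_exp)

def evaluate(lr):
    # Placeholder: a real search would train the network briefly here
    # and return validation accuracy. This toy score peaks near lr = 1e-3.
    return -abs(np.log10(lr) + 3)

# Stage 1: coarse random search over the wide range.
trials = [(evaluate(lr), lr) for lr in (sample_lr() for _ in range(100))]
best = sorted(trials, reverse=True)[:10]

# Stage 2: narrow the range to where the best trials clustered,
# then repeat the search with longer, more careful training runs.
lo, hi = min(lr for _, lr in best), max(lr for _, lr in best)
print(f"narrowed range: {lo:.2e} .. {hi:.2e}")
```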