In this chapter, we've seen just how powerful a deep neural network can be at multiclass classification. We covered the softmax function in detail, and then we built and trained a network to classify handwritten digits into their 10 respective classes.
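As a quick refresher, here is a minimal NumPy sketch of the softmax function: it exponentiates each score and normalizes so the outputs form a probability distribution over the classes. The logits below are illustrative values, not outputs from the chapter's model.

```python
import numpy as np

def softmax(z):
    # Shift by the max logit for numerical stability; the result is unchanged.
    exp_z = np.exp(z - np.max(z))
    return exp_z / exp_z.sum()

logits = np.array([2.0, 1.0, 0.1])   # illustrative scores, not from the chapter's model
probs = softmax(logits)
print(probs)        # approximately [0.659 0.242 0.099]
print(probs.sum())  # 1.0
```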
Finally, when we noticed that our model was overfitting, we applied both dropout and L2 regularization to reduce the model's variance.
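For reference, the sketch below shows one way those two techniques can be combined in a Keras-style model. The layer sizes, dropout rate, and regularization strength here are illustrative assumptions, not the exact values used in the chapter.

```python
from tensorflow.keras import layers, models, regularizers

# Hypothetical architecture: 784 flattened pixel inputs, one hidden layer, 10-way softmax output.
model = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation="relu",
                 kernel_regularizer=regularizers.l2(0.001)),  # L2 penalty on the hidden weights
    layers.Dropout(0.5),                                      # randomly drop half the activations during training
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```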
By now, you've seen that deep neural networks require many choices: about architecture, learning rate, and even regularization rates. We will spend the next chapter learning how to optimize these choices.