This chapter discussed different implementations of neural networks, namely wide, deep, and sparse implementations. After reading it, you should appreciate how these designs differ and how those differences may affect performance or training time. You should also recognize the simplicity of these architectures and how they offer new alternatives to what we have discussed so far. In this chapter, you also learned to optimize the hyperparameters of your models, for example, the dropout rates, with the aim of maximizing the generalization ability of the network.
I am sure you noticed that these models achieved accuracies beyond random chance, that is, above 50%. However, the problem we discussed is very difficult to solve, so it may not surprise you that a general neural architecture, like the ones we studied here, does not perform extraordinarily well. To achieve better performance, we can use a more specialized type of architecture...