We invite you to experiment with the framework we just implemented, trying different hyperparameters (layer sizes, learning rate, batch size, and so on). Choosing the proper topology (as well as the other hyperparameters) can require a great deal of tweaking and testing. While the sizes of the input and output layers are determined by the use case (for example, for classification, the input size would be the number of pixel values in the images, and the output size would be the number of classes to predict from), the hidden layers should be carefully engineered.
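One way to structure such experiments is a simple grid search over a few candidate values. The sketch below is only a minimal illustration, not part of the framework itself: the `simple_network` module, the `SimpleNetwork` class, its `fit()`/`evaluate()` methods, and `load_mnist()` are hypothetical stand-ins for whichever interface your own implementation exposes.

```python
import itertools

# Hypothetical stand-ins for the framework built in this chapter;
# adapt the names to your own class and method signatures.
from simple_network import SimpleNetwork, load_mnist  # assumed module

(x_train, y_train), (x_test, y_test) = load_mnist()

hidden_sizes = [(32,), (64, 32), (128, 64)]  # candidate hidden topologies
learning_rates = [0.01, 0.001]
batch_sizes = [32, 128]

best_accuracy, best_config = 0.0, None
for hidden, lr, batch in itertools.product(hidden_sizes,
                                           learning_rates, batch_sizes):
    network = SimpleNetwork(num_inputs=x_train.shape[1],
                            num_outputs=10,  # e.g., 10 digit classes
                            hidden_layers_sizes=hidden)
    network.fit(x_train, y_train, learning_rate=lr,
                batch_size=batch, num_epochs=20)
    accuracy = network.evaluate(x_test, y_test)
    if accuracy > best_accuracy:
        best_accuracy, best_config = accuracy, (hidden, lr, batch)

print(f"Best accuracy: {best_accuracy:.2%} "
      f"with (hidden, lr, batch) = {best_config}")
```

An exhaustive grid grows quickly with the number of hyperparameters, so in practice you would keep each candidate list short or sample configurations randomly instead.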
For instance, if the network has too few layers, or its layers are too small, the accuracy may stagnate. This means the network is underfitting; that is, it does not have enough parameters for the complexity of the task. In that case, the only solution is to adopt a new, larger architecture that is better suited to the application.
On the other hand, if the network is too complex, or the training dataset is too small relative to the number of parameters, the network may start overfitting: it learns parameters so closely tuned to the training samples that it fails to generalize to new data, with training accuracy continuing to climb while validation accuracy stalls or degrades.
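A common way to tell these two failure modes apart is to track the accuracy on the training set and on a held-out validation set side by side: low accuracy on both suggests underfitting, while high training accuracy paired with much lower validation accuracy suggests overfitting. Here is a minimal sketch, again assuming the hypothetical `SimpleNetwork` interface from above and an assumed validation split `x_val`, `y_val`:

```python
# Assumes SimpleNetwork, x_train, y_train from the previous sketch,
# plus a held-out validation split (x_val, y_val).
network = SimpleNetwork(num_inputs=x_train.shape[1], num_outputs=10,
                        hidden_layers_sizes=(64, 32))

for epoch in range(50):
    # Train one epoch at a time so we can log both accuracies per epoch.
    network.fit(x_train, y_train, learning_rate=0.01,
                batch_size=32, num_epochs=1)
    train_acc = network.evaluate(x_train, y_train)
    val_acc = network.evaluate(x_val, y_val)
    print(f"epoch {epoch:2d}: train={train_acc:.2%}  val={val_acc:.2%}")
    # Both curves plateauing at a low value      -> likely underfitting.
    # train_acc high but val_acc lagging behind  -> likely overfitting.
```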