Benchmarking and evaluation are core to the success of any deep learning exploration. We will develop some simple code to evaluate two key performance measures: the accuracy and the training time. We will use the following model template:
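The listing itself is not reproduced here. A minimal sketch consistent with the description that follows might look like the code below. The class name LinearMNIST, the attribute names fc1, relu, and fc2, the flattening step in forward, and the 10-unit output layer (one per digit class) are assumptions made for illustration, not the original code.

```python
import torch
from torch import nn


class LinearMNIST(nn.Module):  # hypothetical name, not taken from the text
    def __init__(self, width=100, num_classes=10):
        super().__init__()
        # First linear layer: 28*28 = 784 input pixels -> `width` hidden units.
        self.fc1 = nn.Linear(28 * 28, width)
        self.relu = nn.ReLU()
        # Second linear layer: hidden units -> class scores.
        # num_classes=10 is an assumption (one output per MNIST digit).
        self.fc2 = nn.Linear(width, num_classes)

    def forward(self, x):
        # Flatten each image to a 784-element vector (assumed preprocessing).
        x = x.flatten(start_dim=1)
        x = self.relu(self.fc1(x))
        return self.fc2(x)


# Quick shape check on a dummy batch of four blank 28x28 images.
model = LinearMNIST()
batch = torch.zeros(4, 1, 28, 28)
logits = model(batch)
print(logits.shape)
```

Keeping the width as a constructor argument makes it easy to compare different network widths when measuring accuracy and training time.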
This model is the most common and basic linear template for solving MNIST. Each layer is initialized in the __init__ method by assigning a PyTorch nn object to an instance attribute. Here, we initialize two linear layers and a ReLU activation. The first nn.Linear layer takes an input size of 28*28, or 784, which is the number of pixels in each training image. Its output size, which is the width of the network, is set to 100. This can be set to anything; in general, a larger width gives better performance, within the constraints of computing resources and the tendency of wider networks to overfit the training data...
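To make the computing-resource side of this trade-off concrete, the sketch below counts trainable parameters as a function of width. It assumes the second linear layer maps the hidden units to 10 outputs, one per digit class, which the text does not state; the helper names linear_params and model_params are hypothetical.

```python
def linear_params(in_features, out_features):
    # A fully connected layer has one weight per input/output pair
    # plus one bias per output.
    return in_features * out_features + out_features


def model_params(width, in_features=28 * 28, num_classes=10):
    # Two linear layers; ReLU has no trainable parameters.
    # num_classes=10 is an assumption, not taken from the text.
    return linear_params(in_features, width) + linear_params(width, num_classes)


for width in (50, 100, 200, 400):
    print(width, model_params(width))
```

With a width of 100, this gives 79,510 parameters under these assumptions, and the count grows roughly linearly as the width increases.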