Neural networks and regularization
Even though we didn't overtrain our model in the last example, it is still worth thinking about regularization strategies for neural networks. Three of the most widely used ways to regularize a neural network are as follows:
L1 and L2 regularization with weight decay as a parameter for the regularization strength
Dropout, which deactivates units within the neural network at random during training and so forces other units in the network to take over
Averaging or ensembling multiple neural networks (each with different settings)
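To make the three strategies concrete, here is a minimal NumPy sketch of each idea. It is an illustration only, not how any particular library implements them: the function names l2_penalty, dropout, and ensemble_average are hypothetical, and the dropout shown is the common "inverted" variant, which rescales the surviving units so that the expected activation stays the same.

```python
import numpy as np


def l2_penalty(weights, weight_decay):
    """L2 term added to the loss: weight_decay times the sum of squared weights."""
    return weight_decay * sum(np.sum(w ** 2) for w in weights)


def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


def ensemble_average(probabilities):
    """Average the class probabilities predicted by several networks."""
    return np.mean(probabilities, axis=0)


# Illustrative usage on made-up values (assumed shapes, not real model output).
rng = np.random.default_rng(0)
hidden = np.ones((4, 13))                      # 4 samples, 13 hidden units
dropped = dropout(hidden, rate=0.25, rng=rng)  # roughly a quarter of units become 0
penalty = l2_penalty([np.array([1.0, 2.0])], weight_decay=0.1)
averaged = ensemble_average(np.array([[0.2, 0.8], [0.4, 0.6]]))
```

Dropout is applied only during training; at prediction time all units stay active, which is why the surviving activations are rescaled here.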
Let's try dropout for this model and see if it works:
clf = Classifier(
    layers=[
        Layer("Rectifier", units=13),
        Layer("Rectifier", units=13),
        Layer("Softmax")],
    learning_rate=0.01,
    n_iter=2000,
    learning_rule='nesterov',
    regularize...