Dropout
Another important technique, typically applied after a pooling layer but also applicable to fully connected layers, is to randomly and periodically "drop" some neurons together with their input and output connections. In a dropout layer we specify a probability p with which neurons "drop out" stochastically: during each training pass, each neuron is dropped from the network with probability p and kept with probability (1-p). This ensures that no neuron ends up relying too heavily on other neurons, and that each neuron "learns" something useful for the network. Dropout has two advantages: it speeds up training, since a smaller network is trained at each pass, and it helps prevent over-fitting (see N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting", Journal of Machine Learning Research 15 (2014), 1929-1958, http://www.jmlr.org/papers/volume15/srivastava14a.old...
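The per-pass masking described above can be sketched in a few lines of NumPy. This is an illustrative implementation (the function name and parameters are our own) using the common "inverted dropout" variant, in which kept activations are scaled by 1/(1-p) during training so that the expected activation at test time needs no adjustment:

```python
import numpy as np

def dropout_forward(x, p, training=True, rng=None):
    """Apply inverted dropout to activations x.

    Each unit is zeroed with probability p during training; surviving
    units are scaled by 1/(1-p) so the expected activation is unchanged.
    At test time (training=False) the input passes through untouched.
    """
    if not training or p == 0.0:
        return x, None
    rng = np.random.default_rng() if rng is None else rng
    keep = 1.0 - p
    # Mask entries are 0 (dropped) or 1/keep (kept and rescaled).
    mask = (rng.random(x.shape) < keep) / keep
    return x * mask, mask

# Example: with p = 0.5, roughly half the activations are zeroed,
# and the survivors are doubled to preserve the expected value.
x = np.ones((4, 8))
out, mask = dropout_forward(x, p=0.5)
```

At inference time no units are dropped; the inverted scaling during training is what makes this consistent with the averaging interpretation given in the Srivastava et al. paper.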