Back propagation, which is short for the backward propagation of errors, is an algorithm for supervised learning of neural networks using gradient descent. This calculates what is known as the gradient of the error function, with respect to the network's weights. It is a generalized form of the delta rule for perceptrons all the way to multi-layer feed-forward neural networks.
Unlike forward propagation, back-prop calculates the gradients by moving backwards through the network. The gradient of the final layer of weights is calculated first, and the gradient of the first layer is hence calculated last. With the recent popularity in deep learning for image and speech recognition, back-prop has once again taken the spotlight. It is, for all intents and purposes, an efficient algorithm, and today's version utilizes GPUs to further improve performance.
Lastly, because the computations for back-prop are dependent upon the activations and outputs from the forward phase (non-error term for all layers, including hidden), all of these values must be computed prior to the backwards phase beginning. It is therefore a requirement that the forward phase precede the backward phase for every iteration of gradient descent.