Gradients help us update weights in the right direction and by the right amount. What happens if these gradient values become too large or too small?
The weights would not be updated correctly, the network would become unstable, and, consequently, training of the network as a whole would fail.
The problem of vanishing and exploding gradients is seen predominantly in neural networks with a large number of hidden layers. When backpropagating through such networks, the gradient at each layer is the product of many per-layer terms, so the error signal can grow or shrink at every step of the computation, leading to instability in the weight updates.
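To get a feel for how quickly this compounding takes hold, here is a back-of-the-envelope calculation in Python (a minimal sketch; the per-layer factors 0.9 and 1.1 and the depth of 50 are arbitrary illustrative choices, not values from any particular network):

```python
# Backpropagation multiplies one gradient factor per layer (chain rule).
# Even a factor only slightly different from 1 compounds dramatically with depth.
depth = 50

shrinking_factor = 0.9   # e.g. small weights or saturating activations
growing_factor = 1.1     # e.g. large weights

print(f"0.9 ** {depth} = {shrinking_factor ** depth:.6f}")  # ~0.005  -> gradient vanishes
print(f"1.1 ** {depth} = {growing_factor ** depth:.2f}")    # ~117    -> gradient explodes
```

A per-layer factor of 0.9 leaves only about half a percent of the original signal after 50 layers, while a factor of 1.1 inflates it more than a hundredfold.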
The exploding gradient problem occurs when large error gradients accumulate and cause huge updates to the weights in our network. On the other hand, when the values of these gradients are too small, they effectively prevent the weights from being updated; this is called the vanishing gradient problem. Vanishing gradients can bring training to a halt altogether, since weights that receive near-zero gradients stop changing and the network stops learning.
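Both behaviors can be reproduced in a toy deep network. The NumPy sketch below (the depth, width, and weight scales are arbitrary assumptions chosen for illustration) backpropagates an error through a stack of ReLU layers and prints the gradient norm at the input, once with small initial weights and once with large ones:

```python
import numpy as np

def input_gradient_norm(weight_scale, depth=30, width=64, seed=0):
    """Run a forward and backward pass through a stack of ReLU layers
    and return the norm of the gradient with respect to the input."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)

    # Forward pass, keeping the pre-activation sign masks for backprop.
    weights, masks = [], []
    a = x
    for _ in range(depth):
        W = weight_scale * rng.standard_normal((width, width))
        z = W @ a
        weights.append(W)
        masks.append(z > 0)          # ReLU derivative: 1 where z > 0, else 0
        a = np.maximum(z, 0.0)

    # Backward pass, starting from an arbitrary unit error at the output.
    grad = np.ones(width)
    for W, mask in zip(reversed(weights), reversed(masks)):
        grad = W.T @ (mask * grad)   # chain rule through ReLU and the weights
    return np.linalg.norm(grad)

print("small weights:", input_gradient_norm(0.05))  # shrinks toward 0   -> vanishing
print("large weights:", input_gradient_norm(0.50))  # grows enormously   -> exploding
```

The only difference between the two runs is the scale of the random weights, yet one gradient all but disappears while the other blows up, which is exactly the instability described above.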