The way we train these networks is by using backpropagation through time (BPTT). This is an exotic name for a slight variation of something you already know from Chapter 2, What is a Neural Network and How Do I Train One?. In this section, we will explore this variation in detail.
Training RNNs
Backpropagation through time
With RNNs, we have multiple copies of the same network, one for each timestep. Therefore, we need a way to backpropagate the error derivatives and calculate weight updates for each of the parameters at every timestep. The way we do this is simple. Just as before, we are following the slope of the loss function downhill, trying to find the parameter values that minimize it. The difference is that we now have multiple copies of the trainable parameters, one at each timestep, so the gradients computed for every copy are summed together to update the single set of shared weights.
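The following is a minimal sketch of this idea, assuming a vanilla RNN with a tanh hidden layer and a squared-error loss at every timestep; the weight names (W_xh, W_hh, W_hy) and the bptt function itself are illustrative rather than taken from any library. Notice how the backward pass walks the timesteps in reverse and accumulates the gradients of the shared weight matrices as it goes:

import numpy as np

def bptt(inputs, targets, W_xh, W_hh, W_hy, h0):
    # Forward pass: one copy of the network per timestep.
    hs, ys = {-1: h0}, {}
    loss = 0.0
    for t in range(len(inputs)):
        hs[t] = np.tanh(W_xh @ inputs[t] + W_hh @ hs[t - 1])  # new hidden state
        ys[t] = W_hy @ hs[t]                                  # output at timestep t
        loss += 0.5 * np.sum((ys[t] - targets[t]) ** 2)       # squared-error loss

    # Backward pass: walk the timesteps in reverse, backpropagating
    # the error derivatives through the unrolled network.
    dW_xh, dW_hh, dW_hy = (np.zeros_like(W) for W in (W_xh, W_hh, W_hy))
    dh_next = np.zeros_like(h0)                # gradient arriving from timestep t + 1
    for t in reversed(range(len(inputs))):
        dy = ys[t] - targets[t]                # dLoss/dOutput at this timestep
        dW_hy += np.outer(dy, hs[t])
        dh = W_hy.T @ dy + dh_next             # gradient flowing into the hidden state
        dh_raw = (1.0 - hs[t] ** 2) * dh       # backprop through the tanh nonlinearity
        dW_xh += np.outer(dh_raw, inputs[t])   # accumulate: the same weights appear at every timestep
        dW_hh += np.outer(dh_raw, hs[t - 1])
        dh_next = W_hh.T @ dh_raw              # hand the gradient to timestep t - 1
    return loss, dW_xh, dW_hh, dW_hy

A single gradient descent step then subtracts a small multiple of each of these summed gradients from the corresponding weight matrix, exactly as in Chapter 2; the only new ingredient is that each gradient is a sum of contributions from every timestep.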