Before diving into the implementation of the training loop, let's take a closer look at how we can generate batches of data.
Batches are commonly used when training neural networks because they speed up training and keep memory usage manageable. A batch is a subset of the original dataset that is run through one forward and one backward pass of the network. The forward pass is the process of multiplying the inputs by the weights of each layer in turn to obtain the final output. The backward pass, in turn, uses the loss computed from that output to update the weights of the network.
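As a rough illustration of those two passes (not the script-generation model itself), here is a minimal PyTorch sketch of one training step on a single batch; the model, loss function, optimizer, and random data are all placeholders chosen for the example:

```python
import torch
import torch.nn as nn

# Placeholder model, loss, and optimizer purely for illustration.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(64, 10)   # one batch of 64 samples
labels = torch.randn(64, 1)

optimizer.zero_grad()
outputs = model(inputs)              # forward pass: inputs through the layers
loss = criterion(outputs, labels)    # loss computed from the final output
loss.backward()                      # backward pass: gradients of the loss
optimizer.step()                     # weight update based on those gradients
```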
In this model, since we are predicting the next set of words given a set of previous words to generate the TV script, the targets are basically the next few words (how many depends on the sequence length) that follow each input sequence.
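A minimal sketch of such a batching helper is shown below; it assumes the script has already been tokenized into a list of integer word IDs, and the function name `batch_data` and its parameters are hypothetical, chosen for illustration. Each feature is a window of `sequence_length` word IDs, and its target is the same window shifted one word to the right:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

def batch_data(word_ids, sequence_length, batch_size):
    """Pair each window of word IDs with the window shifted one step right."""
    features, targets = [], []
    for i in range(len(word_ids) - sequence_length):
        features.append(word_ids[i:i + sequence_length])         # current words
        targets.append(word_ids[i + 1:i + sequence_length + 1])  # next words
    dataset = TensorDataset(torch.tensor(features), torch.tensor(targets))
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Example usage with toy IDs: each batch yields tensors of shape
# (batch_size, sequence_length) for both inputs and targets.
loader = batch_data(list(range(100)), sequence_length=10, batch_size=16)
```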