Introducing RNNs
An RNN is a type of NN that can process sequential data with variable length. Examples of such data include text sequences or the price of a stock at various moments in time. By using the word sequential, we imply that the sequence elements are related to each other and their order matters. For example, if we take a book and randomly shuffle all the words in it, the text will lose its meaning, even though we’ll still know the individual words.
RNNs get their name because they apply the same function over a sequence recurrently. We can define an RNN as a recurrence relation:
$$s_t = f(s_{t-1}, x_t)$$
Here, $f$ is a differentiable function, $s_t$ is a vector of values called the internal RNN state (at step $t$), and $x_t$ is the network input at step $t$. Unlike regular NNs, where the state depends only on the current input (and the network weights), here $s_t$ is a function of both the current input and the previous state, $s_{t-1}$. You can think of $s_{t-1}$ as the RNN’s summary of all previous inputs. The...
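To make the recurrence concrete, here is a minimal sketch in plain Python. It assumes a scalar state and a hypothetical choice of f (the tanh of a weighted sum, with made-up weights `w_state` and `w_input`); the text only requires f to be differentiable and the state to be a vector, so this is an illustration rather than the definition of an RNN.

```python
import math


def step(prev_state, x, w_state=0.5, w_input=1.0):
    """One application of f: combine the previous state with the current input.

    The tanh form and the weight values are assumptions made for illustration.
    """
    return math.tanh(w_state * prev_state + w_input * x)


def run_rnn(inputs, initial_state=0.0):
    """Apply the same function f at every step and collect the states."""
    state = initial_state
    states = []
    for x in inputs:  # the loop works for a sequence of any length
        state = step(state, x)  # the new state depends on the old state and x
        states.append(state)
    return states


# The same function handles sequences of different lengths.
short_states = run_rnn([1.0, 0.0])
long_states = run_rnn([1.0, 0.0, -1.0, 0.5])

# Order matters: the same elements in a different order give a different final state.
reordered_states = run_rnn([0.0, 1.0])
```

Because each state is computed from the previous one, the final state carries information about every earlier input, which is why reordering the sequence changes the result.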