The basic RNN cell
Traditional multilayer perceptron networks assume that all inputs are independent of one another. This assumption does not hold for many kinds of sequence data: words in a sentence, musical notes in a composition, stock prices over time, or even molecules in a compound are all sequences in which an element depends on the elements that came before it.
RNN cells incorporate this dependence by maintaining a hidden state, or memory, that holds the essence of what has been seen so far. The value of the hidden state at any point in time is a function of its value at the previous time step and of the input at the current time step, that is:

$$h_t = \phi(h_{t-1}, x_t)$$
Here, $h_t$ and $h_{t-1}$ are the values of the hidden state at times $t$ and $t-1$ respectively, $x_t$ is the value of the input at time $t$, and $\phi$ is the function the cell computes at each step. Notice that the equation is recursive: $h_{t-1}$ can in turn be expressed in terms of $h_{t-2}$ and $x_{t-1}$, and so on, back to the beginning of the sequence. This is how the hidden state comes to summarize the entire input seen so far.
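To make the recurrence concrete, the following is a minimal NumPy sketch of a basic RNN cell unrolled over a toy sequence. It assumes one common instantiation of $\phi$, namely $h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$; the function name `rnn_step`, the weight names, and the toy dimensions are illustrative assumptions, not taken from the text above.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step of a basic RNN cell: h_t = tanh(W_xh @ x_t + W_hh @ h_prev + b_h).

    This is one common choice for the function phi in h_t = phi(h_{t-1}, x_t);
    the names and shapes here are illustrative, not from the text.
    """
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Hypothetical toy dimensions: 4-dimensional inputs, 3-dimensional hidden state.
input_dim, hidden_dim, seq_len = 4, 3, 5
rng = np.random.default_rng(0)

W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden-to-hidden weights
b_h = np.zeros(hidden_dim)                              # hidden bias

h = np.zeros(hidden_dim)  # hidden state starts empty at the beginning of the sequence
for t, x_t in enumerate(rng.normal(size=(seq_len, input_dim))):
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)  # h now summarizes inputs x_0 .. x_t
    print(f"t={t}, h={h}")
```

Because each call to `rnn_step` feeds the previous hidden state back in, the value of `h` after step $t$ depends, through the recursion, on every input from the start of the sequence up to $x_t$.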