The Drawback of Simple RNNs
Let's revisit the concept of vanishing gradients with a simple example.
Suppose you want to generate an English poem using an RNN. You set up a simple RNN for the task, and it produces the following sentence:
"The flowers, despite it being autumn, blooms like a star".
The grammatical error is easy to spot: 'blooms' should be 'bloom', because the plural subject 'flowers' at the beginning of the sentence requires the plural form of the verb for subject-verb agreement. A simple RNN fails here because, in practice, it does not retain the dependency between 'flowers', which occurs early in the sentence, and the verb, which occurs much later. In theory, it should be able to. In practice, the gradients that would teach it this long-range dependency shrink toward zero as they are propagated back through the intervening time steps.
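To make the shrinking gradient concrete, here is a minimal sketch in plain Python. It assumes a one-unit RNN of the form h_t = tanh(w * h_{t-1} + u * x_t). The function name, the weights w and u, and the constant input sequence are illustrative assumptions and do not come from the poem example above. The sketch tracks how strongly the hidden state at step t depends on the initial state by multiplying the per-step derivatives given by the chain rule.

```python
import math


def gradient_wrt_early_state(w, u, inputs, h0=0.0):
    """Return |d h_t / d h_0| for every step t of a scalar tanh RNN.

    Hypothetical toy model: h_t = tanh(w * h_{t-1} + u * x_t).
    """
    h = h0
    grad = 1.0
    history = []
    for x in inputs:
        h = math.tanh(w * h + u * x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
        history.append(abs(grad))
    return history


# Assumed values, chosen only for illustration.
inputs = [0.5] * 20
grads = gradient_wrt_early_state(w=0.5, u=1.0, inputs=inputs)

for step in (1, 5, 10, 20):
    print(f"after {step:2d} steps: |dh_t/dh_0| = {grads[step - 1]:.3e}")
```

Because every per-step factor here has magnitude below 0.5, the product decays geometrically with the number of steps. By the time the network reaches a word twenty steps later, the signal linking it to the early word is too small to drive learning, which is the situation with 'flowers' and 'blooms' in the sentence above.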
A GRU helps to solve this issue by eliminating the 'vanishing...