One of the main problems with RNNs is that the gradient vanishes quickly as the number of time steps increases: during backpropagation through time, the error signal is multiplied by the recurrent weights once per time step, so contributions from distant time steps shrink toward zero. There are architectures that help alleviate this problem, and the most common one is Long Short-Term Memory (LSTM).
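Before looking at the LSTM itself, a minimal sketch makes the problem concrete. The snippet below (not from the original text) ignores the activation-function derivative and simply multiplies a gradient vector by the transpose of a recurrent weight matrix, one factor per time step; the matrix size and spectral radius are arbitrary illustrative choices.

```python
import numpy as np

# Backpropagation through time multiplies the error signal by the recurrent
# Jacobian once per time step. If that factor shrinks vectors (spectral
# radius < 1), the gradient decays exponentially with the number of steps.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
W *= 0.7 / np.max(np.abs(np.linalg.eigvals(W)))   # scale spectral radius down to 0.7

for steps in (1, 5, 10, 20, 50):
    grad = np.ones(8)
    for _ in range(steps):
        grad = W.T @ grad                 # one Jacobian factor per step back in time
    print(f"{steps:>2} steps back: gradient norm ~ {np.linalg.norm(grad):.1e}")
```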
LSTM networks are much better at capturing long-term dependencies than simple RNNs. The only unusual thing about them is the way they compute the hidden state, sketched in code below.
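As a rough sketch of what that computation looks like, the following NumPy code implements a single LSTM step under common textbook equations; the weight and bias names (W_i, b_i, and so on) are illustrative placeholders, not identifiers from any particular library.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One forward step of an LSTM cell (illustrative parameter names).

    x_t:             input at the current time step, shape (input_size,)
    h_prev, c_prev:  previous hidden state and cell state, shape (hidden_size,)
    p:               dict of weights W_* with shape (hidden_size, input_size + hidden_size)
                     and biases b_* with shape (hidden_size,)
    """
    z = np.concatenate([x_t, h_prev])          # current input joined with previous hidden state

    i = sigmoid(p["W_i"] @ z + p["b_i"])       # input gate: how much new information to let in
    f = sigmoid(p["W_f"] @ z + p["b_f"])       # forget gate: how much old memory to keep
    o = sigmoid(p["W_o"] @ z + p["b_o"])       # output gate: how much of the cell to expose
    g = np.tanh(p["W_c"] @ z + p["b_c"])       # candidate values for the cell state

    c_t = f * c_prev + i * g                   # update the cell (long-term memory)
    h_t = o * np.tanh(c_t)                     # new hidden state (short-term output)
    return h_t, c_t

# Tiny usage example with random weights
rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3
p = {f"W_{k}": rng.standard_normal((hidden_size, input_size + hidden_size)) * 0.1
     for k in "ifoc"}
p.update({f"b_{k}": np.zeros(hidden_size) for k in "ifoc"})

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.standard_normal((5, input_size)):   # unroll over five time steps
    h, c = lstm_step(x_t, h, c, p)
print(h)
```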
Essentially, an LSTM is composed of a memory cell and three gates that regulate it: an Input Gate, an Output Gate, and a Forget Gate, as shown in the following diagram: