Introduction
Plain recurrent neural networks (RNNs) are difficult to train with backpropagation through time because their gradients tend to vanish or explode over long sequences. Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs) have therefore become popular for learning from sequential data, as their gating mechanisms make them better suited to tackling these gradient problems.