Let's put our newly gained knowledge to the test. Answer the following questions:
- How does an LSTM solve the vanishing gradient problem of vanilla RNNs?
- What are the different gates in an LSTM cell, and what does each one do?
- What is the use of the cell state?
- What is a GRU?
- How do bidirectional RNNs work?
- How do deep RNNs compute the hidden state?
- What are encoders and decoders in the seq2seq architecture?
- What is the use of the attention mechanism?
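Before checking your answers, it may help to see the LSTM gating equations spelled out in code. The following is a minimal NumPy sketch of a single LSTM step, not a reference to any particular library's API; the function name `lstm_step` and the (input, forget, output, candidate) gate stacking order are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b stack the four gates (i, f, o, g) row-wise."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])        # input gate: how much of the candidate to write
    f = sigmoid(z[H:2*H])      # forget gate: how much old cell state to keep
    o = sigmoid(z[2*H:3*H])    # output gate: how much cell state to expose
    g = np.tanh(z[3*H:4*H])    # candidate cell update
    c = f * c_prev + i * g     # additive cell-state update (eases gradient flow)
    h = o * np.tanh(c)         # hidden state is a gated read of the cell state
    return h, c

# Run a few steps on random inputs (sizes chosen for illustration).
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.standard_normal((4 * H, D)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):
    h, c = lstm_step(rng.standard_normal(D), h, c, W, U, b)
print(h.shape, c.shape)  # → (4,) (4,)
```

Note how the cell state `c` is updated additively, gated by `f` and `i`, rather than being squashed through a nonlinearity at every step; this is the mechanism behind the answers to the first and third questions.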