Summary
In this chapter, you learned about LSTM networks. First, we discussed what an LSTM is and its high-level architecture. Then we delved into the detailed computations that take place inside an LSTM cell and walked through them with an example.
We saw that an LSTM is composed of five main components (a minimal code sketch of how they interact follows this list):
- Cell state: The internal memory of the LSTM cell
- Hidden state: The external state used to calculate predictions
- Input gate: Determines how much of the current input is read into the cell state
- Forget gate: Determines how much of the previous cell state is carried over into the current cell state
- Output gate: Determines how much of the cell state is output to the hidden state
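To tie these components together, here is a minimal NumPy sketch of a single LSTM step. The weight and bias names (`W_i`, `b_i`, and so on) and the convention of stacking the previous hidden state and the current input are illustrative assumptions, not the exact notation used earlier in the chapter:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step over the gates described above.

    params is assumed to hold weight matrices and biases
    (W_i, W_f, W_o, W_c and b_i, b_f, b_o, b_c) acting on [h_prev; x_t].
    """
    z = np.concatenate([h_prev, x_t])                      # previous hidden state + current input

    i_t = sigmoid(params["W_i"] @ z + params["b_i"])       # input gate: how much new input to read in
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])       # forget gate: how much old cell state to keep
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])       # output gate: how much cell state to expose
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])   # candidate cell state

    c_t = f_t * c_prev + i_t * c_tilde                     # new cell state
    h_t = o_t * np.tanh(c_t)                               # new hidden state, used for predictions
    return h_t, c_t
```

This sketch only illustrates the data flow through the gates for one time step; in practice, frameworks such as TensorFlow or PyTorch provide optimized LSTM layers that apply these computations across a whole sequence.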
Having such a complex structure allows LSTMs to capture both short-term and long-term dependencies quite well.
We compared LSTMs to vanilla RNNs and saw that LSTMs are capable of learning long-term dependencies as an inherent part of their structure, whereas vanilla RNNs tend to struggle to do so.