Introduction to the Attention Mechanism and DARQN
In the previous section, we saw how adding an RNN to a DQN improved its performance. RNNs are well suited to sequential data, such as the temporal stream of frames in a game. In our case, we combined CNNs and RNNs so that our reinforcement learning agent could better understand sequences of images from the game.
However, RNN models have limitations when it comes to analyzing long input or output sequences. To overcome this limitation, researchers came up with a technique called attention, which is the key idea behind the Deep Attention Recurrent Q-Network (DARQN). The DARQN model is the same as the DRQN model, with an attention mechanism added to it. To better understand this concept, we will go through an example of its application: neural machine translation. Neural machine translation is the task of translating text from one language to another, such as translating Shakespeare's plays, which...
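Before turning to the translation example, here is a minimal sketch of what "DRQN plus attention" can look like in code. This is not a definitive implementation: it assumes PyTorch, and the module names (SoftAttention, DARQN), layer sizes, and input shape (84x84 grayscale frames) are all illustrative. It shows the core idea described above: at each timestep, attention weights computed from the CNN's feature-map regions and the LSTM's previous hidden state produce a context vector, which is fed into the LSTM before the Q-values are predicted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftAttention(nn.Module):
    """Soft attention over CNN feature-map regions, conditioned on the
    previous LSTM hidden state (in the spirit of the DARQN architecture)."""
    def __init__(self, feat_dim, hidden_dim, attn_dim=256):
        super().__init__()
        self.feat_proj = nn.Linear(feat_dim, attn_dim)
        self.hidden_proj = nn.Linear(hidden_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, feats, h_prev):
        # feats: (batch, num_regions, feat_dim); h_prev: (batch, hidden_dim)
        e = self.score(torch.tanh(
            self.feat_proj(feats) + self.hidden_proj(h_prev).unsqueeze(1)))
        alpha = F.softmax(e, dim=1)           # one weight per image region
        context = (alpha * feats).sum(dim=1)  # weighted sum -> context vector
        return context, alpha

class DARQN(nn.Module):
    def __init__(self, num_actions, feat_dim=64, hidden_dim=256):
        super().__init__()
        # A small CNN; the filter sizes here are illustrative, not the paper's.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, feat_dim, kernel_size=4, stride=2), nn.ReLU(),
        )
        self.attn = SoftAttention(feat_dim, hidden_dim)
        self.lstm = nn.LSTMCell(feat_dim, hidden_dim)
        self.q_head = nn.Linear(hidden_dim, num_actions)

    def forward(self, frame, state):
        h, c = state
        fmap = self.conv(frame)                  # (B, feat_dim, H, W)
        feats = fmap.flatten(2).transpose(1, 2)  # (B, H*W regions, feat_dim)
        context, alpha = self.attn(feats, h)     # attend over spatial regions
        h, c = self.lstm(context, (h, c))        # recurrence over timesteps
        return self.q_head(h), (h, c), alpha

# Usage sketch: process one 84x84 grayscale frame with a zero-initialized state.
model = DARQN(num_actions=4)
state = (torch.zeros(1, 256), torch.zeros(1, 256))
q_values, state, alpha = model(torch.zeros(1, 1, 84, 84), state)
```

Note that, unlike the DRQN, the LSTM here never sees the full feature map; it only receives the small context vector that the attention weights select, which is what lets the model focus on the relevant part of the screen at each step.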