Implementing the Deep Recurrent Q-Learning algorithm and DRQN agent
DRQN uses a recurrent neural network to learn the Q-value function, which makes it better suited than a feed-forward DQN to reinforcement learning in environments with partial observability. The recurrent layers allow the agent to integrate information from a temporal sequence of observations, so, for example, a DRQN agent can infer the velocity of moving objects in the environment without any change to its inputs (no frame stacking is required). By the end of this recipe, you will have a complete DRQN agent ready to be trained in an RL environment of your choice.
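To make the idea concrete, the following is a minimal sketch of a recurrent Q-network, not the exact architecture built in this recipe; the sequence length, observation size, and layer widths are placeholder values. It shows how an LSTM layer placed between the observation encoder and the output head lets the Q-values be conditioned on a window of past observations:

import tensorflow as tf
from tensorflow.keras import layers

def make_drqn_model(seq_len=4, obs_dim=8, num_actions=4):
    """Map a sequence of observations to one Q-value per action (illustrative sizes)."""
    obs_seq = layers.Input(shape=(seq_len, obs_dim))                   # (time, features)
    x = layers.TimeDistributed(layers.Dense(64, activation="relu"))(obs_seq)
    x = layers.LSTM(64)(x)                                             # integrates information across time steps
    q_values = layers.Dense(num_actions, activation="linear")(x)      # one Q-value per action
    return tf.keras.Model(inputs=obs_seq, outputs=q_values)

model = make_drqn_model()
model.summary()

Because the LSTM carries a hidden state across the observation sequence, the network can capture temporal cues (such as object velocity) directly from raw per-step observations.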
Getting ready
To complete this recipe, you will first need to activate the tf2rl-cookbook Conda Python virtual environment and run pip install -r requirements.txt. If the following import statements run without issues, you are ready to get started!
import tensorflow as tf
from datetime import datetime
import os
from tensorflow.keras.layers...
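As a quick optional check (not part of the original recipe), you can confirm that a TensorFlow 2.x build is active in your environment before moving on:

# Optional sanity check: confirm a TensorFlow 2.x build with eager execution enabled
print("TensorFlow version:", tf.__version__)
assert tf.__version__.startswith("2."), "This cookbook targets TensorFlow 2.x"
print("Eager execution enabled:", tf.executing_eagerly())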