Implementation of DQN
This chapter shows how to implement each component of the deep Q-learning algorithm (the Q-network, the replay memory, the trainer, and the Q-learning optimizer) in Python with TensorFlow.
We will implement the QNetwork class for the Q-network that we discussed in the previous chapter. It is defined as follows:
    import tensorflow as tf  # tf.placeholder is part of the TensorFlow 1.x API

    class QNetwork:

        def __init__(self, input_shape=(84, 84, 4), n_outputs=4,
                     network_type='cnn', scope='q_network'):
            # input_shape is given as (width, height, channel)
            self.width = input_shape[0]
            self.height = input_shape[1]
            self.channel = input_shape[2]
            self.n_outputs = n_outputs
            self.network_type = network_type
            self.scope = scope

            # Frame images, fed channel-first: (batch, channel, width, height)
            self.x = tf.placeholder(dtype=tf.float32,
                                    shape=(None, self.channel, self.width, self.height))
            # Estimates of Q-value, one scalar per sample in the batch
            self.y = tf.placeholder(dtype=tf.float32, shape=(None,))
            ...
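Note that input_shape is given as (width, height, channel), while the placeholder self.x expects the channel axis first. The following is a minimal sketch, in plain Python without TensorFlow, of how the two placeholder shapes follow from input_shape and how a batch can be checked against them. The helper names placeholder_shapes, nested_shape, and matches are hypothetical and are not part of the QNetwork class.

```python
def placeholder_shapes(input_shape=(84, 84, 4)):
    """Return the (x, y) placeholder shapes implied by input_shape.

    Hypothetical helper: mirrors the shape arithmetic in QNetwork.__init__.
    """
    width, height, channel = input_shape
    x_shape = (None, channel, width, height)  # None stands for the batch size
    y_shape = (None,)                         # one Q-value estimate per sample
    return x_shape, y_shape


def nested_shape(data):
    """Shape of a rectangular, non-empty nested list (assumed well-formed)."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0]
    return tuple(shape)


def matches(placeholder_shape, data_shape):
    """True if data_shape fits placeholder_shape; None matches any size."""
    return (len(placeholder_shape) == len(data_shape)
            and all(p is None or p == d
                    for p, d in zip(placeholder_shape, data_shape)))


# A dummy batch of 2 states, each made of 4 stacked 84x84 frames.
batch = [[[[0.0] * 84 for _ in range(84)] for _ in range(4)] for _ in range(2)]

x_shape, y_shape = placeholder_shapes()
print(x_shape)                                 # (None, 4, 84, 84)
print(matches(x_shape, nested_shape(batch)))   # True
print(matches(x_shape, (2, 84, 84, 4)))        # False: channel-last layout
```

A batch stored channel-last, with shape (batch, 84, 84, 4), would therefore have to be transposed before being fed to self.x.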