Models
This example uses two DQN architectures: a simple feed-forward network with three layers, and a network that uses a 1D convolution as a feature extractor, followed by two fully connected layers that output Q-values. Both use the dueling architecture described in Chapter 8, together with double DQN and two-step Bellman unrolling. The rest of the training process is the same as in a classical DQN (Chapter 6). Both models are defined in Chapter10/lib/models.py and are very simple. Let’s start with the feed-forward model:
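As a brief aside before the listing: the three techniques named above can be illustrated with plain Python arithmetic. The following is only a minimal sketch, not the code from Chapter10/lib; the function names dueling_q and two_step_double_dqn_target and all numbers are hypothetical, chosen for illustration. It assumes the usual formulations: the dueling head combines a state value with mean-centered advantages, and the double DQN target selects the action with the online network but evaluates it with the target network, here after two unrolled steps.

```python
from typing import List, Sequence


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q-values (hypothetical helper).

    Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


def two_step_double_dqn_target(
    r0: float,
    r1: float,
    gamma: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    done: bool,
) -> float:
    """Bellman target for a transition unrolled over two steps (hypothetical helper).

    y = r0 + gamma * r1 + gamma**2 * Q_target(s2, argmax_a Q_online(s2, a))
    """
    reward = r0 + gamma * r1
    if done:
        # Episode ended within the two steps: nothing to bootstrap from.
        return reward
    # Double DQN: the online network chooses the action...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...and the target network supplies that action's value.
    return reward + gamma ** 2 * next_q_target[best_action]


# Advantages here already have zero mean, so Q = V + A.
print(dueling_q(1.0, [0.5, -0.5, 0.0]))  # [1.5, 0.5, 1.0]

# Online net prefers action 1; the target net values that action at 4.0.
print(two_step_double_dqn_target(
    1.0, 2.0, 0.5, [0.0, 3.0, 1.0], [10.0, 4.0, 8.0], False
))  # 3.0
```

In the second call, a plain (non-double) DQN target would instead take the maximum of the target network's values, 10.0, giving 2.0 + 0.25 * 10.0 = 4.5; the double DQN variant yields 3.0. With that in mind, the feed-forward model itself looks like this: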
class SimpleFFDQN(nn.Module):
    def __init__(self, obs_len: int, actions_n: int):
        super(SimpleFFDQN, self).__init__()
        self.fc_val = nn.Sequential(
            nn.Linear(obs_len, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 1)
        )
        self.fc_adv = nn.Sequential(
            nn.Linear(obs_len, 512),
            ...