Training a DDPG agent to learn to drive
Most of the DDPG code is the same as we saw earlier in Chapter 5, Deep Deterministic Policy Gradients (DDPG); only the differences will be summarized here.
Coding ddpg.py
Our state dimension for TORCS is 29 and the action dimension is 3; these are set in ddpg.py as follows:
state_dim = 29
action_dim = 3
action_bound = 1.0
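These constants define the sizes of the state and action vectors and the valid range of each action component. As a hedged sketch (not the book's exact code; the noise scale and helper name are illustrative), one common way DDPG uses action_bound is to clip exploration-noised actions back into range:

```python
import numpy as np

state_dim = 29      # length of the TORCS sensor vector, per the text
action_dim = 3      # e.g. steering, acceleration, brake
action_bound = 1.0  # each action component lies in [-action_bound, action_bound]

rng = np.random.default_rng(42)

def clip_action(raw_action, noise):
    """Add exploration noise, then clip to the valid action range."""
    return np.clip(raw_action + noise, -action_bound, action_bound)

# Example: a deterministic action perturbed by Gaussian exploration noise.
a = clip_action(np.array([0.9, -0.5, 0.2]),
                rng.normal(0.0, 0.3, size=action_dim))
```

The clipping step guarantees that whatever noise process is used for exploration, the action passed to the simulator stays within the bounds it accepts.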
Coding AandC.py
The actor and critic file, AandC.py, also needs to be modified. In particular, the create_actor_network function in the ActorNetwork class is edited to use two hidden layers with 400 and 300 neurons, respectively...
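As a framework-agnostic sketch of that architecture (the book's version uses TensorFlow; the layer sizes 400 and 300, the 29-dim state, and the tanh output scaled by the action bound follow the text, while the weight initialization and activation choices here are illustrative assumptions), the actor's forward pass can be written in NumPy as:

```python
import numpy as np

STATE_DIM, ACTION_DIM, ACTION_BOUND = 29, 3, 1.0

rng = np.random.default_rng(0)

# Two hidden layers of 400 and 300 units, as described in the text.
W1 = rng.standard_normal((STATE_DIM, 400)) * 0.01
b1 = np.zeros(400)
W2 = rng.standard_normal((400, 300)) * 0.01
b2 = np.zeros(300)
W3 = rng.standard_normal((300, ACTION_DIM)) * 0.001
b3 = np.zeros(ACTION_DIM)

def actor_forward(state):
    """Map a 29-dim TORCS state to a 3-dim action in [-1, 1]."""
    h1 = np.maximum(0.0, state @ W1 + b1)          # ReLU hidden layer, 400 units
    h2 = np.maximum(0.0, h1 @ W2 + b2)             # ReLU hidden layer, 300 units
    # tanh squashes each component to (-1, 1), then scale by the action bound
    return np.tanh(h2 @ W3 + b3) * ACTION_BOUND

action = actor_forward(rng.standard_normal(STATE_DIM))
```

The final tanh-and-scale step is what ties the network output to the action_bound set in ddpg.py, ensuring every emitted action is already in the simulator's valid range.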