We will now train the preceding DDPG code on Pendulum-v0. To train the DDPG agent, run the following command from the directory that contains the rest of the code:
python ddpg.py
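This starts the training. The script first prints the hyperparameters and then, after the TensorFlow startup messages, the reward and Qmax value for each episode: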
{'actor_lr': 0.0001,
'buffer_size': 1000000,
'critic_lr': 0.001,
'env': 'Pendulum-v0',
'gamma': 0.99,
'max_episode_len': 1000,
'max_episodes': 250,
'minibatch_size': 64,
'mode': 'train',
'random_seed': 258,
'render_env': False,
'tau': 0.001}
.
.
.
2019-03-03 17:23:10.529725: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 384.130.0
| Episode: 0 | Reward: -7981 | Qmax: -6.4859
| Episode: 1 | Reward: -7466 | Qmax: -10.1758
| Episode: 2 | Reward...
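The per-episode lines in this output can be parsed to track how the reward evolves during training. The following is a minimal sketch that assumes the log format shown above; the names parse_episode_line and parse_training_log are hypothetical helpers and are not part of ddpg.py:

```python
import re

# A signed number, optionally with a fractional part and an exponent.
NUMBER = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"

# Matches lines such as "| Episode: 0 | Reward: -7981 | Qmax: -6.4859".
EPISODE_LINE = re.compile(
    r"\|\s*Episode:\s*(\d+)\s*"
    r"\|\s*Reward:\s*(" + NUMBER + r")\s*"
    r"\|\s*Qmax:\s*(" + NUMBER + r")"
)


def parse_episode_line(line):
    """Return (episode, reward, qmax) for a per-episode log line, else None."""
    match = EPISODE_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


def parse_training_log(text):
    """Collect the parsed tuples from every matching line in the log text."""
    parsed = (parse_episode_line(line) for line in text.splitlines())
    return [row for row in parsed if row is not None]


if __name__ == "__main__":
    # Sample lines copied from the training output shown above.
    sample = (
        "| Episode: 0 | Reward: -7981 | Qmax: -6.4859\n"
        "| Episode: 1 | Reward: -7466 | Qmax: -10.1758\n"
    )
    for episode, reward, qmax in parse_training_log(sample):
        print(episode, reward, qmax)
```

Lines that do not match the pattern, such as the hyperparameter dictionary or the TensorFlow messages, are skipped. To apply this to a full run, the output could first be saved to a file, for example with python ddpg.py > train.log (the file name is only an illustration), and the file contents passed to parse_training_log.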