- TORCS is a continuous control problem. DQN works only with discrete action spaces, so it cannot be used in TORCS (a short sketch contrasting the two output heads follows this list).
- The initialization shown is just one strategy; you can also use a random uniform initialization with specified minimum and maximum values, or sample from a Gaussian with zero mean and a specified sigma. The interested reader should try these different initializers and compare the agent's performance (see the initializer sketch after this list).
- The abs() function is used in the reward function because we penalize lateral drift from the center equally on either side (left or right). The first term is the longitudinal speed, so no abs() is required there (a reward sketch follows this list).
- The Gaussian noise added to the actions for exploration can be tapered down with episode count, which can result in smoother driving (see the noise-tapering sketch after this list). Of course, there are many other tricks you can try!
- DDPG is off-policy: the actor and critic are trained on mini-batches of transitions sampled from a replay buffer, and those transitions were generated by earlier, noise-perturbed versions of the policy rather than by the current deterministic policy (a minimal replay-buffer sketch closes this section).
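
On the first point, the difference comes down to the network's output head. The sketch below (hypothetical layer sizes, assuming the networks are built with tf.keras) contrasts a DQN head, which emits one Q-value per discrete action and acts via argmax, with a DDPG actor head, which emits the continuous action directly:

```python
import tensorflow as tf

NUM_DISCRETE_ACTIONS = 5  # DQN needs a finite action set

# DQN head: one Q-value per discrete action; the greedy policy is argmax over these.
dqn_q_head = tf.keras.layers.Dense(NUM_DISCRETE_ACTIONS)

# DDPG actor head: outputs the continuous action directly,
# e.g. a steering command squashed into [-1, 1] by tanh.
ddpg_actor_head = tf.keras.layers.Dense(1, activation="tanh")
```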
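On the initializers, here is a sketch of the alternatives mentioned above, again assuming tf.keras. The range and sigma are placeholder values, not the ones used in the chapter; swap the initializer, retrain, and compare the agents:

```python
import tensorflow as tf

# Placeholder range/sigma values; the ones used in the chapter may differ.
uniform_init  = tf.keras.initializers.RandomUniform(minval=-3e-3, maxval=3e-3)
gaussian_init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=1e-2)
glorot_init   = tf.keras.initializers.GlorotUniform()

def actor_output_layer(initializer):
    """Final actor layer; only the weight initializer changes between runs."""
    return tf.keras.layers.Dense(1, activation="tanh",
                                 kernel_initializer=initializer)
```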
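On the reward, the following is one common TORCS-style formulation (the variable names are illustrative and the exact terms in the chapter's reward may differ); it shows where abs() is and is not applied:

```python
import numpy as np

def torcs_style_reward(vx, angle, track_pos):
    """Illustrative reward; the chapter's exact terms may differ.

    vx        -- longitudinal speed of the car
    angle     -- angle between the car's heading and the track axis (radians)
    track_pos -- normalized lateral offset from the track centre (0 = centre)
    """
    longitudinal = vx * np.cos(angle)           # progress along the track: no abs()
    lateral_drift = np.abs(vx * np.sin(angle))  # sideways motion penalized either way
    off_centre = vx * np.abs(track_pos)         # drift from centre, left or right
    return longitudinal - lateral_drift - off_centre
```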
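On tapering the exploration noise, a simple linear schedule over episodes is sketched below; the start/end sigmas and episode count are placeholders to tune for your setup:

```python
import numpy as np

SIGMA_START, SIGMA_END = 0.30, 0.05  # placeholder noise scales
TOTAL_EPISODES = 1000                # placeholder training length

def exploration_sigma(episode):
    """Linearly taper the noise scale from SIGMA_START down to SIGMA_END."""
    frac = min(episode / TOTAL_EPISODES, 1.0)
    return SIGMA_START + frac * (SIGMA_END - SIGMA_START)

def noisy_action(actor_action, episode, low=-1.0, high=1.0):
    """Add episode-dependent Gaussian noise and clip to the valid action range."""
    sigma = exploration_sigma(episode)
    noise = np.random.normal(0.0, sigma, size=np.shape(actor_action))
    return np.clip(actor_action + noise, low, high)
```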
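On DDPG being off-policy, a minimal replay-buffer sketch makes the point concrete: the transitions it stores were produced by earlier, noise-perturbed policies, yet the current actor and critic are trained on mini-batches sampled from it.

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Stores transitions generated by older, noise-perturbed policies."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        # Mini-batches drawn here train the current actor/critic even though the
        # behaviour policy that produced them is long gone: off-policy learning.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
        return states, actions, rewards, next_states, dones
```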