This experiment uses a version of the double-pole balancing problem that assumes full knowledge of the current system state, including the angular velocities of both poles and the velocity of the cart. The success criterion is to keep both poles balanced for 100,000 steps, or approximately 33 minutes of simulated time. A pole is considered balanced as long as it stays within a fixed angular limit of vertical, and the cart must also remain within a fixed distance of the track's center.
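The success criterion above can be sketched as a small evaluation loop. This is a minimal illustration, not the chapter's implementation: the names `State`, `is_balanced`, and `count_balanced_steps` are hypothetical, the physics update and the controller are supplied by the caller, and the angle and cart-position limits are left as parameters because their numeric values are not stated here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class State:
    """Full system state (hypothetical layout): positions and velocities."""
    x: float           # cart offset from the track's center, meters
    x_dot: float       # cart velocity
    theta1: float      # first pole angle from vertical, radians
    theta1_dot: float  # first pole angular velocity
    theta2: float      # second pole angle from vertical, radians
    theta2_dot: float  # second pole angular velocity


def is_balanced(state: State, max_angle_rad: float, max_cart_offset_m: float) -> bool:
    """True while the cart and both poles are inside the allowed limits."""
    return (
        abs(state.x) <= max_cart_offset_m
        and abs(state.theta1) <= max_angle_rad
        and abs(state.theta2) <= max_angle_rad
    )


def count_balanced_steps(
    step_fn: Callable[[State, float], State],  # assumed physics update
    controller: Callable[[State], float],      # assumed control policy
    state: State,
    max_angle_rad: float,
    max_cart_offset_m: float,
    max_steps: int = 100_000,
) -> int:
    """Run the simulation until a limit is violated or max_steps is reached."""
    steps = 0
    while steps < max_steps and is_balanced(state, max_angle_rad, max_cart_offset_m):
        action = controller(state)       # controller sees the full state
        state = step_fn(state, action)   # advance the simulation by one step
        steps += 1
    return steps
```

Under these assumptions, a controller counts as successful when `count_balanced_steps(...)` returns `max_steps`. The returned step count could also serve as a raw fitness signal, although the fitness function actually used in the experiment may differ.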
Double-pole balancing experiment
Hyperparameter selection
Compared to the previous experiment described in this chapter, double-pole balancing is much harder to solve due to its complex motion dynamics. Thus, the search space for a successful control...