Training an agent to walk using TRPO
In this section, we will learn how to train an agent to walk using Trust Region Policy Optimization (TRPO). We will use the MuJoCo environment for training. MuJoCo stands for Multi-Joint dynamics with Contact and is one of the most popular simulators for training agents on continuous control tasks.
Note that MuJoCo is a proprietary physics engine, so we need to acquire a license to use it, although a free 30-day trial is available. Installing MuJoCo requires a specific set of steps, which we will walk through in the next section.
Installing the MuJoCo environment
First, in your home directory, create a new hidden folder called .mujoco. Next, go to the MuJoCo website (https://www.roboti.us/) and download the MuJoCo version for your operating system. As shown in Figure 16.6, MuJoCo provides support for Windows, Linux, and macOS:
Figure 16.6: Different MuJoCo versions...
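The first step above, creating the hidden .mujoco folder in the home directory, can also be done from Python rather than by hand. The following is a minimal sketch using only the standard library; the helper name ensure_mujoco_dir and its optional home argument are hypothetical and introduced here purely for illustration:

```python
from pathlib import Path
from typing import Optional


def ensure_mujoco_dir(home: Optional[Path] = None) -> Path:
    """Create the hidden .mujoco folder and return its path.

    `home` is a hypothetical override used for illustration; when it is
    omitted, the current user's home directory is used.
    """
    base = Path(home) if home is not None else Path.home()
    mujoco_dir = base / ".mujoco"
    # exist_ok=True makes the call safe to repeat if the folder already exists
    mujoco_dir.mkdir(parents=True, exist_ok=True)
    return mujoco_dir
```

Calling ensure_mujoco_dir() with no arguments creates the folder under your own home directory and returns its full path, which is a convenient place to check where the downloaded MuJoCo files should later go.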