Let's simulate the CartPole environment by following these steps:
- To run the CartPole environment, first look up its name in the table of environments at https://github.com/openai/gym/wiki/Table-of-environments. The name is 'CartPole-v0'. The table also shows that the observation space is a 4-dimensional array and that there are two possible actions: pushing the cart to the left or to the right.
- We import the Gym library and create an instance of the CartPole environment:
>>> import gym
>>> env = gym.make('CartPole-v0')
- Reset the environment:
>>> env.reset()
array([-0.00153354, 0.01961605, -0.03912845, -0.01850426])
As you can see, this also returns the initial state represented by an array of four floats.
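To make the four floats easier to read, they can be given names. The following is a minimal sketch, assuming the usual CartPole ordering of cart position, cart velocity, pole angle, and pole angular velocity; the `CartPoleState` type and its field names are labels chosen for this illustration and are not part of Gym.

```python
from collections import namedtuple

# Field names are descriptive labels chosen for this sketch, not Gym identifiers.
# Assumed ordering: cart position, cart velocity, pole angle, pole angular velocity.
CartPoleState = namedtuple(
    "CartPoleState",
    ["cart_position", "cart_velocity", "pole_angle", "pole_angular_velocity"],
)

# Initial state copied from the env.reset() output shown above.
initial = CartPoleState(-0.00153354, 0.01961605, -0.03912845, -0.01850426)

# The pole angle is measured in radians from the vertical.
print(initial.pole_angle)
```

With a live environment, the same wrapper could be applied as `CartPoleState(*env.reset())`, assuming an older Gym release in which `reset()` returns only the observation, as in the output above.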
- Render the environment:
>>> env.render()
True
You will see a small window popping up, as follows:
- Now, let's make a while loop and let the agent perform as many random actions as it can:
>>> is_done = False
>>> while not is_done:
...     action = env.action_space.sample()
...     new_state, reward, is_done, info = env.step(action)
...     print(new_state)
...     env.render()
...
[-0.00114122 -0.17492355 -0.03949854 0.26158095]
True
[-0.00463969 -0.36946006 -0.03426692 0.54154857]
True
……
……
[-0.11973207 -0.41075106 0.19355244 1.11780626]
True
[-0.12794709 -0.21862176 0.21590856 0.89154351]
True
While the loop runs, you will see the cart and pole moving in the window. When the episode ends, they both stop. The window then looks like the following:
The episode only lasts several steps because the left or right actions are chosen randomly. Can we record the whole process so we can replay it afterward? We can do so with just two lines of code in Gym, as shown in Step 7. If you are using a Mac or Linux system, you need to complete Step 6 first; otherwise, you can jump to Step 7.
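The printed states also hint at why the episode stopped. The sketch below assumes the termination limits used in Gym's CartPole implementation, namely a cart position beyond 2.4 in either direction or a pole angle beyond 12 degrees (about 0.2094 radians) from vertical; CartPole-v0 additionally caps an episode at 200 steps. The `termination_reason` helper is hypothetical and written only for this illustration.

```python
import math

# Limits assumed from Gym's CartPole implementation.
X_THRESHOLD = 2.4                          # cart position limit
THETA_THRESHOLD = 12 * 2 * math.pi / 360   # 12 degrees in radians, about 0.2094


def termination_reason(state):
    """Return why a CartPole state ends the episode, or None if it does not."""
    cart_position, _, pole_angle, _ = state
    if abs(cart_position) > X_THRESHOLD:
        return "cart out of bounds"
    if abs(pole_angle) > THETA_THRESHOLD:
        return "pole tilted too far"
    return None


# The last two states printed by the loop above.
previous_state = [-0.11973207, -0.41075106, 0.19355244, 1.11780626]
last_state = [-0.12794709, -0.21862176, 0.21590856, 0.89154351]

print(termination_reason(previous_state))  # None
print(termination_reason(last_state))      # pole tilted too far
```

Under these assumed limits, the final pole angle of about 0.216 radians is just past the threshold, which is consistent with `is_done` becoming true on that step.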
- To record video, we need to install the ffmpeg package. For Mac, it can be installed via the following command:
brew install ffmpeg
For Linux, the following command should do it:
sudo apt-get install ffmpeg
- After creating the CartPole instance, add these two lines:
>>> video_dir = './cartpole_video/'
>>> env = gym.wrappers.Monitor(env, video_dir)
This will record what is displayed in the window and store it in the specified directory.
Now rerun the code from Step 3 to Step 5. After the episode terminates, an .mp4 file is created in the video_dir folder. The video is quite short; it may last only a second or so.
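Because Steps 3 to 5 are rerun several times, it can be convenient to wrap them in a function. The following is one possible sketch, not the only way to structure it: `run_random_episode` is a hypothetical helper that works with any object exposing `reset()`, `step()`, `render()`, and `action_space.sample()` in the four-value style used above. The `StubEnv` and `StubActionSpace` classes are stand-ins invented for this example so that it runs without Gym or a display.

```python
import random


def run_random_episode(env, render=False):
    """Run one episode with random actions; return (steps, total_reward)."""
    env.reset()
    if render:
        env.render()
    steps, total_reward, is_done = 0, 0.0, False
    while not is_done:
        action = env.action_space.sample()
        _, reward, is_done, _ = env.step(action)
        steps += 1
        total_reward += reward
        if render:
            env.render()
    return steps, total_reward


class StubActionSpace:
    """Hypothetical action space with two actions, 0 and 1."""

    def sample(self):
        return random.randint(0, 1)


class StubEnv:
    """Hypothetical stand-in: ends after max_steps, with a reward of 1 per step."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.action_space = StubActionSpace()
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0, 0.0, 0.0, 0.0]

    def step(self, action):
        self.t += 1
        return [0.0, 0.0, 0.0, 0.0], 1.0, self.t >= self.max_steps, {}


steps, total = run_random_episode(StubEnv(max_steps=5))
print(steps, total)  # 5 5.0
```

With the real environment, the call would be `run_random_episode(env, render=True)`, where `env` is the CartPole instance, optionally wrapped in the Monitor from Step 7. This assumes a Gym release in which `step()` returns four values, as in the loop above.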