Data preparation
Careful readers may notice that a suffix, v0, follows each game name, and may ask: What does v0 mean? Can it be replaced with v1 or v2? In fact, this suffix is related to the data preprocessing step applied to the screen images (observations) extracted from the Atari environment.
There are three modes for each game, for example, Breakout, BreakoutDeterministic, and BreakoutNoFrameskip, and each mode has two versions, for example, Breakout-v0 and Breakout-v4. The main difference between the three modes is the value of the frameskip parameter in the Atari environment. This parameter specifies the number of frames (steps) over which a single action is repeated. This is called the frame-skipping technique, which allows us to play more games without significantly increasing the runtime.
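To make the frame-skipping idea concrete, here is a minimal sketch of a wrapper that repeats one action over several frames and sums the rewards. It is not the Atari environment's own implementation: CountingEnv and FrameSkip are hypothetical names introduced only for illustration, and the sketch assumes that frameskip is either a fixed integer or a (low, high) pair with both bounds inclusive, which the real environment may handle differently.

```python
import random


class CountingEnv:
    """Hypothetical stand-in for an emulator: each step advances one frame."""

    def __init__(self, episode_length=100):
        self.episode_length = episode_length
        self.frame = 0

    def reset(self):
        self.frame = 0
        return self.frame

    def step(self, action):
        # The observation is simply the frame counter; the reward is 1 per frame.
        self.frame += 1
        done = self.frame >= self.episode_length
        return self.frame, 1.0, done, {}


class FrameSkip:
    """Repeat each action for `frameskip` frames and accumulate the reward."""

    def __init__(self, env, frameskip, seed=None):
        self.env = env
        self.frameskip = frameskip  # an int, or a (low, high) pair
        self.rng = random.Random(seed)

    def reset(self):
        return self.env.reset()

    def _num_repeats(self):
        if isinstance(self.frameskip, int):
            return self.frameskip
        low, high = self.frameskip
        # Assumption: both bounds are inclusive in this sketch.
        return self.rng.randint(low, high)

    def step(self, action):
        total_reward = 0.0
        obs, done, info = None, False, {}
        for _ in range(self._num_repeats()):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                break  # stop repeating once the episode has ended
        return obs, total_reward, done, info


if __name__ == "__main__":
    env = FrameSkip(CountingEnv(), frameskip=4)
    env.reset()
    print(env.step(0))  # one agent step advances the emulator by four frames
```

With a fixed frameskip of 4, the agent chooses an action only once every four frames, so an episode of the same length needs a quarter of the decisions. Passing a pair such as (2, 5) instead makes the number of repeated frames vary from step to step.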
For Breakout, frameskip is randomly sampled from 2 to 5. The following screenshots show the frame images returned by the step function when the action LEFT...