The full implementation of the deep Q-learning algorithm can be downloaded from GitHub (link xxx). To train our AI player for Breakout, run the following command from the src folder:
python train.py -g Breakout -d gpu
train.py accepts two arguments. The first, -g or --game, is the name of the game to test. The second, -d or --device, specifies the device (CPU or GPU) used to train the Q-network.
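To make the command-line interface concrete, here is a minimal sketch of how two such flags could be parsed with Python's standard argparse module. It is an illustration only, not the actual contents of train.py: the function name build_parser, the help strings, and the choice to make both flags required are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the -g/--game and -d/--device flags."""
    parser = argparse.ArgumentParser(
        description="Train a Q-network on a game (illustrative sketch)"
    )
    # Assumption: the game name is a free-form string such as "Breakout" or "demo".
    parser.add_argument("-g", "--game", required=True,
                        help="name of the game, e.g. Breakout or demo")
    # Assumption: only the two device values used in the commands are accepted.
    parser.add_argument("-d", "--device", required=True, choices=["cpu", "gpu"],
                        help="device used to train the Q-network")
    return parser


# Parse an explicit argument list, equivalent to: python train.py -g demo -d cpu
args = build_parser().parse_args(["-g", "demo", "-d", "cpu"])
print(args.game, args.device)
```

In a real script, calling parse_args() with no argument list reads the flags from the command line instead.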
For Atari games, even with a high-end GPU, it takes 4-7 days of training for our AI player to reach human-level performance. To test the algorithm quickly, a special game called demo is implemented as a lightweight benchmark. Run the demo with the following command:
python train.py -g demo -d cpu
The demo game is based on the GridWorld game available at https://cs.stanford.edu/people/karpathy/convnetjs/demo/rldemo.html.
In this game...