The goal of this project was to build a deep reinforcement learning model that successfully plays the CartPole-v1 game from OpenAI Gym. The use case for this chapter is to build a reinforcement learning model on a simple game environment and then extend it to more complex games such as Atari.
In the first half of this chapter, we built a deep Q-learning (DQN) model to play the CartPole game. During testing, the DQN model scored an average of 277.88 points over 100 games.
In the second half of this chapter, we built a deep SARSA model (using the same epsilon-greedy policy as the Q-learning model) to play the CartPole game. During testing, the SARSA model scored an average of 365.67 points over 100 games.
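As a reminder of how the two approaches differ, here is a minimal sketch of the epsilon-greedy policy they share and of the two update targets. It is not the chapter's implementation: the function names, the `gamma` default, and the sample Q-values are assumptions chosen for illustration, and the network that would produce the Q-values is left out.

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """Off-policy target: bootstrap from the best action in the next state."""
    if done:
        return reward
    return reward + gamma * float(np.max(next_q_values))

def sarsa_target(reward, next_q_values, next_action, done, gamma=0.99):
    """On-policy target: bootstrap from the action actually chosen next."""
    if done:
        return reward
    return reward + gamma * float(next_q_values[next_action])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical Q-values for the next state (CartPole has two actions).
    next_q = np.array([1.0, 2.0])
    next_action = epsilon_greedy(next_q, epsilon=0.1, rng=rng)
    print("Q-learning target:", q_learning_target(1.0, next_q, done=False))
    print("SARSA target:", sarsa_target(1.0, next_q, next_action, done=False))
```

The two targets coincide whenever the epsilon-greedy policy happens to pick the greedy action in the next state; they differ only on exploratory steps, where SARSA uses the value of the action it actually took.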
Now, let's follow the same technique we have been following in the previous chapters for evaluating the performance of the models from the restaurant chain point...