We learned a lot in this chapter. More importantly, we implemented an agent that learned to solve the Mountain Car problem in about seven minutes of training!
We started by understanding the famous Mountain Car problem and examining how the environment, the observation space, the state space, and the rewards are designed in Gym's MountainCar-v0 environment. We then revisited the reinforcement learning Gym boilerplate code from the previous chapter and made some improvements to it, which are also available in this book's code repository.
We then defined the hyperparameters for our Q-learning agent and started implementing the Q-learning algorithm from scratch. We first implemented the agent's initialization function, which sets up the agent's internal state variables, including the Q-value representation, using a NumPy n-dimensional array. Then, we implemented...
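To recap the setup described above, the following is a minimal sketch of a tabular Q-learning agent for Mountain Car. The bin counts and hyperparameter values (`NUM_BINS`, `ALPHA`, `GAMMA`, `EPSILON`) are illustrative assumptions, not the exact settings used in the chapter; the observation bounds are those of MountainCar-v0 (position and velocity).

```python
import numpy as np

# Illustrative hyperparameters (assumed values, not the chapter's exact ones)
NUM_BINS = 30          # discretization bins per observation dimension
NUM_ACTIONS = 3        # MountainCar-v0 has 3 discrete actions
ALPHA = 0.05           # learning rate
GAMMA = 0.98           # discount factor
EPSILON = 0.1          # exploration rate

# MountainCar-v0 observation bounds: position in [-1.2, 0.6],
# velocity in [-0.07, 0.07]
OBS_LOW = np.array([-1.2, -0.07])
OBS_HIGH = np.array([0.6, 0.07])

def discretize(obs):
    """Map a continuous observation to a tuple of integer bin indices."""
    ratios = (obs - OBS_LOW) / (OBS_HIGH - OBS_LOW)
    bins = (ratios * (NUM_BINS - 1)).astype(int)
    return tuple(np.clip(bins, 0, NUM_BINS - 1))

# Q-value representation as a NumPy n-dimensional array:
# one axis per discretized state dimension, plus one for actions.
Q = np.zeros((NUM_BINS, NUM_BINS, NUM_ACTIONS))

def get_action(state, rng):
    """Epsilon-greedy action selection over the Q-table."""
    if rng.random() < EPSILON:
        return int(rng.integers(NUM_ACTIONS))
    return int(np.argmax(Q[state]))

def learn(state, action, reward, next_state):
    """One-step Q-learning update toward the TD target."""
    td_target = reward + GAMMA * np.max(Q[next_state])
    Q[state + (action,)] += ALPHA * (td_target - Q[state + (action,)])
```

In a training loop, each environment step would call `discretize` on the observation, pick an action with `get_action`, and apply `learn` with the observed reward and next state.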