We've set up the OpenAI Gym environment and started exploring the basic functionality of the package. We're now familiar with the environment and know how to put in place an agent that takes random actions in it; such an agent may or may not find an optimal solution to the problem, or may take an unreasonably long time to find one.
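As a quick recap, a minimal sketch of such a random-action agent might look like the following. The CartPole-v1 environment and the step limit are assumed here purely for illustration, and the snippet uses the classic Gym API in which env.step() returns a four-element tuple:

```python
import gym

# Create a Gym environment (CartPole-v1 is an assumed example; any environment works the same way)
env = gym.make("CartPole-v1")

observation = env.reset()
total_reward = 0

for step in range(1000):
    # Sample a random action from the environment's action space
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)
    total_reward += reward
    if done:
        # The episode ended; a purely random policy usually fails quickly
        print(f"Episode finished after {step + 1} steps, total reward: {total_reward}")
        break

env.close()
```

Because the agent never uses the observations or rewards it receives, its performance does not improve over time, which is exactly the limitation the next chapter addresses.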
In the next chapter, we'll see how to implement Q-learning so that the agent can reach a solution faster and more efficiently, and we'll observe how a Q-learning agent's performance changes the longer it runs and the more data it collects over time.