In this section, we'll implement an agent that takes random actions: it keeps no record of what it has done and learns nothing from the results. We'll get started on building an actual Q-learning algorithm in Chapter 4, Teaching a Smartcab to Drive Using Q-Learning; for now, taking random actions is all our agent can do.
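To make this concrete, here is a minimal sketch of such an agent in Python. The `RandomAgent` class and the `valid_actions` list are hypothetical placeholders for whatever interface the simulator provides; the point is simply that the agent samples uniformly and keeps no memory of past actions or rewards:

```python
import random

class RandomAgent:
    """A baseline agent that ignores state and history entirely."""

    def __init__(self, valid_actions):
        # Hypothetical action set; the real simulator defines its own.
        self.valid_actions = valid_actions

    def choose_action(self, state=None):
        # No memory, no learning: every action is equally likely,
        # regardless of the state the agent observes.
        return random.choice(self.valid_actions)

# Example usage with a hypothetical set of driving actions:
agent = RandomAgent([None, "forward", "left", "right"])
for _ in range(5):
    print(agent.choose_action())
```

Because `choose_action` never reads the state or any reward signal, the agent's behavior cannot improve over time, which is exactly what makes it a useful point of comparison.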
As part of our analysis, we'll compare the success of this randomly acting agent against the results of an optimized Q-learning agent. The randomly acting agent is our baseline agent: a control against which we'll measure the performance of future machine learning models. We'll discuss the significance of baseline models at the end of the chapter.