Learning a world model
In the introduction to this chapter, we recalled why we departed from dynamic programming methods: to avoid assuming that a model of the agent's environment is available and accessible. Now that we are returning to models, we also need to discuss how a world model can be learned when it is not given. In particular, in this section we discuss what exactly we aim to learn as a model, when we may want to learn one, a general procedure for learning a model, how to improve that procedure by incorporating model uncertainty, and what to do when we face complex observations. Let's dive in!
Understanding what a model means
From what we have done so far, you might equate a model of the environment with a full simulation of it running in your mind. Model-based methods, however, don't require the full fidelity of a simulation. Instead, what we expect from a model is a prediction of the next state given the current state and action...
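To make this concrete, here is a minimal sketch of a learned dynamics model: a small neural network trained by supervised regression on observed transitions (state, action, next state). The network architecture, dimensions, and placeholder transition data are illustrative assumptions, not a reference implementation from this chapter.

```python
# A minimal sketch of learning a dynamics model, assuming a continuous state
# space and a discrete action space. All sizes and data below are placeholders.
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Predicts the next state from the current state and a one-hot action."""
    def __init__(self, state_dim, n_actions, hidden_dim=64):
        super().__init__()
        self.n_actions = n_actions
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_actions, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, state, action):
        # Concatenate the state with a one-hot encoding of the action.
        action_onehot = nn.functional.one_hot(
            action, num_classes=self.n_actions).float()
        return self.net(torch.cat([state, action_onehot], dim=-1))

# Supervised training step on a batch of observed transitions (s, a, s').
state_dim, n_actions = 4, 2
model = DynamicsModel(state_dim, n_actions)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

states = torch.randn(32, state_dim)           # placeholder transition data
actions = torch.randint(0, n_actions, (32,))
next_states = torch.randn(32, state_dim)

pred_next_states = model(states, actions)
loss = nn.functional.mse_loss(pred_next_states, next_states)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In other words, learning a model of this kind is just a supervised learning problem: the agent's experience provides the inputs (current state and action) and the targets (next state), and the network is fit to minimize the prediction error.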