We've designed and built our first deep Q-network to solve the CartPole problem. Deep Q-networks address what happens when the state space of an optimization task grows too large to represent with a simple lookup-table function.
DQNs approximate an agent's Q-function over a large state space, allowing the agent to generalize about the environment and predict the values of states it has not yet seen. Keras provides many useful building blocks that let us design and train powerful DQN architectures relatively quickly and easily.
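As a quick recap, the core of such an agent is a small network that maps a state vector to one Q-value per action. The sketch below assumes CartPole's 4-dimensional observation and 2 discrete actions; the layer sizes and learning rate are illustrative choices, not the chapter's exact architecture:

```python
import numpy as np
from tensorflow import keras

STATE_DIM = 4    # cart position, cart velocity, pole angle, pole angular velocity
NUM_ACTIONS = 2  # push cart left or right

def build_q_network():
    """Map a state vector to a vector of Q-value estimates, one per action."""
    model = keras.Sequential([
        keras.layers.Input(shape=(STATE_DIM,)),
        keras.layers.Dense(24, activation="relu"),
        keras.layers.Dense(24, activation="relu"),
        keras.layers.Dense(NUM_ACTIONS, activation="linear"),  # Q(s, a) estimates
    ])
    # Trained by regressing the predicted Q-values toward TD targets
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
    return model

model = build_q_network()

# Greedy action selection for a single (batched) state:
state = np.zeros((1, STATE_DIM), dtype="float32")
q_values = model.predict(state, verbose=0)
action = int(np.argmax(q_values[0]))
```

Because the network generalizes across the continuous state space, it can produce Q-value estimates even for states it never encountered during training, which is exactly what a lookup table cannot do.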
In the next chapter, we'll dive deeper into a particularly interesting problem in RL called the multi-armed bandit problem (MABP), its relevance to scientific research, and the types of problems that are well-suited to this problem-solving framework.