Exploring dynamic programming
Dynamic programming (DP) is a branch of mathematical optimization that provides methods for solving MDPs optimally. Although most real-world problems are too complex to solve optimally with DP methods, the ideas behind these algorithms are central to many RL approaches, so it is important to understand them well. Throughout this chapter, we go from these exact methods to more practical approaches by systematically introducing approximations.
We start this section by describing an example that will serve as a use case for the algorithms that we will introduce throughout the chapter. Then, we will cover how to do prediction and control using DP. Let's get started!
Example use case: Inventory replenishment of a food truck
Our use case involves a food truck business that needs to decide how many burger patties to buy every weekday to replenish its inventory. Inventory planning is an important class of problems in retail and manufacturing...