Completing the exercises in this section is entirely optional, but we, as reinforcement learners ourselves, learn best by doing. Do your best to complete at least two or three of the following exercises:
- Consider other problems you could solve with DP. How would you break each problem up into subproblems, and how would you calculate each subproblem?
- Code up another example that compares a problem programmed linearly versus dynamically, using a problem you came up with in the first exercise. The code examples Chapter_2_2.py and Chapter_2_3.py are good models of side-by-side comparisons.
- Look through the OpenAI Gym documentation and explore other RL environments.
- Create, render, and explore other RL environments from Gym using the sample test code from Chapter_2_4.py.
- Explain the process/algorithm of evaluating and improving a policy using DP.
- Explain the difference...
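
If you want a starting point for the first two exercises, the following is a minimal sketch of one possible side-by-side comparison. It assumes the coin-change problem (the fewest coins needed to make an amount), which is our own choice and does not come from the chapter's code files; the function names `min_coins_naive` and `min_coins_dp` are likewise hypothetical.

```python
def min_coins_naive(coins, amount):
    """Plain recursion: the same sub-amounts are re-solved many times."""
    if amount == 0:
        return 0
    best = float("inf")  # stays inf if the amount cannot be made
    for coin in coins:
        if coin <= amount:
            best = min(best, 1 + min_coins_naive(coins, amount - coin))
    return best


def min_coins_dp(coins, amount):
    """Bottom-up DP: table[a] holds the fewest coins that sum to a."""
    table = [0] + [float("inf")] * amount
    for a in range(1, amount + 1):
        for coin in coins:
            if coin <= a:
                # Reuse the stored answer for the smaller subproblem.
                table[a] = min(table[a], 1 + table[a - coin])
    return table[amount]


if __name__ == "__main__":
    coins = (1, 5, 10, 25)  # assumed denominations, for illustration only
    print(min_coins_naive(coins, 30), min_coins_dp(coins, 30))
```

Both functions return the same answer. The difference is that the DP version solves each sub-amount once and stores it, while the naive version recomputes it on every branch. Try timing both on larger amounts to see the gap grow.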