As always, use the exercises in this section to get a better understanding of the material you have learned. Try to work through at least two or three of them:
- Return to the example Chapter_5_1.py, change the alpha (learning_rate) variable, and see what effect this has on the values calculated.
- Return to the example Chapter_5_2.py and alter the arm positions on the various bandits.
- Change the learning rate on the example Chapter_5_2.py and see what effect this has on the Q results output.
- Alter the gamma reward discount factor in the Chapter_5_3.py example, and see what effect this has on agent training.
- Change the exploration epsilon in the Chapter_5_3.py example to different values and rerun the sample. See what effect altering the various exploration parameters has on training the agent.
- Alter the various parameters (exploration, alpha, and gamma) in the...