As always, try at least one or two of the following exercises on your own, for both enjoyment and learning:
- Open the BananaCollectors example Banana scene and run it in training mode.
- Modify the BananaCollectors example Banana scene so that it uses five separate learning brains, and then run it in training mode.
- Modify the reward functions in the last SoccerTwos exercise to use exponential or logarithmic functions.
- Modify the reward function in the last SoccerTwos exercise to use non-linear functions that are not inversely related to each other. This way, the means of modifying the positive and negative rewards is different for each personality.
- Modify the SoccerTwos scene with different characters and personalities. Model new reward functions as well, and then train the agents.
- Modify the BananaCollectors example Banana scene to use the same personalities and custom reward functions as we did...
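
As a starting point for the reward function exercises, the following is a minimal Python sketch of what exponential and logarithmic reward scaling could look like. It is not code from the SoccerTwos example: the `trait` value in the range 0 to 1, the steepness parameter `k`, and all three function names are hypothetical and only illustrate the shape of the curves. In the Unity scene itself, the equivalent logic would live in the agent's C# reward code.

```python
import math


def exponential_scale(trait: float, k: float = 3.0) -> float:
    """Map a trait in [0, 1] to a multiplier in [0, 1] on an exponential curve.

    The curve stays low for small traits and rises sharply near 1.
    `k` (hypothetical) controls how steep the rise is.
    """
    return (math.exp(k * trait) - 1.0) / (math.exp(k) - 1.0)


def logarithmic_scale(trait: float, k: float = 9.0) -> float:
    """Map a trait in [0, 1] to a multiplier in [0, 1] on a logarithmic curve.

    The curve rises quickly for small traits and flattens near 1.
    """
    return math.log1p(k * trait) / math.log1p(k)


def shaped_rewards(base_reward: float, trait: float) -> tuple[float, float]:
    """Return (positive, negative) rewards scaled by one personality trait.

    The two rewards use different curves, so they are both non-linear and
    not simple inverses of each other.
    """
    positive = base_reward * logarithmic_scale(trait)
    negative = -base_reward * exponential_scale(1.0 - trait)
    return positive, negative


if __name__ == "__main__":
    # Compare how a few hypothetical personalities would be rewarded.
    for trait in (0.1, 0.5, 0.9):
        print(trait, shaped_rewards(1.0, trait))
```

Both curves are normalized so that a trait of 0 gives a multiplier of 0 and a trait of 1 gives a multiplier of 1, which makes them easy to swap in for a linear scale and compare during training.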