In this chapter, we explored a world of possibilities with multi-agent training environments. We first looked at how to set up environments using self-play, where a single brain controls multiple agents that both compete and cooperate with one another. Then we looked at how to add personality with intrinsic rewards in the form of curiosity, using the ML-Agents curiosity learning system. Next, we looked at how extrinsic rewards could be used to model an agent's personality and influence training; we did this by adding a free asset for style and then applying custom extrinsic rewards through reward function chaining. Finally, we trained the environment and were entertained by the results of the boy agent soundly thrashing the zombie; you will see this for yourself if you watch the training to completion.
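As a quick reference, the curiosity and self-play features we used both come down to a few lines in the trainer configuration. The sketch below is a minimal example assuming a recent ML-Agents release that uses the `behaviors`-style YAML config; the `BoyBrain` behavior name and all hyperparameter values are illustrative placeholders rather than the exact settings from this chapter:

```yaml
behaviors:
  BoyBrain:                            # hypothetical behavior name
    trainer_type: ppo
    reward_signals:
      extrinsic:                       # rewards assigned in the scene code
        gamma: 0.99
        strength: 1.0
      curiosity:                       # intrinsic reward from the curiosity module
        gamma: 0.99
        strength: 0.02                 # how much curiosity contributes vs. extrinsic
    self_play:                         # train against snapshots of past policies
      save_steps: 20000                # steps between policy snapshots
      swap_steps: 10000                # steps between opponent swaps
      window: 10                       # number of past snapshots to sample from
      play_against_latest_model_ratio: 0.5
```

Removing either the `curiosity` or the `self_play` block disables that feature, which makes it easy to compare a curious, self-playing agent against a plain extrinsic-reward baseline.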
In the next chapter, we will look at another novel application...