Building a Reward Matrix – Designing Your Datasets
Experimenting and implementation comprise the two main approaches of artificial intelligence. Experimenting largely entails trying ready-to-use datasets and black box, ready-to-use Python examples. Implementation involves preparing a dataset, developing preprocessing algorithms, and then choosing a model, the proper parameters, and hyperparameters.
Implementation usually involves white box work that entails knowing exactly how an algorithm works and even being able to modify it.
In Chapter 1, Getting Started with Next-Generation Artifcial Intelligence through Reinforcement Learning, the MDP-driven Bellman equation relied on a reward matrix. In this chapter, we will get our hands dirty in a white box process to create that reward matrix.
An MDP process cannot run without a reward matrix. The reward matrix determines whether it is possible to go from one cell to another, from A to B. It is like a map of a city that...