In Chapter 2, Think Like a Machine, the system of McCulloch-Pitts neurons generated a vector with a one-hot function in the following process.
R, the reward vector, represents the input of the reinforcement learning program and needs to be measured.
This chapter deals with an approach designed to build a reward matrix based on the company data. It relies on the data, weights, and biases provided. When deep learning forward feedback neural networks based on perception are introduced (Chapter 4, Become an Unconventional Innovator), a system cannot be content with a training set. Systems have a natural tendency to learn training sets through backpropagation. In this case, one set of company data is not enough.
In real-life company projects, a system will not be validated until tens of thousands of results have been produced. In some cases, a...