As you witnessed, the scene is currently set up for Player control, but obviously we want to see how some of this ML-Agents machinery works. In order to do that, we need to change the Brain type the agent is using. Follow along to switch the Brain type in the 3D Ball example:
- Locate the Ball3DAcademy object in the Hierarchy window and expand it to reveal the Ball3DBrain object.
- Select the Ball3DBrain object and then look to the Inspector window, as shown in the following screenshot:
Switching the Brain on the Ball3DBrain object
- Switch the Brain component, as shown in the preceding screenshot, to the Heuristic setting. The Heuristic brain setting is for ML-Agents whose behavior is coded by hand inside Unity scripts in a heuristic manner. Heuristic programming is nothing more than selecting a simpler, quicker solution to a problem that a classic algorithm, or in our case an ML algorithm, might take much longer to solve. Writing a Heuristic brain can often help you better define a problem, and it is a technique we will use later in this chapter. The majority of current game AI still falls into the category of heuristic algorithms.
- Press Play to run the scene. Now, you will see the platforms balancing each of the balls – very impressive for a heuristic algorithm. Next, we want to open the script with the heuristic brain and take a look at some of the code.
You may need to adjust the Rotation Speed property, up or down, on the Ball 3D Decision (Script). Try a Rotation Speed value of 0.5 if the Heuristic brain seems unable to balance the balls effectively. The Rotation Speed property is not visible in the preceding screenshot.
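As an aside, the Rotation Speed slider you see in the Inspector simply exposes a public field on the Ball 3D Decision script, along the lines of the following sketch (the exact declaration and default value may differ slightly between ML-Agents versions):

// Serialized field on the decision script; shown in the Inspector as Rotation Speed.
// The heuristic multiplies the ball's velocity by this value to decide how hard
// to tilt the platform.
public float rotationSpeed = 2f;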
- Click the Gear icon beside the Ball 3D Decision (Script), and from the context menu, select Edit Script, as shown in the following screenshot:
Editing the Ball 3D Decision script
- Take a look at the Decide method in the script as follows:
public float[] Decide(
    List<float> vectorObs,
    List<Texture2D> visualObs,
    float reward,
    bool done,
    List<float> memory)
{
    if (gameObject.GetComponent<Brain>()
        .brainParameters.vectorActionSpaceType
        == SpaceType.continuous)
    {
        List<float> act = new List<float>();

        // state[5] is the velocity of the ball in the x orientation.
        // We use this number to control the Platform's z axis rotation speed,
        // so that the Platform is tilted in the x orientation correspondingly.
        act.Add(vectorObs[5] * rotationSpeed);

        // state[7] is the velocity of the ball in the z orientation.
        // We use this number to control the Platform's x axis rotation speed,
        // so that the Platform is tilted in the z orientation correspondingly.
        act.Add(-vectorObs[7] * rotationSpeed);
        return act.ToArray();
    }

    // If the vector action space type is discrete, then we don't do anything.
    return new float[1] { 1f };
}
- We will cover more details about what the inputs and outputs of this method mean later. For now, just look at how simple the code is. This is the heuristic brain that is balancing the balls on the platform, which is fairly impressive when you see the code. The question that may hit you is: why are we bothering with ML programming, then? The simple answer is that the 3D ball problem is deceptively simple and can be easily modeled with eight states. Take a look at the code again and you can see that only eight states are used (indices 0 to 7), describing things such as the ball's velocity in the x and z orientations (states 5 and 7 in the comments). This works well for this problem, but when we get to more complex examples, we may have millions or even billions of states – hardly something we could easily solve using heuristic methods.
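To make those eight states a little more concrete, the following is a rough sketch of how the 3D Ball agent fills its observation vector. Treat it as an illustration only; the exact method names, the ball field, and the ordering are assumptions based on the standard 3DBall example and may differ in your version of ML-Agents:

// Assumed observation layout for the 3DBall agent:
//   index 0-1 : the platform's rotation around z and x
//   index 2-4 : the ball's position relative to the platform
//   index 5-7 : the ball's velocity (x, y, z)
public override void CollectObservations()
{
    AddVectorObs(gameObject.transform.rotation.z);
    AddVectorObs(gameObject.transform.rotation.x);
    // ball is the agent's reference to the ball GameObject
    AddVectorObs(ball.transform.position - gameObject.transform.position);
    AddVectorObs(ball.GetComponent<Rigidbody>().velocity);
}

With only eight numbers to reason about, a couple of hand-written rules are enough; once the observation vector grows into the hundreds or the inputs become camera pixels, hand-coding a heuristic quickly becomes impractical.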
Heuristic brains should not be confused with Internal brains, which we will get to in Chapter 6, Terrarium Revisited – Building a Multi-Agent Ecosystem. While you could replace the heuristic code in the 3D ball example with an ML algorithm, that is not the best practice for running advanced ML techniques such as Deep Learning, which we will discover in Chapter 3, Deep Reinforcement Learning with Python.
In the next section, we are going to modify the Basic example in order to get a better feel for how ML-Agents components work together.