When do you need AI for your robot?
We generally describe AI as a technique for modeling or simulating processes that emulate how our brains make decisions. Let’s discuss how AI can be used in robotics to provide capabilities that are difficult to achieve with traditional programming techniques. One of those is identifying objects in images. If you connect a camera to a computer, the computer receives not an image, but an array of numbers that represent pixels (picture elements). Determining whether a certain object, say a toy, appears somewhere in that array can be quite tricky. You can find simple shapes, such as circles or squares, but a teddy bear? Moreover, what if the teddy bear is upside down, or lying flat on a surface? This is the sort of problem that an AI program can solve when nothing else can.
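To make the "array of numbers" idea concrete, here is a minimal Python sketch. The tiny 4x4 grayscale image and the helper names `bright_pixel_count` and `bounding_box` are made up for illustration; a real camera frame would be far larger and would usually carry three color values per pixel.

```python
# A tiny 4x4 grayscale "image": each number is one pixel's brightness (0-255).
# This data is invented for illustration only.
image = [
    [0,   0,   0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0,   0,   0, 0],
]


def bright_pixel_count(img, threshold=128):
    """Count the pixels whose brightness is above the threshold."""
    return sum(1 for row in img for value in row if value > threshold)


def bounding_box(img, threshold=128):
    """Return (top, left, bottom, right) of the bright region, or None."""
    coords = [
        (r, c)
        for r, row in enumerate(img)
        for c, value in enumerate(row)
        if value > threshold
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


print(bright_pixel_count(image))
print(bounding_box(image))
```

Thresholding like this can locate a bright blob, but nothing in these numbers says "teddy bear", which is why simple rules run out of steam so quickly.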
Our traditional approach for creating robot behaviors is to figure out what function we want and to write code to make that happen. When we have a simple function, such as driving around an obstacle, then this approach works well, and we can get results with a little tuning.
Some examples of AI and ML for robotics include:
- Natural language processing (NLP): Using AI/ML to allow the robot to understand and respond to natural human speech and commands. This makes interacting with the robot much more intuitive.
- Computer vision: Using AI to let the robot see and recognize objects or people’s faces, read text, and so on. This helps the robot operate in real-world environments.
- Motion planning: AI can help the robot plan optimal paths and motions to navigate around obstacles and people. This makes the robot’s movements more efficient and human-like.
- Reinforcement learning: The robot can learn how to do, and improve at doing, tasks through trial and error using AI reinforcement learning algorithms. This means less explicit programming is needed.
The main rule of thumb is to use AI/ML whenever you want the robot to perform robustly in a complex, dynamic real-world environment. The AI gives it more perceptual and decision-making capabilities.
Now let’s look at one function we need for this robot – recognizing whether an object is a toy (and needs to be picked up) or is not. Creating a standard function for this via programming is quite difficult. Regular computer vision processes separate an image into shapes, colors, or areas. Our problem is that the toys don’t have predictable shapes (circles, squares, or triangles), they don’t have consistent colors, and they are not all the same size.
What we would rather do is teach the robot what is a toy and what is not, just as we would with a person. We just need a process for teaching the robot how to use a camera to recognize a particular object. Fortunately, this is an area of AI that has been deeply studied, and there are already techniques to accomplish this, which we will use in Chapter 4. We will use a convolutional neural network (CNN) to recognize toys from camera images. This is a type of supervised learning, where we use examples to show the software what type of object we want to recognize, and then create a customized function that predicts the class (or type) of object based on the pixels that represent it in an image.
One of the principles of AI that we will be applying is gradual learning using gradient descent. This means that instead of trying to make the computer learn a skill all in one go, we will train it a little bit at a time, gently nudging a function toward the output we want by looking at its errors (or loss) and making small changes. We use the principle of gradient descent – looking at the slope of the error with respect to each parameter – to determine which way to adjust that parameter so that the error goes down.
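As a rough illustration of gradient descent (not the CNN discussed above), the sketch below trains a two-parameter logistic classifier on a single made-up feature. The data, the meaning of the feature, the learning rate, and the names `predict`, `loss`, and `train` are all assumptions chosen for illustration.

```python
import math

# Made-up training data: one feature per object (say, a "colorfulness" score
# between 0 and 1) and a label (1 = toy, 0 = not a toy). Purely illustrative.
FEATURES = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
LABELS = [0, 0, 0, 1, 1, 1]


def predict(x, w, b):
    """Return the model's estimated probability that x is a toy."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def loss(w, b):
    """Average cross-entropy error over the training examples."""
    total = 0.0
    for x, y in zip(FEATURES, LABELS):
        # Clamp to avoid log(0) if the prediction ever saturates.
        p = min(max(predict(x, w, b), 1e-12), 1.0 - 1e-12)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(FEATURES)


def train(steps=2000, learning_rate=1.0):
    """Repeatedly nudge w and b a small step down the slope of the loss."""
    w, b = 0.0, 0.0
    n = len(FEATURES)
    history = [loss(w, b)]
    for _ in range(steps):
        # For this loss, the slope with respect to (w*x + b) is (p - y).
        errors = [predict(x, w, b) - y for x, y in zip(FEATURES, LABELS)]
        grad_w = sum(e * x for e, x in zip(errors, FEATURES)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(loss(w, b))
    return w, b, history


w, b, history = train()
print(f"loss went from {history[0]:.3f} to {history[-1]:.3f}")
print(f"P(toy | 0.2) = {predict(0.2, w, b):.2f}")
print(f"P(toy | 0.8) = {predict(0.8, w, b):.2f}")
```

No single step teaches the model much; the skill emerges from many small corrections, each taken in the direction that reduces the error.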
You may be thinking at this point, “If that works for learning to classify pictures, then maybe it can be used to classify other things,” and you would be right. We’ll use a similar approach – with somewhat different neural networks – to teach the robot to answer to its name by recognizing the sound of it.
So, in general, when do we need to use AI in a robot? When we need to emulate some sort of decision-making process that would be difficult or impossible to create with procedural steps (i.e., programming). It’s easy to see that neural networks are emulations of animal thought processes since they are a (greatly) simplified model of how neurons interact. Other AI techniques can be more difficult to understand.
One common theme is that AI consistently uses programming by example: task-specific code is replaced with a common framework, and hand-set variables are replaced with data. Instead of programming by process, we are programming by showing the software what result we want and having the software come up with how to get to that result. So for object recognition using pictures, we provide pictures of objects along with the answer to what kind of object each picture represents. We repeat this over and over, and the software is trained by adjusting the parameters of its model rather than by rewriting its code.
AI can also be used to create behaviors. A lot of tasks can be thought of as games, and it is easy to imagine how this works. Let’s say you want your children to pick up the toys in their room. You could command them to do it – which may or may not work. Or, you could make it a game by awarding points for each toy picked up, and giving a reward (such as a dollar) based on the number of points scored. What did we add by doing this? We added a metric, or measurement tool, to let the children know how well they are doing – a point system. And, more critically, we added a reward for specific behaviors.
We can use the same process to modify or create behaviors in a robot. This is formally called reinforcement learning. While we can’t give a robot an emotional reward (as robots don’t have wants or needs), we can program the robot to seek to maximize a reward function. Then we can make a small adjustment to the parameters that affect the reward, see whether that improves the score, and then either keep that change (more reward is our reinforcement) or discard it if the score goes down. This type of process works well for robot motion and for controlling robot arms.
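The keep-or-discard loop described above can be sketched as a simple hill climber. This is a deliberately simplified stand-in for reinforcement learning, not a full algorithm, and the reward function, the target values, and the names `reward` and `hill_climb` are hypothetical.

```python
import random


def reward(params, target=(0.6, -0.3)):
    """Hypothetical reward: highest (zero) when params match the target.

    Think of params as two joint settings for a robot arm and the target
    as the settings that put the gripper over a toy. Invented for illustration.
    """
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def hill_climb(steps=500, step_size=0.05, seed=0):
    """Try small random changes; keep each one only if the reward improves."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    params = [0.0, 0.0]
    best = reward(params)
    for _ in range(steps):
        candidate = [p + rng.uniform(-step_size, step_size) for p in params]
        score = reward(candidate)
        if score > best:
            # More reward: this is the reinforcement, so keep the change.
            params, best = candidate, score
        # Otherwise the candidate is discarded and params stay as they were.
    return params, best


params, best = hill_climb()
print(f"learned params: {params[0]:.2f}, {params[1]:.2f} (reward {best:.4f})")
```

The robot is never told how to reach the target; it only receives a score, and the behavior emerges from keeping whatever raises that score.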
I must tell you that the task set out in this book – to pick up toys in an unstructured environment – is nearly impossible to perform without AI techniques. It could be done by modifying the environment – say, by putting RFID tags in the toys – but not otherwise. That, then, is the purpose of this book – to show how certain tasks, which are difficult or impossible to solve otherwise, can be completed using the combination of AI and robotics.
Next, let’s discuss our robot and the development environment that we’ll be using in this book.