Our sample problem – clean up this room!
In the course of this book, we will be using a single problem set that I feel most people can relate to easily, while still representing a real challenge for the most seasoned roboticist. We will be using AI and robotics techniques to pick up toys in my house after my grandchildren have visited. That sound you just heard was the gasp from the professional robotics engineers and researchers in the audience – this is a tough problem. Why is this a tough problem, and why is it ideal for this book?
Let’s discuss the problem and break it down a bit. Later, in Chapter 2, we will do a full task analysis, learn how to write use cases, and create storyboards to develop our approach, but we can start here with some general observations.
Robotics designers first start with the environment – where does the robot work? We divide environments into two categories: structured and unstructured. A structured environment, like the playing field for a FIRST robotics competition (a contest for robots built by high school students in the US, where all of the playing field is known in advance), an assembly line, or a lab bench, has everything in an organized space. You might have heard the saying “A place for everything and everything in its place” – that is a structured environment. Another way to think about it is that we know in advance where everything is or is going to be. We know what color things are, where they are placed in space, and what shape they are. A name for this type of information is a priori knowledge – things we know in advance. Having advanced knowledge of the environment in robotics is sometimes absolutely essential. Assembly line robots expect parts to arrive in an exact position and orientation to be grasped and placed into position. In other words, we have arranged the world to suit the robot.
In the world of my house, this is simply not an option. If I could get my grandchildren to put their toys in exactly the same spot each time, then we would not need a robot for this task. We have a set of objects that are fairly fixed – we only have so many toys for them to play with. We occasionally add things or lose toys, or something falls down the stairs, but the toys are elements of a set of fixed objects. What they are not is positioned or oriented in any particular manner – they are just where they were left when the kids finished playing with them and went home. We also have a fixed set of furniture, but some parts move – the footstool or chairs can be moved around. This is an unstructured environment, where the robot and the software have to adapt, not the toys or furniture.
The problem is to have the robot drive around the room and pick up toys. Here are some objectives for this task:
- We want the user to interact with the robot by talking to it. We want the robot to understand what we want it to do, which is to say, what our intent is for the commands we are giving it.
- Once commanded to start, the robot will have to identify an object as being a toy or not being a toy. We only want to pick up toys.
- The robot must avoid hazards, the most important being the stairs going down to the first floor. Robots have a particular problem with negative obstacles (dropoffs, curbs, cliffs, stairs, etc.), and that is exactly what we have here.
- Once the robot finds a toy, it has to determine how to pick the toy up with its robot arm. Can it grasp the object directly, or must it scoop the item up, or push it along? We expect that the robot will try different ways to pick up toys and may need several trial-and-error attempts.
- Once the toy is picked up by the robot arm, the robot needs to carry the toy to a toy box. The robot must recognize the toy box in the room, remember where it is for repeat trips, and then position itself to place the toy in the box. Again, more than one attempt may be required.
- After the toy is dropped off, the robot returns to patrolling the room looking for more toys. At some point, hopefully, all of the toys will be retrieved. It may have to ask us, the human, whether the room is acceptable, or whether it needs to continue cleaning.
What will we learn from this problem? We will be using this backdrop to examine a variety of AI techniques and tools. The purpose of the book is to teach you how to develop AI solutions with robots. It is the process and the approach that is the critical information here, not the problem and not the robot I developed for the book. We will be demonstrating techniques for making a moving machine that can learn and adapt to its environment. I would expect that you will pick and choose which chapters to read and in which order, according to your interests and your needs, and as such, each of the chapters will be standalone lessons.
The first three chapters are foundation material that supports the rest of the book by setting up the problem and providing a firm framework to attach the rest of the material.
The basics of robotics
Not all of the chapters or topics in this book are considered classical AI approaches, but they do represent different ways of approaching machine learning and decision-making problems. We will be exploring together the following topics:
- Control theory and timing: We will build a firm foundation for robot control by understanding control theory and timing. We will be using a soft real-time control scheme with what I call a frame-based control loop. This technique has a fancy technical name – rate monotonic scheduling – but I think you will find the concept intuitive and easy to understand.
- OODA loop: At the most basic level, AI is a way for the robot to make decisions about its actions. We will introduce a model for decision-making that comes from the US Air Force, called the OODA loop. This describes how a robot (or a person) makes decisions. Our robot will have two of these loops, an inner loop or introspective loop, and an outward-looking environment sensor loop. The lower, inner loop takes priority over the slower, outer loop, just as the autonomic parts of your body (such as the heartbeat, breathing, and eating) take precedence over your task functions (such as going to work, paying bills, and mowing the yard). This makes our system a type of subsumption architecture, a biologically inspired control paradigm named by Rodney Brooks of MIT, one of the founders of iRobot and Rethink Robotics, and the designer of a robot named Baxter.
Figure 1.1 – My version of the OODA loop
Note
The OODA loop was invented by Col. John Boyd, a man also called The Father of the F-16. Col. Boyd’s ideas are still widely quoted today, and his OODA loop is used to describe robot AI, military planning, or marketing strategies with equal utility. OODA provides a model for how a thinking machine that interacts with its environment might work.
Our robot works not by simply following commands or instructions step by step but by setting goals and then working to achieve those goals. The robot is free to set its own path or determine how to get to its goal. We will tell the robot pick up that toy and the robot will decide which toy, how to get in range, and how to pick up the toy. If we, the human robot owner, instead tried to treat the robot as a teleoperated hand, we would have to give the robot many individual instructions, such as move forward, move right, extend arm, and open hand, each individually, and without giving the robot any idea why we were making those motions. In a goal-oriented structure, the robot will be aware of which objects are toys and which are not and it will know how to find the toy box and how to put toys in the box. This is the difference between an autonomous robot and a radio-controlled teleoperated device.
Before designing the specifics of our robot and its software, we have to match its capabilities to the environment and the problem it must solve. The book will introduce some tools for designing the robot and managing the development of the software. We will use two tools from the discipline of systems engineering to accomplish this – use cases and storyboards. I will make this process as streamlined as possible. More advanced types of systems engineering are used by NASA, aerospace, and automobile companies to design rockets, cars, and aircraft – this gives you a taste of those types of structured processes.
The techniques used in this book
The following sections will each detail step-by-step examples of applying AI techniques to a robotics problem:
- We start with object recognition. We need our robot to recognize objects, and then classify them as either toys to be picked up or not toys to be left alone. We will use a trained ANN to recognize objects from a video camera from various angles and lighting conditions. We will be using the process of transfer learning to extend an existing object recognition system, YOLOv8, to recognize our toys quickly and reliably.
- The next task, once a toy is identified, is to pick it up. Writing a general-purpose pick up anything program for a robot arm is a difficult task involving a lot of higher mathematics (use the internet to look up inverse kinematics to see what I mean). What if we let the robot sort this out for itself? We use genetic algorithms that permit the robot to invent its own behaviors and learn to use its arm on its own. Then we will employ deep reinforcement learning (DRL) to let the robot teach itself how to grasp various objects using an end effector (robot speak for a hand).
- Our robot needs to understand commands and instructions from its owner (us). We use natural language processing (NLP) to not just recognize speech but to understand our intent for the robot to create goals consistent with what we want it to do. We use a neat technique that I call the fill in the blank method to allow the robot to reason from the context of a command. This process is useful for a lot of robot planning tasks.
- The robot’s next problem is navigating rooms while avoiding the stairs and other hazards. We will use a combination of a unique, mapless navigation technique with 3D vision provided by a special stereo camera to see and avoid obstacles.
- The robot will need to be able to find the toy box to put items away, as well as have a general framework for planning moves in the future. We will use decision trees for path planning, as well as discussing pruning or quickly rejecting bad plans. If you imagine what a computer chess program algorithm must do, looking several moves ahead and scoring good moves versus bad moves before selecting a strategy, that will give you an idea of the power of this technique. This type of decision tree has many uses and can handle many dimensions of strategies. We’ll be using it as one of two ways to find a path to our toy box to put toys away.
- Our final task brings a different set of tools not normally used in robotics, or at least not the way we are going to employ them.
I have five wonderful, talented, and delightful grandchildren who love to come and visit. You’ll be hearing a lot about them throughout the book. The oldest grandson is 10 years old, and autistic, as is my granddaughter, the third child, who is 8, as well as the youngest boy, who is 6 as I write this. I introduced my eldest grandson, William, to the robot – and he immediately wanted to have a conversation with it. He asked, “What’s your name?” and “What do you do?” He was disappointed when the robot made no reply. So for the grandkids, we will be developing an engine for the robot to carry out a short conversation – we will be creating a robot personality to interact with children. William had one more request for this robot – he wants it to tell and respond to knock, knock jokes, so we will use that as a prototype of special dialog.
While developing a robot with actual feelings is far beyond the state of the art in robotics or AI today, we can simulate having a personality with a finite state machine and some Monte Carlo modeling. We will also give the robot a model for human interaction so that the robot will take into account the child’s mood as well. I like to call this type of software an AP to distinguish it from our AI. AI builds a model of thinking, and an AP builds a model of emotion for our robot.
Now that you’re familiar with the problem we will be addressing in this book, let’s briefly discuss when and why you might need AI for your robot.