The goal of this chapter is to develop an app that detects and tracks simple hand gestures in real time, using the output of a depth sensor such as a Microsoft Kinect 3D sensor or an ASUS Xtion sensor. The app will analyze each captured frame to perform the following tasks:
- Hand region segmentation: The user's hand region will be extracted in each frame by thresholding the depth map output of the Kinect sensor, applying some morphological operations, and finding connected components.
- Hand shape analysis: The shape of the segmented hand region will be analyzed by determining contours, convex hull, and convexity defects.
- Hand gesture recognition: The number of extended fingers will be determined based on the hand contour's convexity defects, and the gesture will be classified accordingly (with no extended fingers corresponding to a fist, and five extended fingers corresponding to an open hand).
Gesture recognition is an ever-popular topic in computer science. This is because it not only enables humans to communicate with machines (human-machine interaction, or HMI) but also constitutes the first step for machines to begin understanding human body language. With affordable sensors such as Microsoft Kinect or ASUS Xtion and open source software such as OpenKinect and OpenNI, it has never been easier to get started in the field yourself. So, what shall we do with all this technology?
In this chapter, we will cover the following topics:
- Planning the app
- Setting up the app
- Tracking hand gestures in real time
- Understanding hand region segmentation
- Performing hand shape analysis
- Performing hand gesture recognition
The beauty of the algorithm that we are going to implement in this chapter is that it works well for many hand gestures, yet it is simple enough to run in real time on a generic laptop. Also, if we want, we can easily extend it to incorporate more complicated hand-pose estimations.
Once you complete the app, you will understand how to use depth sensors in your own apps. You will learn how to compose shapes of interest with OpenCV from the depth information, and how to analyze those shapes with OpenCV using their geometric properties.