The final app will consist of the following modules and scripts:
- gestures: This is a module that contains an algorithm for recognizing hand gestures.
- gestures.process: This is a function that implements the entire process flow of hand gesture recognition. It accepts a single-channel depth image (acquired from the Kinect depth sensor) and returns an annotated Blue, Green, Red (BGR) color image with an estimated number of extended fingers.
- chapter2: This is the main script for the chapter.
- chapter2.main: This is the main function routine that iterates over frames acquired from a depth sensor, uses gestures.process to process each frame, and then displays the results.
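The division of labor between the two components can be sketched roughly as follows. This is only a structural outline, not the chapter's implementation: the return order of `process`, the `frames` iterable, and the `show` callback are assumptions made for illustration, and plain nested lists stand in for real image arrays so that the sketch runs without a depth sensor.

```python
def process(depth_frame):
    """Hypothetical stand-in for gestures.process.

    Takes a single-channel depth frame (here: a list of rows of ints) and
    returns a three-channel image plus an estimated finger count.
    """
    # Replicate each depth value into three channels (B, G, R) so that
    # colored annotations could later be drawn on top of the image.
    bgr = [[(d, d, d) for d in row] for row in depth_frame]
    num_fingers = 0  # placeholder: no recognition is performed in this sketch
    return bgr, num_fingers


def main(frames, show):
    """Hypothetical stand-in for chapter2.main.

    frames: any iterable of depth frames (a sensor wrapper in the real app).
    show:   callback that displays one annotated frame and its finger count.
    Returns the number of frames that were processed.
    """
    processed = 0
    for depth_frame in frames:
        annotated, num_fingers = process(depth_frame)
        show(annotated, num_fingers)
        processed += 1
    return processed


if __name__ == "__main__":
    fake_frames = [[[0, 128], [255, 64]]]  # one tiny 2x2 "depth image"
    main(fake_frames, lambda img, n: print(n, "finger(s),", len(img), "rows"))
```

Passing the frame source and the display step in as arguments keeps the loop independent of any particular camera or windowing code, which makes it easy to try out with recorded or synthetic frames.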
The end product looks like this:
No matter how many fingers of a hand are extended, the algorithm correctly segments the hand region (white), draws the corresponding convex hull (the green line surrounding the hand), finds all convexity defects that belong to the spaces between fingers (large green points) while ignoring others (small red points), and infers the correct number of extended fingers (the number in the bottom-right corner), even for a fist.
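One common way to separate the defects that lie between fingers from the rest is to look at the angle each defect forms with its two neighboring hull points: gaps between extended fingers tend to be sharp, while other defects are shallow. The following is a minimal sketch of that idea under stated assumptions, not necessarily the rule used in this chapter. The function names, the `(start, far, end)` tuple layout, and the 80-degree cutoff are all chosen here for illustration.

```python
import math


def angle_deg(start, far, end):
    """Angle in degrees at point `far` between the rays to `start` and `end`."""
    v1 = (start[0] - far[0], start[1] - far[1])
    v2 = (end[0] - far[0], end[1] - far[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of |cross| and dot is numerically stable for all angles in [0, 180].
    return math.degrees(math.atan2(abs(cross), dot))


def count_finger_gaps(defects, max_angle_deg=80.0):
    """Count defects sharp enough to be a gap between two extended fingers.

    defects: iterable of (start, far, end) point triples, each point an (x, y)
             pair; `far` is the deepest point of the defect.
    max_angle_deg: assumed cutoff; defects with a wider angle are ignored.
    """
    return sum(
        1 for start, far, end in defects
        if angle_deg(start, far, end) < max_angle_deg
    )


if __name__ == "__main__":
    defects = [
        ((0, 10), (1, 0), (2, 10)),   # narrow V shape: a gap between fingers
        ((0, 1), (5, 0), (10, 1)),    # wide, shallow dent: ignored
    ]
    print(count_finger_gaps(defects))
```

With this rule, n qualifying gaps suggest n + 1 extended fingers when n is at least 1. Gaps alone cannot tell a fist from a single extended finger, so that case needs an additional check that is not shown here.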
Now, let's set up the application in the next section.