OpenCV 4 with Python Blueprints
OpenCV 4 with Python Blueprints: Build creative computer vision projects with the latest version of OpenCV 4 and Python 3, Second Edition

Authors: Dr. Menua Gevorgyan, Michael Beyeler (USD), Mamikonyan, Michael Beyeler
eBook Mar 2020 366 pages 2nd Edition

Hand Gesture Recognition Using a Kinect Depth Sensor

The goal of this chapter is to develop an app that detects and tracks simple hand gestures in real time, using the output of a depth sensor, such as that of a Microsoft Kinect 3D sensor or an ASUS Xtion sensor. The app will analyze each captured frame to perform the following tasks:

  • Hand region segmentation: The user's hand region will be extracted in each frame by analyzing the depth map output of the Kinect sensor, which is done by thresholding, applying some morphological operations, and finding connected components.
  • Hand shape analysis: The shape of the segmented hand region will be analyzed by determining contours, convex hull, and convexity defects.
  • Hand gesture recognition: The number of extended fingers will be determined based on the hand contour's convexity defects, and the gesture will be classified accordingly (with no extended fingers corresponding to a fist, and five extended fingers corresponding to an open hand).

Gesture recognition is an ever-popular topic in computer science. This is because it not only enables humans to communicate with machines (Human-Machine Interaction (HMI)) but also constitutes the first step for machines to begin understanding human body language. With affordable sensors such as Microsoft Kinect or Asus Xtion and open source software such as OpenKinect and OpenNI, it has never been easier to get started in the field yourself. So, what shall we do with all this technology?

In this chapter, we will cover the following topics:

  • Planning the app
  • Setting up the app
  • Tracking hand gestures in real time
  • Understanding hand region segmentation
  • Performing hand shape analysis
  • Performing hand gesture recognition

The beauty of the algorithm that we are going to implement in this chapter is that it works well for many hand gestures, yet it is simple enough to run in real time on a generic laptop. Also, if we want, we can easily extend it to incorporate more complicated hand-pose estimations.

Once you complete the app, you will understand how to use depth sensors in your own apps. You will learn how to compose shapes of interest from the depth information with OpenCV, and how to analyze those shapes using their geometric properties.

Getting started

This chapter requires you to have a Microsoft Kinect 3D sensor installed. Alternatively, you may install an Asus Xtion sensor or any other depth sensor for which OpenCV has built-in support.

First, install OpenKinect and libfreenect from http://www.openkinect.org/wiki/Getting_Started. You can find the code that we present in this chapter at our GitHub repository: https://github.com/PacktPublishing/OpenCV-4-with-Python-Blueprints-Second-Edition/tree/master/chapter2.

Let's first plan the application we are going to create in this chapter.

Planning the app

The final app will consist of the following modules and scripts:

  • gestures: This is a module that consists of an algorithm for recognizing hand gestures.
  • gestures.process: This is a function that implements the entire process flow of hand gesture recognition. It accepts a single-channel depth image (acquired from the Kinect depth sensor) and returns an annotated Blue, Green, Red (BGR) color image with an estimated number of extended fingers.
  • chapter2: This is the main script for the chapter.
  • chapter2.main: This is the main function routine that iterates over frames acquired from a depth sensor, uses gestures.process to process each frame, and then displays the results.

The end product looks like this:

No matter how many fingers of a hand are extended, the algorithm correctly segments the hand region (white), draws the corresponding convex hull (the green line surrounding the hand), finds all convexity defects that belong to the spaces between fingers (large green points) while ignoring others (small red points), and infers the correct number of extended fingers (the number in the bottom-right corner), even for a fist.

Now, let's set up the application in the next section.

Setting up the app

Before we can get down to the nitty-gritty of our gesture recognition algorithm, we need to make sure that we can access the depth sensor and display a stream of depth frames. In this section, we will cover the following things that will help us set up the app:

  • Accessing the Kinect 3D sensor
  • Utilizing OpenNI-compatible sensors
  • Running the app and main function routine

First, we will look at how to use the Kinect 3D sensor.

Accessing the Kinect 3D sensor

The easiest way to access a Kinect sensor is by using an OpenKinect module called freenect. For installation instructions, take a look at the preceding section.

The freenect module has functions such as sync_get_depth() and sync_get_video(), used to obtain images synchronously from the depth sensor and camera sensor respectively. For this chapter, we will need only the Kinect depth map, which is a single-channel (grayscale) image in which each pixel value is the estimated distance from the camera to a particular surface in the visual scene.

Here, we will design a function that will read a frame from the sensor and convert it to the desired format, and return the frame together with a success status, as follows:

def read_frame() -> Tuple[bool, np.ndarray]:

The function consists of the following steps:

  1. Grab a frame; terminate the function if a frame was not acquired, like this:
    depth, timestamp = freenect.sync_get_depth()
    if depth is None:
        return False, None

The sync_get_depth function returns both the depth map and a timestamp. By default, the map is in an 11-bit format. The last 10 bits describe the depth, while the first bit, when equal to 1, indicates that the distance estimation was not successful.

  2. Standardize the data into an 8-bit format. An 11-bit format cannot be visualized with cv2.imshow right away, and in the future we might want to use a different sensor that returns data in a different format. The conversion is done as follows:
    np.clip(depth, 0, 2**10 - 1, depth)
    depth >>= 2

In the previous code, we have first clipped the values to 1,023 (or 2**10-1) to fit in 10 bits. Such clipping results in the assignment of the undetected distance to the farthest possible point. Next, we shift 2 bits to the right to fit the distance in 8 bits.

  3. Finally, we convert the image into an 8-bit unsigned integer array and return the result, as follows:
    return True, depth.astype(np.uint8)

Now, the depth image can be visualized as follows:

cv2.imshow("depth", read_frame()[1]) 
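The clip-and-shift conversion from the steps above can be checked without a sensor. This is a minimal sketch that assumes a small hand-made array in place of a Kinect frame; the helper name depth_to_uint8 is hypothetical and is not part of the chapter's code.

```python
import numpy as np

def depth_to_uint8(depth: np.ndarray) -> np.ndarray:
    """Clip an 11-bit depth map to 10 bits, then drop 2 bits so it fits in 8."""
    depth = depth.astype(np.uint16)            # astype copies, so the input stays intact
    np.clip(depth, 0, 2**10 - 1, out=depth)    # values with the 11th bit set become 1023
    depth >>= 2                                # 10 bits -> 8 bits
    return depth.astype(np.uint8)

# 2047 has the 11th bit set, which the text describes as a failed estimation.
raw = np.array([[0, 512, 1023, 2047]], dtype=np.uint16)
converted = depth_to_uint8(raw)
print(converted)  # [[  0 128 255 255]]
```

The last value shows the effect described in the text: a reading whose distance could not be estimated ends up at 255, the farthest representable point.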

Let's see how to use OpenNI-compatible sensors in the next section.

Utilizing OpenNI-compatible sensors

To use an OpenNI-compatible sensor, you must first make sure that OpenNI2 is installed and that your version of OpenCV was built with the support of OpenNI. The build information can be printed as follows:

import cv2
print(cv2.getBuildInformation())

If your version was built with OpenNI support, you will find it under the Video I/O section. Otherwise, you will have to rebuild OpenCV with OpenNI support, which is done by passing the -D WITH_OPENNI2=ON flag to cmake.
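Because the build report is long, it can help to filter it for the relevant lines. The following is a minimal sketch; the helper find_build_lines is hypothetical, and the sample report is a made-up excerpt for illustration only, since the real report is longer and its exact layout varies between builds.

```python
from typing import List

def find_build_lines(build_info: str, keyword: str = "OpenNI") -> List[str]:
    """Return the stripped lines of a build report that mention `keyword`."""
    return [line.strip() for line in build_info.splitlines()
            if keyword.lower() in line.lower()]

# Made-up excerpt; not the output of a real build.
sample_report = """
  Video I/O:
    FFMPEG: YES
    OpenNI2: NO
"""
matches = find_build_lines(sample_report)
print(matches)  # ['OpenNI2: NO']
```

With OpenCV installed, the same helper can be applied to the real report by calling find_build_lines(cv2.getBuildInformation()).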

After the installation process is complete, you can access the sensor similarly to other video input devices, using cv2.VideoCapture. In this app, in order to use an OpenNI-compatible sensor instead of a Kinect 3D sensor, you have to cover the following steps:

  1. Create a video capture that connects to your OpenNI-compatible sensor, like this:
    device = cv2.CAP_OPENNI
    capture = cv2.VideoCapture(device)

If you want to connect to an Asus Xtion, assign cv2.CAP_OPENNI_ASUS to the device variable instead.

  2. Change the input frame size to the standard Video Graphics Array (VGA) resolution, as follows:
    capture.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    capture.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
  3. In the previous section, we designed the read_frame function, which accesses the Kinect sensor using freenect. In order to read depth images from the video capture, you have to change that function to the following one:
    def read_frame():
        if not capture.grab():
            return False, None
        # the stream selector is passed by keyword; the first positional parameter is the output image
        return capture.retrieve(flag=cv2.CAP_OPENNI_DEPTH_MAP)

You will note that we have used the grab and retrieve methods instead of the read method. The reason is that the read method of cv2.VideoCapture is inappropriate when we need to synchronize a set of cameras or a multi-head camera, such as a Kinect.

For such cases, you grab frames from multiple sensors at a certain moment in time with the grab method and then retrieve the data of the sensors of interest with the retrieve method. For example, in your own apps, you might also need to retrieve a BGR frame (standard camera frame), which can be done by passing cv2.CAP_OPENNI_BGR_IMAGE to the retrieve method.

So, now that you can read data from your sensor, let's see how to run the application in the next section.

Running the app and main function routine

The chapter2.py script is responsible for running the app, and it first imports the following modules:

import cv2
import numpy as np
from gestures import recognize
from frame_reader import read_frame

The recognize function is responsible for recognizing a hand gesture, and we will compose it later in this chapter. We have also placed the read_frame function that we composed in the previous section in a separate script, frame_reader, for convenience.

In order to simplify the segmentation task, we will instruct the user to place their hand in the center of the screen. To provide a visual aid for this, we create the following function:

def draw_helpers(img_draw: np.ndarray) -> None:
    # draw some helpers for correctly placing hand
    height, width = img_draw.shape[:2]
    color = (0, 102, 255)
    cv2.circle(img_draw, (width // 2, height // 2), 3, color, 2)
    cv2.rectangle(img_draw, (width // 3, height // 3),
                  (width * 2 // 3, height * 2 // 3), color, 2)

The function draws a rectangle around the image center and highlights the center pixel of the image in orange.

All the heavy lifting is done by the main function, shown in the following code block:

def main():
    for _, frame in iter(read_frame, (False, None)):

The function iterates over grayscale frames from Kinect, and, in each iteration, it covers the following steps:

  1. Recognize hand gestures using the recognize function, which returns the estimated number of extended fingers (num_fingers) and an annotated BGR color image, as follows:
        num_fingers, img_draw = recognize(frame)
  2. Call the draw_helpers function on the annotated BGR image in order to provide a visual aid for hand placement, as follows:
        draw_helpers(img_draw)
  3. Finally, the main function draws the number of fingers on the annotated frame, displays results with cv2.imshow, and sets termination criteria, as follows:
        # print number of fingers on image
        cv2.putText(img_draw, str(num_fingers), (30, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255))
        cv2.imshow("frame", img_draw)
        # Exit on escape
        if cv2.waitKey(10) == 27:
            break

So, now that we have the main script, you will note that the only function that we are missing is the recognize function. In order to track hand gestures, we need to compose this function, which we will do in the next section.
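The loop in main relies on the two-argument form of Python's built-in iter, which keeps calling a function until it returns a given sentinel value. Here is a minimal sketch of that behavior, assuming a hypothetical make_fake_reader helper that imitates read_frame with a fixed list of frames instead of a sensor.

```python
from typing import Callable, List, Optional, Tuple

def make_fake_reader(frames: List[str]) -> Callable[[], Tuple[bool, Optional[str]]]:
    """Return a read_frame-like callable: (True, frame) until the frames run out."""
    remaining = iter(frames)

    def read_frame() -> Tuple[bool, Optional[str]]:
        try:
            return True, next(remaining)
        except StopIteration:
            return False, None      # same failure value the real read_frame returns

    return read_frame

read_frame = make_fake_reader(["frame0", "frame1", "frame2"])

# iter(callable, sentinel) stops as soon as the callable returns the sentinel.
seen = [frame for _, frame in iter(read_frame, (False, None))]
print(seen)  # ['frame0', 'frame1', 'frame2']
```

This means the loop ends on its own when a frame cannot be acquired, so main needs no separate check of the success flag.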


Key benefits

  • Understand how to capture high-quality image data, detect and track objects, and process the actions of animals or humans
  • Implement your learning in different areas of computer vision
  • Explore advanced concepts in OpenCV such as machine learning, artificial neural networks, and augmented reality

Description

OpenCV is a native cross-platform C++ library for computer vision, machine learning, and image processing. It is increasingly being adopted in Python for development. This book will get you hands-on with a wide range of intermediate to advanced projects using the latest version of the framework and language, OpenCV 4 and Python 3.8, instead of only covering the core concepts of OpenCV in theoretical lessons. This updated second edition will guide you through working on independent hands-on projects that focus on essential OpenCV concepts such as image processing, object detection, image manipulation, object tracking, and 3D scene reconstruction, in addition to statistical learning and neural networks. You’ll begin with concepts such as image filters, Kinect depth sensor, and feature matching. As you advance, you’ll not only get hands-on with reconstructing and visualizing a scene in 3D but also learn to track visually salient objects. The book will help you further build on your skills by demonstrating how to recognize traffic signs and emotions on faces. Later, you’ll understand how to align images, and detect and track objects using neural networks. By the end of this OpenCV Python book, you’ll have gained hands-on experience and become proficient at developing advanced computer vision apps according to specific business needs.

Who is this book for?

This book is for intermediate-level OpenCV users who are looking to enhance their skills by developing advanced applications. Familiarity with OpenCV concepts and Python libraries, and basic knowledge of the Python programming language are assumed.

What you will learn

  • Generate real-time visual effects using filters and image manipulation techniques such as dodging and burning
  • Recognize hand gestures in real time and perform hand-shape analysis based on the output of a Microsoft Kinect sensor
  • Learn feature extraction and feature matching to track arbitrary objects of interest
  • Reconstruct a 3D real-world scene using 2D camera motion and camera reprojection techniques
  • Detect faces using a cascade classifier and identify emotions in human faces using multilayer perceptrons
  • Classify, localize, and detect objects with deep neural networks

Product Details

Publication date: Mar 20, 2020
Length: 366 pages
Edition: 2nd
Language: English
ISBN-13: 9781789617634


Table of Contents

13 Chapters

  • Fun with Filters
  • Hand Gesture Recognition Using a Kinect Depth Sensor
  • Finding Objects via Feature Matching and Perspective Transforms
  • 3D Scene Reconstruction Using Structure from Motion
  • Using Computational Photography with OpenCV
  • Tracking Visually Salient Objects
  • Learning to Recognize Traffic Signs
  • Learning to Recognize Facial Emotions
  • Learning to Classify and Localize Objects
  • Learning to Detect and Track Objects
  • Profiling and Accelerating Your Apps
  • Setting Up a Docker Container
  • Other Books You May Enjoy

Customer reviews

Rating distribution

5 out of 5 (4 ratings)
5 star 100%
4 star 0%
3 star 0%
2 star 0%
1 star 0%

David Buniatyan, Nov 20, 2020 (5 stars, Amazon verified review)

The book goes through application-specific examples and brings advanced concepts within their context. Ideal for practitioners who are interested in quickly seeing results and learning computer vision techniques as they build.

David, Oct 01, 2020 (5 stars, Amazon verified review)

I really enjoyed the book. As a Machine Learning R&D engineer with about 5 years of experience I think it is well written and best suited for anyone who interested in computer vision, but mostly for the ones who are looking to build their own app. Let me go through the points I liked:

  • Each chapter is built around an independent application, such as Finding an Object or Facial Emotions detection. This opens a reader the possibility to start right off from the problem they have in mind.
  • The book describes neatly and with a good visual support important ML algorithms such as SVM, backpropagation, CNNs
  • The book contains guides and instructions not only from the algorithmic side, but from the developer's as well. Thus setting up an environment will not cause any problems.

Matthew Emerick, Apr 17, 2020 (5 stars, Amazon verified review)

Disclaimer: The publisher asked me to review this book and gave me a review copy. I promise to be 100% honest in how I feel about this book, both the good and the less so.

Overview: This book is for the intermediate or advanced Python programmer who is interested in computer vision with OpenCV. It's a computer vision book that lets you build projects to gain hands on experience in the field as well as a portfolio to show off to get hired. Outside of the libraries the book lists, you will need a depth sensor for the project in Chapter 2. If you can't or won't buy one, you can still do the remaining projects.

What I Like: The very first chapter gives a good introduction to using OpenCV with some traditional computer vision work. This gives the reader some confidence when moving forward. The rest of the book flows from intermediate to advanced quite well with increasingly more difficult programs. Each chapter's project focuses on a different ability of the OpenCV library.

What I Didn't Like: There isn't much to dislike about this book. I do wish that there were more projects, I'm a bit of a sucker for that. I also wish that there were suggestions as how to further extend the code or additional code exercises. That would have been great for the programmers transitioning from students with textbooks to software developers.

What I Would Like to See: I touched upon this in the last section, but the book could have been longer. Or it could have been broken up into two books of the same length, one for intermediate programmer and the other for advanced. Thankfully, there are other books to fill in the space, as well as plenty to find online.

Overall, this is a great book. I give it an easy 5 stars out of 5. It does exactly what is says it is going to and fills a gap that I had been trying to fill. Well done!

SJ, Sep 07, 2020 (5 stars, Amazon verified review)

The authors did a good job of explaining various applications that can be build using OpenCV. OpenCV is a widely used Vision-based library but finding a proper guide that can explain all applications is hard to find. If you are looking for a place for a good start to learn traditional vision and write them as code, this is a book to go.