You're reading from Hands-On Vision and Behavior for Self-Driving Cars Explore visual perception, lane detection, and object classification with Python 3 and OpenCV 4

Product type Paperback

Published in Oct 2020

Publisher Packt

ISBN-13 9781800203587

Length 374 pages

Edition 1st Edition

Languages

Python

Tools

OpenCV

Concepts

Computer Vision

Authors (2):

Krishtof Korda

Luca Venturi

View More author details

Table of Contents (17) Chapters

Preface

1. Section 1: OpenCV and Sensors and Signals

2. Chapter 1: OpenCV Basics and Camera Calibration FREE CHAPTER

3. Chapter 2: Understanding and Working with Signals

4. Chapter 3: Lane Detection

5. Section 2: Improving How the Self-Driving Car Works with Deep Learning and Neural Networks

6. Chapter 4: Deep Learning with Neural Networks

7. Chapter 5: Deep Learning Workflow

8. Chapter 6: Improving Your Neural Network

9. Chapter 7: Detecting Pedestrians and Traffic Lights

10. Chapter 8: Behavioral Cloning

11. Chapter 9: Semantic Segmentation

12. Section 3: Mapping and Controls

13. Chapter 10: Steering, Throttle, and Brake Control

14. Chapter 11: Mapping Our Environments

15. Assessments

16. Other Books You May Enjoy

Leave a review - let other readers know what you think

Introduction to OpenCV and NumPy

OpenCV is a computer vision and machine learning library that has been developed for more than 20 years and provides an impressive number of functionalities. Despite some inconsistencies in the API, its simplicity and the remarkable number of algorithms implemented make it an extremely popular library and an excellent choice for many situations.

OpenCV is written in C++, but there are bindings for Python, Java, and Android.

In this book, we will focus on OpenCV for Python, with all the code tested using OpenCV 4.2.

OpenCV in Python is provided by opencv-python, which can be installed using the following command:

pip install opencv-python

OpenCV can take advantage of hardware acceleration, but to get the best performance, you might need to build it from the source code, with different flags than the default, to optimize it for your target hardware.

OpenCV and NumPy

The Python bindings use NumPy, which increases the flexibility and makes it compatible with many other libraries. As an OpenCV image is a NumPy array, you can use normal NumPy operations to get information about the image. A good understanding of NumPy can improve the performance and reduce the length of your code.

Let's dive right in with some quick examples of what you can do with NumPy in OpenCV.

Image size

The size of the image can be retrieved using the shape attribute:

print("Image size: ", image.shape)

For a grayscale image of 50x50, image.shape() would return the tuple (50, 50), while for an RGB image, the result would be (50, 50, 3).

False friends

In NumPy, the attribute size is the size in bytes of the array; for a 50x50 gray image, it would be 2,500, while for the same image in RGB, it would be 7,500. It's the shape attribute that contains the size of the image – (50, 50) and (50, 50, 3), respectively.

Grayscale images

Grayscale images are represented by a two-dimensional NumPy array. The first index affects the rows (y coordinate) and the second index the columns (x coordinate). The y coordinates have their origin in the top corner of the image and x coordinates have their origin in the left corner of the image.

It is possible to create a black image using np.zeros(), which initializes all the pixels to 0:

black = np.zeros([100,100],dtype=np.uint8)  # Creates a black image

The previous code creates a grayscale image with size (100, 100), composed of 10,000 unsigned bytes (dtype=np.uint8).

To create an image with pixels with a different value than 0, you can use the full() method:

white = np.full([50, 50], 255, dtype=np.uint8)

To change the color of all the pixels at once, it's possible to use the [:] notation:

img[:] = 64        # Change the pixels color to dark gray

To affect only some rows, it is enough to provide a range of rows in the first index:

img[10:20] = 192   # Paints 10 rows with light gray

The previous code changes the color of rows 10-20, including row 10, but excluding row 20.

The same mechanism works for columns; you just need to specify the range in the second index. To instruct NumPy to include a full index, we use the [:] notation that we already encountered:

img[:, 10:20] = 64 # Paints 10 columns with dark gray

You can also combine operations on rows and columns, selecting a rectangular area:

img[90:100, 90:100] = 0  # Paints a 10x10 area with black

It is, of course, possible to operate on a single pixel, as you would do on a normal array:

img[50, 50] = 0  # Paints one pixel with black

It is possible to use NumPy to select a part of an image, also called the Region Of Interest (ROI). For example, the following code copies a 10x10 ROI from the position (90, 90) to the position (80, 80):

roi = img[90:100, 90:100]
img[80:90, 80:90] = roi

The following is the result of the previous operations:

Figure 1.1 – Some manipulation of images using NumPy slicing

To make a copy of an image, you can simply use the copy() method:

image2 = image.copy()

RGB images

RGB images differ from grayscale because they are three-dimensional, with the third index representing the three channels. Please note that OpenCV stores the images in BGR format, not RGB, so channel 0 is blue, channel 1 is green, and channel 2 is red.

Important note

OpenCV stores the images as BGR, not RGB. In the rest of the book, when talking about RGB images, it will only mean that it is a 24-bit color image, but the internal representation will usually be BGR.

To create an RGB image, we need to provide three sizes:

rgb = np.zeros([100, 100, 3],dtype=np.uint8)

If you were going to run the same code previously used on the grayscale image with the new RGB image (skipping the third index), you would get the same result. This is because NumPy would apply the same color to all the three channels, which results in a shade of gray.

To select a color, it is enough to provide the third index:

rgb[:, :, 2] = 255       # Makes the image red

In NumPy, it is also possible to select rows, columns, or channels that are not contiguous. You can do this by simply providing a tuple with the required indexes. To make the image magenta, you need to set the blue and red channels to 255, which can be achieved with the following code:

rgb[:, :, (0, 2)] = 255  # Makes the image magenta

You can convert an RGB image into grayscale using cvtColor():

gray = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)

You're reading from Hands-On Vision and Behavior for Self-Driving Cars Explore visual perception, lane detection, and object classification with Python 3 and OpenCV 4

Table of Contents (17) Chapters

Introduction to OpenCV and NumPy

OpenCV and NumPy

Image size

Grayscale images

RGB images

Authors (2)

Other recommended products

Personalised recommendations for you