Hands-On Vision and Behavior for Self-Driving Cars

Explore visual perception, lane detection, and object classification with Python 3 and OpenCV 4

Product type Paperback
Published in Oct 2020
Publisher Packt
ISBN-13 9781800203587
Length 374 pages
Edition 1st Edition
Authors (2):
Krishtof Korda
Luca Venturi

Table of Contents (17 chapters)

Preface
1. Section 1: OpenCV and Sensors and Signals
2. Chapter 1: OpenCV Basics and Camera Calibration
3. Chapter 2: Understanding and Working with Signals
4. Chapter 3: Lane Detection
5. Section 2: Improving How the Self-Driving Car Works with Deep Learning and Neural Networks
6. Chapter 4: Deep Learning with Neural Networks
7. Chapter 5: Deep Learning Workflow
8. Chapter 6: Improving Your Neural Network
9. Chapter 7: Detecting Pedestrians and Traffic Lights
10. Chapter 8: Behavioral Cloning
11. Chapter 9: Semantic Segmentation
12. Section 3: Mapping and Controls
13. Chapter 10: Steering, Throttle, and Brake Control
14. Chapter 11: Mapping Our Environments
15. Assessments
16. Other Books You May Enjoy

Introduction to OpenCV and NumPy

OpenCV is a computer vision and machine learning library that has been developed for more than 20 years and provides an impressive number of functionalities. Despite some inconsistencies in the API, its simplicity and the remarkable number of algorithms implemented make it an extremely popular library and an excellent choice for many situations.

OpenCV is written in C++, but there are bindings for Python, Java, and Android.

In this book, we will focus on OpenCV for Python, with all the code tested using OpenCV 4.2.

OpenCV in Python is provided by opencv-python, which can be installed using the following command:

pip install opencv-python

OpenCV can take advantage of hardware acceleration, but to get the best performance, you might need to build it from the source code, with different flags than the default, to optimize it for your target hardware.

OpenCV and NumPy

The Python bindings represent images as NumPy arrays, which makes OpenCV more flexible and compatible with many other libraries. Because an OpenCV image is a NumPy array, you can use normal NumPy operations to get information about the image. A good understanding of NumPy can improve the performance and reduce the length of your code.

Let's dive right in with some quick examples of what you can do with NumPy in OpenCV.

Image size

The size of the image can be retrieved using the shape attribute:

print("Image size: ", image.shape)

For a 50x50 grayscale image, image.shape would be the tuple (50, 50), while for an RGB image, it would be (50, 50, 3). Note that shape is an attribute, not a method, so it is read without parentheses.
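As a small sketch of how shape is typically read, the following uses two stand-in arrays (the names gray and color are assumptions for illustration; real images would normally come from a file). A non-square size is used so that the order of the values, rows first and columns second, is visible:

```python
import numpy as np

# Hypothetical stand-in images, built by hand instead of loaded from disk.
gray = np.zeros((50, 80), dtype=np.uint8)      # 50 rows (height), 80 columns (width)
color = np.zeros((50, 80, 3), dtype=np.uint8)  # same size, three channels

height, width = gray.shape[:2]
print("Grayscale:", height, "rows x", width, "columns")

# shape[:2] works in both cases; the channel count needs a dimension check.
height, width = color.shape[:2]
channels = color.shape[2] if color.ndim == 3 else 1
print("Color:", height, "rows x", width, "columns,", channels, "channels")
```

Slicing with shape[:2] avoids an unpacking error when the same code has to handle both grayscale and color images.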

False friends

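In NumPy, the attribute size is the total number of elements in the array, not its dimensions; for a 50x50 gray image, it would be 2,500, while for the same image in RGB, it would be 7,500. With 8-bit pixels, this number also equals the size in bytes, which NumPy exposes as nbytes for any data type. It's the shape attribute that contains the dimensions of the image: (50, 50) and (50, 50, 3), respectively.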
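To make the difference concrete, here is a minimal sketch that prints shape, size, and nbytes for three stand-in arrays (the variable names are assumptions for illustration). The float32 case shows that the element count and the byte count only coincide for one-byte pixels:

```python
import numpy as np

gray8 = np.zeros((50, 50), dtype=np.uint8)      # one byte per pixel
color8 = np.zeros((50, 50, 3), dtype=np.uint8)  # three bytes per pixel
gray32 = np.zeros((50, 50), dtype=np.float32)   # four bytes per pixel

for name, img in [("gray8", gray8), ("color8", color8), ("gray32", gray32)]:
    print(name, "shape:", img.shape, "size:", img.size, "nbytes:", img.nbytes)
```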

Grayscale images

Grayscale images are represented by a two-dimensional NumPy array. The first index selects the row (y coordinate) and the second index selects the column (x coordinate). Both coordinates have their origin in the top-left corner of the image: y grows downward and x grows to the right.

It is possible to create a black image using np.zeros(), which initializes all the pixels to 0:

import numpy as np
black = np.zeros([100, 100], dtype=np.uint8)  # Creates a black image

The previous code creates a grayscale image with size (100, 100), composed of 10,000 unsigned bytes (dtype=np.uint8).

To create an image whose pixels have a value other than 0, you can use the np.full() function:

white = np.full([50, 50], 255, dtype=np.uint8)
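The two constructors can be compared side by side. The sketch below uses a non-square size (an assumption chosen for illustration) to stress that the shape is given as (rows, columns), and adds np.full_like() as one possible way to reuse the shape and data type of an existing image:

```python
import numpy as np

# Shapes are given as (rows, columns), that is (height, width).
black = np.zeros((100, 200), dtype=np.uint8)      # 100 rows, 200 columns, all 0
white = np.full((100, 200), 255, dtype=np.uint8)  # same size, all 255
gray = np.full_like(black, 128)                   # inherits shape and dtype from black

print(black.shape, black.dtype, black.min(), black.max())
print(white.shape, white.dtype, white.min(), white.max())
print(gray.shape, gray.dtype, gray[0, 0])
```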

To change the color of all the pixels at once, it's possible to use the [:] notation:

img[:] = 64        # Change the pixels color to dark gray

To affect only some rows, it is enough to provide a range of rows in the first index:

img[10:20] = 192   # Paints 10 rows with light gray

The previous code changes the color of rows 10-20, including row 10, but excluding row 20.

The same mechanism works for columns; you just need to specify the range in the second index. To tell NumPy to include every row, use a bare colon (:) as the first index, the same notation that we already encountered in [:]:

img[:, 10:20] = 64 # Paints 10 columns with dark gray

You can also combine operations on rows and columns, selecting a rectangular area:

img[90:100, 90:100] = 0  # Paints a 10x10 area with black

It is, of course, possible to operate on a single pixel, as you would do on a normal array:

img[50, 50] = 0  # Paints one pixel with black
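The slicing operations shown so far can be run together as one script. This is a sketch that assumes img is a 100x100 grayscale image created with np.zeros(); the printed pixels are chosen to show which regions each assignment touched:

```python
import numpy as np

img = np.zeros((100, 100), dtype=np.uint8)

img[:] = 64              # every pixel dark gray
img[10:20] = 192         # rows 10 to 19 light gray
img[:, 10:20] = 64       # columns 10 to 19 dark gray (overwrites part of the band)
img[90:100, 90:100] = 0  # 10x10 black square in the bottom-right corner
img[50, 50] = 0          # single pixel at row 50, column 50

print(img[15, 50])  # inside the light-gray band
print(img[15, 15])  # where the dark-gray columns cross the band
print(img[20, 50])  # row 20 is excluded from the 10:20 slice
print(img[95, 95])  # inside the black square
print(np.count_nonzero(img == 0))  # number of black pixels
```

Note that later assignments overwrite earlier ones wherever the regions overlap, so the order of the statements matters.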

It is possible to use NumPy to select a part of an image, also called the Region Of Interest (ROI). For example, the following code copies a 10x10 ROI from the position (90, 90) to the position (80, 80):

roi = img[90:100, 90:100]
img[80:90, 80:90] = roi 

The following is the result of the previous operations:

Figure 1.1 – Some manipulation of images using NumPy slicing

To make a copy of an image, you can simply use the copy() method:

image2 = image.copy()
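One detail worth keeping in mind is that a NumPy slice such as an ROI is a view that shares memory with the original image, while copy() allocates independent storage. The following sketch illustrates the difference; the names roi and snapshot are assumptions for illustration:

```python
import numpy as np

img = np.full((100, 100), 64, dtype=np.uint8)

roi = img[90:100, 90:100]  # a view: shares memory with img
roi[:] = 255               # writing through the view changes img too
print(img[95, 95])

snapshot = img.copy()      # independent copy with its own memory
snapshot[:] = 0            # does not affect img
print(img[95, 95], snapshot[95, 95])

print(np.shares_memory(img, roi), np.shares_memory(img, snapshot))
```

If an ROI has to be modified without touching the source image, calling copy() on the slice first is a simple way to do it.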

RGB images

RGB images differ from grayscale because they are three-dimensional, with the third index representing the three channels. Please note that OpenCV stores the images in BGR format, not RGB, so channel 0 is blue, channel 1 is green, and channel 2 is red.

Important note

OpenCV stores the images as BGR, not RGB. In the rest of the book, when talking about RGB images, it will only mean that it is a 24-bit color image, but the internal representation will usually be BGR.

To create an RGB image, we need to provide three sizes:

rgb = np.zeros([100, 100, 3],dtype=np.uint8)  

If you ran the code previously used on the grayscale image with the new RGB image (skipping the third index), you would get the same result. This is because NumPy applies the same value to all three channels, which results in a shade of gray.
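A short sketch of this behavior, assuming the same 100x100 three-channel image: the scalar on the right-hand side is broadcast to every channel of the selected pixels, so each pixel ends up with three equal values.

```python
import numpy as np

rgb = np.zeros((100, 100, 3), dtype=np.uint8)

rgb[:] = 64       # the scalar is broadcast to all three channels
rgb[10:20] = 192  # rows 10 to 19, again on every channel

print(rgb[0, 0])   # one pixel is now a 3-element array
print(rgb[15, 0])

# A pixel is gray when its three channel values are equal.
is_gray = (rgb[..., 0] == rgb[..., 1]) & (rgb[..., 1] == rgb[..., 2])
print(is_gray.all())
```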

To select a color, it is enough to provide the third index:

rgb[:, :, 2] = 255       # Makes the image red

In NumPy, it is also possible to select rows, columns, or channels that are not contiguous. You can do this by simply providing a tuple with the required indexes. To make the image magenta, you need to set the blue and red channels to 255, which can be achieved with the following code:

rgb[:, :, (0, 2)] = 255  # Makes the image magenta
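The channel operations can be checked by printing a pixel after each step. In this sketch, the constants B, G, and R are hypothetical helper names introduced only to make the BGR order readable; the last assignment shows that a three-value tuple can also be assigned to a region in one step:

```python
import numpy as np

B, G, R = 0, 1, 2  # hypothetical helper names for the BGR channel order

rgb = np.zeros((100, 100, 3), dtype=np.uint8)

rgb[:, :, R] = 255       # red only
print(rgb[0, 0])

rgb[:, :, (B, R)] = 255  # blue + red = magenta
print(rgb[0, 0])

# Assigning a (B, G, R) triple sets all channels of the selected pixels at once.
rgb[:50] = (0, 255, 255)  # top half yellow (green + red)
print(rgb[0, 0], rgb[99, 0])
```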

You can convert an RGB image into grayscale using cvtColor():

gray = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
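To see what this conversion does to the array, here is a NumPy-only sketch. It is not OpenCV's implementation; it assumes the weighted sum that the OpenCV documentation gives for RGB-to-gray conversion (0.299 R + 0.587 G + 0.114 B), and the function name bgr_to_gray is hypothetical. Results from cvtColor() may differ by one gray level because of rounding.

```python
import numpy as np

def bgr_to_gray(bgr):
    """Approximate BGR-to-grayscale conversion using a weighted channel sum."""
    weights = np.array([0.114, 0.587, 0.299])  # B, G, R order
    gray = bgr.astype(np.float64) @ weights     # collapses the channel axis
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[0, 0] = (255, 255, 255)  # white
bgr[0, 1] = (0, 0, 255)      # pure red in BGR order
bgr[1, 0] = (255, 0, 0)      # pure blue in BGR order

gray = bgr_to_gray(bgr)
print(gray.shape, gray.dtype)
print(gray)
```

The output has two dimensions instead of three, which is the same change in shape you would observe between an RGB image and its grayscale version.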