You're reading from Hands-On Vision and Behavior for Self-Driving Cars Explore visual perception, lane detection, and object classification with Python 3 and OpenCV 4

Product type Paperback

Published in Oct 2020

Publisher Packt

ISBN-13 9781800203587

Length 374 pages

Edition 1st Edition

Languages

Python

Tools

OpenCV

Concepts

Computer Vision

Authors (2):

Krishtof Korda

Luca Venturi

Segmenting images with CNN

A typical semantic segmentation task receives an RGB image as input and needs to output an image with the raw segmentation, but generating that image directly could be problematic. We already know that classifiers generate their results using one-hot encoded labels, and we can do the same for semantic segmentation: instead of generating a single image with the raw segmentation, the network can create a series of one-hot encoded images. In our case, as we need 13 classes, the network will output 13 RGB images, one per label, with the following features:

One image describes only one label.
The pixels belonging to the label have a value of 1 in the red channel, while all the other pixels are marked as 0.
Each given pixel can be 1 in only one image; it will be 0 in all the remaining images.

This is a difficult task, but it does not necessarily require a particular architecture: a series of convolutional layers with same padding can do it; however, their cost...
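
The one-hot target format described above can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions, not the book's code: the helper names `to_one_hot_masks` and `to_label_map` are hypothetical, the label map is random data, and each of the 13 outputs is stored as a single-channel binary mask rather than as a full RGB image, since only one channel carries information.

```python
import numpy as np

NUM_CLASSES = 13  # number of labels mentioned in the text


def to_one_hot_masks(label_map, num_classes=NUM_CLASSES):
    """Convert an (H, W) array of integer class ids into
    (num_classes, H, W) binary masks, one mask per label.

    Hypothetical helper: single-channel masks stand in for the
    RGB images described in the text (assumption).
    """
    class_ids = np.arange(num_classes).reshape(-1, 1, 1)
    # Broadcasting compares every pixel against every class id.
    return (label_map[np.newaxis, ...] == class_ids).astype(np.uint8)


def to_label_map(masks):
    """Collapse the per-label masks back into a single (H, W) label map."""
    return np.argmax(masks, axis=0)


# Hypothetical 4x6 "raw segmentation": one class id per pixel.
rng = np.random.default_rng(0)
label_map = rng.integers(0, NUM_CLASSES, size=(4, 6))

masks = to_one_hot_masks(label_map)

print(masks.shape)                                   # (13, 4, 6)
# Each pixel is 1 in exactly one mask and 0 in all the others.
print(bool((masks.sum(axis=0) == 1).all()))          # True
# The original label map can be recovered from the masks.
print(bool(np.array_equal(to_label_map(masks), label_map)))  # True
```

Because every pixel is 1 in exactly one mask, taking the arg-max across the 13 masks recovers the class of each pixel; the same step could plausibly be applied to a network's 13 output maps to turn them back into a single segmentation image.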