Chapter 1, A Fast Introduction to Computer Vision, gives a brief overview of what constitutes computer vision, its applications in different fields, and how its problems are subdivided into different types. The chapter also covers basic image reading, with code in OpenCV. There is also an overview of different color spaces and their visualizations.
Chapter 2, Libraries, Development Platforms, and Datasets, provides detailed instructions on how to set up a development environment and install libraries inside it. The datasets introduced in this chapter include both those used in this book and currently popular datasets for each subdomain of computer vision. The chapter includes download links for the datasets and wrappers for loading them with libraries such as Keras.
Chapter 3, Image Filtering and Transformations in OpenCV, explains different filtering techniques, including linear and nonlinear filters, and their implementation in OpenCV. This chapter also includes techniques for transforming an image, such as linear translation, rotation around a given axis, and complete affine transformation. The techniques introduced in the chapter help in creating applications across several domains and in enhancing image quality.
Chapter 4, What is a Feature? introduces features and their importance in various computer vision applications. The chapter covers the Harris corner detector for basic features, the FAST feature detector, and ORB features, which are both robust and fast. There are also demonstrations in OpenCV of applications that use these features, including matching a template to the original image and matching two images of the same object. There is also a discussion of black box features and their necessity.
Chapter 5, Convolutional Neural Networks, begins with an introduction to simple neural networks and their components. The chapter also introduces convolutional neural networks in Keras, with components such as activation, pooling, and fully connected layers. The effect of parameter changes on each component is explained with results that the reader can easily reproduce. This understanding is further strengthened by implementing a simple CNN model using an image dataset. Along with the popular CNN architectures VGG, Inception, and ResNet, there is an introduction to transfer learning. This leads to a look at state-of-the-art deep learning models for image classification.
Chapter 6, Feature-Based Object Detection, develops an understanding of the image recognition problem. Detection algorithms, such as face detectors, are explained with OpenCV. You will also see some recent and popular deep learning-based object detection algorithms, such as Faster R-CNN, SSD, and others. The effectiveness of each of these is demonstrated on custom images with the TensorFlow Object Detection API.
Chapter 7, Segmentation and Tracking, consists of two parts. The first introduces the image instance recognition problem, with an implementation of a deep learning model for segmentation. The second part begins with an introduction to the MOSSE tracker from OpenCV, which is both efficient and fast. This part also introduces deep learning-based tracking of multiple objects.
Chapter 8, 3D Computer Vision, describes how to analyze images from a geometrical point of view. Readers will first understand the challenges in computing depth from a single image, and later learn how to solve them using multiple images. The chapter also describes how to track the camera pose of a moving camera using visual odometry. Lastly, the SLAM problem is introduced, with solutions presented using the visual SLAM technique, which uses only camera images as input.
Appendix A, Mathematics for Computer Vision, introduces the basic concepts required to understand computer vision algorithms. The matrix and vector operations introduced here are accompanied by Python implementations. The appendix also contains an introduction to probability theory, with explanations of various distributions.
Appendix B, Machine Learning for Computer Vision, gives an overview of machine learning modeling and the key terms involved. Readers will also learn about the curse of dimensionality and the various preprocessing and postprocessing steps involved. There are also explanations of several evaluation tools and methods for machine learning models, which are also used extensively in vision applications.