You're reading from Modern Computer Vision with PyTorch: A practical roadmap from deep learning fundamentals to advanced applications and Generative AI

Product type: Paperback
Published in: Jun 2024
Publisher: Packt
ISBN-13: 9781803231334
Length: 746 pages
Edition: 2nd Edition
Languages: Python
Tools: PyTorch
Concepts: Computer Vision
Authors (2): V Kishore Ayyadevara, Yeshwanth Reddy

Summary

In this chapter, we began by creating a training dataset for object localization and detection. We then learned about SelectiveSearch, a region proposal technique that recommends candidate regions based on the similarity of neighboring pixels. We also learned how to calculate the IoU metric, which measures how closely a predicted bounding box overlaps the ground-truth box around an object in the image.
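As a quick refresher on the IoU metric, here is a minimal sketch in plain Python. It assumes boxes are given as (x_min, y_min, x_max, y_max) tuples; the function name `iou` and that box format are illustrative choices, not necessarily the ones used in the chapter's code.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Assumption: each box is (x_min, y_min, x_max, y_max).
    """
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so non-overlapping boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

An IoU of 1 means the predicted box coincides with the ground truth, and 0 means the two boxes do not overlap at all.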

In addition, we used non-max suppression to keep one bounding box per object within an image, and then built R-CNN and Fast R-CNN models from scratch. We also explored why R-CNN is slow and how Fast R-CNN makes inference faster by passing the image through the network once and using RoI pooling to extract each region proposal's features from the resulting feature map. Finally, we saw that relying on a separate model for region proposals increases the time taken to predict on new images.
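The idea behind non-max suppression can be sketched as a greedy loop in plain Python. This is an illustrative version under stated assumptions, not the chapter's implementation: boxes are (x_min, y_min, x_max, y_max) tuples, and the names `box_iou`, `nms`, and `iou_threshold` are hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-max suppression; returns indices of the boxes to keep."""
    # Visit boxes from the highest confidence score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that was already kept.
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```

The threshold controls how aggressively overlapping boxes are merged: a lower value suppresses more boxes, while a higher value keeps boxes that overlap substantially.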

In the next chapter, we will learn about some of the modern object detection techniques that are used...