Understand the inner workings of various neural network architectures and their implementation, including image classification, object detection, segmentation, generative adversarial networks, transformers, and diffusion models
Build solutions for real-world computer vision problems using PyTorch
All the code files are available on GitHub and can be run on Google Colab
Description
Whether you are a beginner or are looking to progress in your computer vision career, this book guides you through the fundamentals of neural networks (NNs) and PyTorch and how to implement state-of-the-art architectures for real-world tasks.
The second edition of Modern Computer Vision with PyTorch is fully updated to explain and provide practical examples of the latest multimodal models, CLIP, and Stable Diffusion.
You’ll discover best practices for working with images, tweaking hyperparameters, and moving models into production. As you progress, you'll implement various use cases for facial keypoint recognition, multi-object detection, segmentation, and human pose detection. This book provides a solid foundation in image generation as you explore different GAN architectures. You’ll leverage transformer-based architectures like ViT, TrOCR, BLIP2, and LayoutLM to perform various real-world tasks and build a diffusion model from scratch. Additionally, you’ll utilize foundation models' capabilities to perform zero-shot object detection and image segmentation. Finally, you’ll learn best practices for deploying a model to production.
By the end of this deep learning book, you'll confidently leverage modern NN architectures to solve real-world computer vision problems.
Who is this book for?
This book is for beginners to PyTorch and intermediate-level machine learning practitioners who want to learn computer vision techniques using deep learning and PyTorch. It's useful for those just getting started with neural networks, as it will enable readers to learn from real-world use cases accompanied by notebooks on GitHub. Basic knowledge of the Python programming language and ML is all you need to get started with this book. For more experienced computer vision scientists, this book takes you through more advanced models in the latter part of the book.
What you will learn
Get to grips with various transformer-based architectures for computer vision, CLIP, Segment-Anything, and Stable Diffusion, and test their applications, such as in-painting and pose transfer
Combine CV with NLP to perform OCR, key-value extraction from document images, visual question-answering, and generative AI tasks
Implement multi-object detection and segmentation
Leverage foundation models to perform object detection and segmentation without any training data points
Learn best practices for moving a model to production
An insightful and comprehensive guide that not only demystifies modern computer vision techniques but also includes practical code examples for real-world implementation. A must-read for both beginners and seasoned professionals looking to stay ahead in the field!
Feefo Verified review
Amazon Customer, Aug 29, 2024
5
In today's fast-paced tech landscape, understanding the 'why' behind your actions is crucial, and this book excels in teaching that. It not only explains what needs to be done in various scenarios but also explains why these steps are necessary. The book further supports learning with hands-on code examples and thorough explanations of each code block, bridging the gap between theory and practical application seamlessly.
Amazon Verified review
Reeti Pandey, Jul 28, 2024
5
"Modern Computer Vision with PyTorch" is an indispensable resource for anyone looking to dive deep into the world of computer vision and deep learning. Authors V Kishore Ayyadevara and Yeshwanth Reddy have meticulously crafted a comprehensive guide that covers both fundamental concepts and advanced applications, making it an excellent choice for both beginners and experienced practitioners.
The book's detailed coverage is one of its standout features. Starting with the basics of artificial neural networks, it gradually builds up to more complex topics like convolutional neural networks, transfer learning, and advanced object detection techniques. Each chapter is well-structured, providing clear explanations, practical examples, and code snippets that help solidify understanding. For instance, the chapters on PyTorch fundamentals and building deep neural networks are particularly useful for those new to the library, offering a thorough grounding in its capabilities and syntax.
Moreover, the book doesn't just stop at the theory. It includes hands-on projects and real-world applications, such as image classification, object detection, and image segmentation. This practical approach ensures that readers can directly apply what they've learned to real-life scenarios. The sections on using advanced architectures like VGG16, ResNet, and YOLO are especially noteworthy, as they offer insights into cutting-edge techniques in computer vision.
The inclusion of modern advancements such as Generative AI and practical aspects of image classification further enriches the content, making it relevant to today's fast-evolving AI landscape. The authors also do an excellent job of addressing practical challenges, such as handling imbalanced data and optimizing model performance, which are crucial for developing robust computer vision applications.
Amazon Verified review
dr t, Jun 22, 2024
5
First things first, this is a substantial book, spanning over 700 pages across 18 chapters.
The book is promoted as a bridge between academia and practical applications and is aimed towards newbies and intermediate readers - and by and large, it delivers. Starting with an introduction to neural networks and an introduction to PyTorch, these are then combined and the reader is guided through building a neural network using PyTorch.
Key computer vision concepts such as CNNs, object detection, and segmentation each get their own chapter. In the middle section, the book moves on to autoencoders, GANs, and reinforcement learning. However, this reader found the chapter on combining CV and NLP techniques particularly fascinating. Chapter 15 discusses vision transformers and their application in OCR – truly intriguing stuff. The book concludes with arguably the most useful chapter on deploying a model to production, covering creating APIs, containerisation, and running containers in the cloud. Highly, highly useful.
Each chapter includes Python exercises and test-yourself questions at the end. As usual with Packt books, the book is well-written and thoroughly covers the subject matter in a clear and accessible manner.
Highly recommended.
Amazon Verified review
Aditya, Jun 20, 2024
5
I really enjoyed how this book makes the connection between NLP and computer vision easy to understand. The book provides a detailed explanation of how transformers & diffusion models work, with multiple examples. It also covers, in deep under-the-hood detail, how the different blocks of these models work. The explanations made it easy for me to connect multiple dots and gain a strong intuition of Generative AI. The additional topics on traditional computer vision tasks make the book highly resourceful.
Kishore Ayyadevara is an entrepreneur and a hands-on leader working at the intersection of technology, data, and AI to identify and solve business problems. With over a decade of experience in leadership roles, Kishore has established and grown successful applied data science teams at American Express and Amazon, as well as a top health insurance company. In his current role, he is building a start-up focused on making AI more accessible to healthcare organizations. Outside of work, Kishore has shared his knowledge through his five books on ML/AI, is an inventor with 12 patents, and has been a speaker at multiple AI conferences.
Yeshwanth Reddy is a highly accomplished data scientist manager with 9+ years of experience in deep learning and document analysis. He has made significant contributions to the field, including building software for end-to-end document digitization, resulting in substantial cost savings. Yeshwanth's expertise extends to developing modules in OCR, word detection, and synthetic document generation. His groundbreaking work has been recognized through multiple patents. He has also created a few Python libraries. With a passion for disrupting unsupervised and self-supervised learning, Yeshwanth is dedicated to reducing reliance on manual annotation and driving innovative solutions in the field of data science.
What is the delivery time and cost of print books?
Shipping Details
USA:
Economy: Delivery to most addresses in the US within 10-15 business days
Premium: Trackable delivery to most addresses in the US within 3-8 business days
UK:
Economy: Delivery to most addresses in the U.K. within 7-9 business days. Shipments are not trackable
Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days! Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands
EU:
Premium: Trackable delivery to most EU destinations within 4-9 business days.
Australia:
Economy: Can deliver to P. O. Boxes and private residences. Trackable service with delivery to addresses in Australia only. Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro. Delivery time is up to 15 business days for remote areas of WA, NT & QLD.
Premium: Delivery to addresses in Australia only. Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days, based on the distance to the destination, following dispatch.
India:
Premium: Delivery to most Indian addresses within 5-6 business days
Rest of the World:
Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days
Asia:
Premium: Delivery to most Asian addresses within 5-9 business days
Disclaimer: All orders received before 5 PM U.K. time start printing on the next business day, so the estimated delivery times start from the next day as well. Orders received after 5 PM U.K. time (in our internal systems) on a business day, or at any time on the weekend, will begin printing on the second business day after the order. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.
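As a rough illustration of the cutoff rule in the disclaimer, the Python sketch below estimates the day printing would begin. It is not Packt's actual system: the function names are hypothetical, the timestamp is assumed to already be in U.K. local time, public holidays are ignored, and an order placed at exactly 5 PM is assumed to count as "after 5 PM".

```python
from datetime import date, datetime, time, timedelta

CUTOFF = time(17, 0)  # 5 PM U.K. time, per the disclaimer


def next_business_day(d: date) -> date:
    """Return the first Monday-Friday date strictly after d (holidays ignored)."""
    d += timedelta(days=1)
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)
    return d


def printing_start(order_time_uk: datetime) -> date:
    """Estimate the printing start date for an order placed at a U.K. local time."""
    day = order_time_uk.date()
    first = next_business_day(day)
    # Before the cutoff on a business day: printing starts the next business day.
    if day.weekday() < 5 and order_time_uk.time() < CUTOFF:
        return first
    # After the cutoff, or any time on a weekend: the second business day.
    return next_business_day(first)


# Wednesday 19 June 2024, 11 AM -> Thursday 20 June 2024
print(printing_start(datetime(2024, 6, 19, 11, 0)))
# Wednesday 19 June 2024, 9 PM -> Friday 21 June 2024
print(printing_start(datetime(2024, 6, 19, 21, 0)))
```

The estimated delivery window would then be counted from the returned date rather than from the order date.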
Unfortunately, due to several restrictions, we are unable to ship to the following countries:
Afghanistan
American Samoa
Belarus
Brunei Darussalam
Central African Republic
Democratic Republic of the Congo
Eritrea
Guinea-Bissau
Iran
Lebanon
Libyan Arab Jamahiriya
Somalia
Sudan
Russian Federation
Syrian Arab Republic
Ukraine
Venezuela
What is customs duty/charge?
Customs duties are charges levied on goods when they cross international borders; in effect, they are a tax imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.
Do I have to pay customs charges for the print book order?
Orders shipped to the countries listed under EU27 will not incur customs charges; these are paid by Packt as part of the order.
For shipments to countries outside the EU27, customs duty or localized taxes may apply. These are charged by the recipient country, must be paid by the customer, and are not included in the shipping charges on the order.
How do I know my customs duty charges?
The amount of duty payable varies greatly depending on the imported goods, the country of origin, and several other factors, such as the total invoice amount, dimensions such as weight, and other criteria applicable in your country.
For example:
If you live in Mexico and the declared value of your ordered items is over $50, you will have to pay the courier service an additional import tax of 19% to receive your package (for example, $9.50 on a declared value of $50).
Whereas if you live in Turkey and the declared value of your ordered items is over €22, you will have to pay the courier service an additional import tax of 18% to receive your package (for example, €3.96 on a declared value of €22).
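The arithmetic behind these two examples can be checked with a short Python sketch. The function name is hypothetical and the calculation is deliberately simplified: it applies a flat percentage to the declared value and leaves out thresholds, exemptions, and any other local rules, so treat it as an estimate rather than what a courier will actually charge.

```python
from decimal import Decimal, ROUND_HALF_UP


def import_tax(declared_value: Decimal, rate_percent: Decimal) -> Decimal:
    """Apply a flat percentage rate to a declared value, rounded to 2 decimals."""
    tax = declared_value * rate_percent / Decimal("100")
    return tax.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Mexico example from the text: 19% on a declared value of $50
print(import_tax(Decimal("50"), Decimal("19")))  # 9.50
# Turkey example from the text: 18% on a declared value of 22 euros
print(import_tax(Decimal("22"), Decimal("18")))  # 3.96
```

Decimal is used instead of float so that currency amounts round predictably to two places.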
How can I cancel my order?
Cancellation Policy for Published Printed Books:
You can cancel any order within 1 hour of placing it. Simply contact customercare@packt.com with your order details or payment transaction ID. If your order has already entered the shipment process, we will do our best to stop it. However, if it is already on its way to you, you can contact us at customercare@packt.com once you receive it and use the returns and refund process.
Please understand that Packt Publishing cannot provide refunds or cancel any order except in the cases described in our Return Policy (i.e., Packt Publishing agrees to replace your printed book because it arrived damaged or with a material defect). Otherwise, Packt Publishing will not accept returns.
What is your returns and refunds policy?
Return Policy:
We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work, or is unacceptably late, please contact the Customer Relations Team at customercare@packt.com with the order number and issue details, as explained below:
If you ordered an item (eBook, Video, or Print Book) incorrectly or accidentally, please contact the Customer Relations Team at customercare@packt.com within one hour of placing the order, and we will replace the item or refund its cost.
If your eBook or Video file is faulty, or a fault occurs while the eBook or Video is being made available to you (i.e., during download), contact the Customer Relations Team at customercare@packt.com within 14 days of purchase, and they will resolve the issue for you.
You will have a choice of replacement or refund for the problem items (damaged, defective, or incorrect).
Once the Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
If you are requesting a refund for only one book from a multi-item order, we will refund the appropriate single item.
Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.
On the off chance your printed book arrives damaged or with a material defect, contact our Customer Relations Team at customercare@packt.com within 14 days of receipt of the book with appropriate evidence of the damage, and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made on a print-on-demand basis by Packt's professional book-printing partner.
What tax is charged?
Currently, no tax is charged on the purchase of any print book (subject to change based on laws and regulations). A localized VAT fee is charged only to our European and UK customers on the eBooks, Videos, and subscriptions that they buy. GST is charged to Indian customers for eBook and Video purchases.
What payment methods can I use?
You can pay with the following card types:
Visa Debit
Visa Credit
MasterCard
PayPal