Visual question answering (VQA) is the task of answering an open-ended text question about a given image. It was proposed by Antol et al. in 2015 (https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Antol_VQA_Visual_Question_ICCV_2015_paper.pdf). The task lies at the intersection of computer vision and natural language processing: it requires understanding the image as well as parsing and understanding the text question. Because of its multimodal nature and its well-defined quantitative evaluation metric, VQA is considered an important artificial intelligence task. It also has potential practical applications, such as assisting visually impaired users.
A few examples of the VQA task are illustrated in the following table:
| Question | Answer |
| --- | --- |
| How many giraffes can be seen? | 2 |
| Is the bus door open? | Yes |
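A common early approach to VQA encodes the image and the question separately, fuses the two feature vectors, and treats answering as classification over a fixed answer vocabulary. The following is a minimal NumPy sketch of that pipeline; the encoders are random stand-ins and all names, shapes, and the toy answer vocabulary are hypothetical, not from any particular paper or library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy answer vocabulary; real systems use thousands of answers.
ANSWERS = ["yes", "no", "2"]

def encode_image(image):
    """Stand-in for a CNN image encoder; returns a 512-d feature vector."""
    return rng.standard_normal(512)

def encode_question(question):
    """Stand-in for an RNN/bag-of-words question encoder; also 512-d."""
    return rng.standard_normal(512)

def vqa_answer(image, question, W):
    # Element-wise fusion of the two modalities, then a linear
    # classifier over the fixed answer vocabulary.
    fused = encode_image(image) * encode_question(question)
    logits = W @ fused
    return ANSWERS[int(np.argmax(logits))]

# Randomly initialized classifier weights (untrained, for illustration only).
W = rng.standard_normal((len(ANSWERS), 512))
print(vqa_answer("bus.jpg", "Is the bus door open?", W))
```

With random weights the predicted answer is meaningless; in practice the encoders and the classifier are trained jointly on question-answer pairs such as those in the table above.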