The theory of attention within neural networks
In the previous chapter, our sequence-to-sequence model for sentence translation (with no attention implemented) used both an encoder and a decoder. The encoder produced a hidden state from the input sentence, a fixed-size representation of the whole sentence. The decoder then used this hidden state to perform the translation steps. A basic graphical illustration of this is as follows:
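To make the bottleneck concrete, here is a minimal NumPy sketch (not the model from the previous chapter; the recurrence, weight matrices, and dimensions are illustrative assumptions) showing that a plain RNN encoder compresses a sentence of any length into one fixed-size hidden state, which is all the decoder receives:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # hidden/embedding size (illustrative choice)

def encode(embeddings, W, U):
    """Simple RNN encoder: h_t = tanh(W x_t + U h_{t-1})."""
    h = np.zeros(W.shape[0])
    for x in embeddings:
        h = np.tanh(W @ x + U @ h)
    return h  # one fixed-size vector summarising the whole sentence

W_enc, U_enc = rng.normal(size=(d, d)), rng.normal(size=(d, d))
sentence = [rng.normal(size=d) for _ in range(5)]  # five token embeddings

hidden_state = encode(sentence, W_enc, U_enc)
# The decoder sees only this single vector, regardless of sentence length
print(hidden_state.shape)  # -> (4,)
```

Note that whether the input sentence has 5 tokens or 50, the decoder still starts from a single `d`-dimensional vector; this is the bottleneck that attention is designed to relieve.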
However, decoding from the entire hidden state is not necessarily the most efficient way of performing this task. The hidden state represents the entirety of the input sentence, but for some tasks (such as predicting the next word in a sentence) we do not need to consider the entire input sentence, only the parts that are relevant to the prediction we are trying to make. We can show that by using attention within our sequence...
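The idea of focusing only on the relevant parts of the input can be sketched as dot-product attention over per-token encoder states (a simplified illustration under assumed shapes, not the mechanism as implemented later in the chapter): each input token gets a weight, the weights form a probability distribution, and the decoder's context becomes a weighted sum rather than one fixed vector:

```python
import numpy as np

rng = np.random.default_rng(0)

def attention(query, encoder_states):
    """Dot-product attention over one encoder state per input token."""
    scores = encoder_states @ query        # one relevance score per token
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()               # softmax: weights sum to 1
    context = weights @ encoder_states     # weighted sum of encoder states
    return weights, context

encoder_states = rng.normal(size=(5, 4))  # one state per input token
query = rng.normal(size=4)                # current decoder state (assumed)

weights, context = attention(query, encoder_states)
print(weights.sum())  # -> 1.0 (a distribution over input tokens)
```

Tokens with large weights dominate the context vector, so the decoder effectively attends to the relevant parts of the input rather than to the sentence as a whole.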