Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Generative AI with Python and TensorFlow 2

You're reading from   Generative AI with Python and TensorFlow 2 Create images, text, and music with VAEs, GANs, LSTMs, Transformer models

Arrow left icon
Product type Paperback
Published in Apr 2021
Publisher Packt
ISBN-13 9781800200883
Length 488 pages
Edition 1st Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Raghav Bali Raghav Bali
Author Profile Icon Raghav Bali
Raghav Bali
Joseph Babcock Joseph Babcock
Author Profile Icon Joseph Babcock
Joseph Babcock
Arrow right icon
View More author details
Toc

Table of Contents (16) Chapters Close

Preface 1. An Introduction to Generative AI: "Drawing" Data from Models 2. Setting Up a TensorFlow Lab FREE CHAPTER 3. Building Blocks of Deep Neural Networks 4. Teaching Networks to Generate Digits 5. Painting Pictures with Neural Networks Using VAEs 6. Image Generation with GANs 7. Style Transfer with GANs 8. Deepfakes with GANs 9. The Rise of Methods for Text Generation 10. NLP 2.0: Using Transformers to Generate Text 11. Composing Music with Generative Models 12. Play Video Games with Generative AI: GAIL 13. Emerging Applications in Generative AI 14. Other Books You May Enjoy
15. Index

Attention

The LSTM-based architecture we used to prepare our first language model for text generation had one major limitation. The RNN layer (generally speaking, it could be LSTM, or GRU, etc.) takes in a context window of a defined size as input and encodes all of it into a single vector. This bottleneck vector needs to capture a lot of information in itself before the decoding stage can use it to start generating the next token.

Attention is one of the most powerful concepts in the deep learning space that really changed the game. The core idea behind the attention mechanism is to make use of all interim hidden states of the RNN to decide which one to focus upon before it is used by the decoding stage. A more formal way of presenting attention is:

Given a vector of values (all the hidden states of the RNN) and a query vector (this could be the decoder state), attention is a technique to compute a weighted sum of the values, dependent on the query.

The weighted sum acts...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime