Transformers for Natural Language Processing: Build, train, and fine-tune deep neural network architectures for NLP with Python, Hugging Face, and OpenAI's GPT-3, ChatGPT, and GPT-4, Second Edition

Author: Denis Rothman
Rating: 3.8 (28 Ratings)
eBook, Mar 2022, 602 pages, 2nd Edition
eBook: NZ$105.99
Paperback: NZ$131.99
Subscription: Free Trial

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

Getting Started with the Architecture of the Transformer Model

Language is the essence of human communication. Civilizations would never have been born without the word sequences that form language. We now mostly live in a world of digital representations of language. Our daily lives rely on digitalized language functions powered by NLP: web search engines, emails, social networks, posts, tweets, smartphone texting, translations, web pages, speech-to-text on streaming sites for transcripts, text-to-speech on hotline services, and many more everyday functions.

Chapter 1, What are Transformers?, explained the limits of RNNs and the birth of cloud AI transformers taking over a fair share of design and development. The role of the Industry 4.0 developer is to understand the architecture of the original Transformer and the multiple transformer ecosystems that followed.

In December 2017, Google Brain and Google Research published the seminal Vaswani et al., Attention is All You Need paper...

The rise of the Transformer: Attention is All You Need

In December 2017, Vaswani et al. (2017) published their seminal paper, Attention is All You Need. They performed their work at Google Research and Google Brain. I will refer to the model described in Attention is All You Need as the “original Transformer model” throughout this chapter and book.

Appendix I, Terminology of Transformer Models, can help the transition from the classical usage of deep learning words to transformer vocabulary. Appendix I summarizes some of the changes to the classical AI definition of neural network models.

In this section, we will look at the structure of the Transformer model they built. In the following sections, we will explore what is inside each component of the model.

The original Transformer model is a stack of 6 layers. The output of layer l is the input of layer l+1 until the final prediction is reached. There is a 6-layer encoder stack on the left and...
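As a rough illustration of the layer-to-layer flow described above, the sketch below chains six placeholder layers so that the output of layer l becomes the input of layer l+1. The names `make_toy_layer` and `run_stack` are hypothetical, and the toy layers only stand in for the real attention and feedforward sublayers, which are not reproduced here.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]

NUM_LAYERS = 6  # stack depth stated in the text


def make_toy_layer(scale: float) -> Layer:
    """Return a placeholder layer (an assumption, not a real sublayer)."""
    def layer(x: Vector) -> Vector:
        # Element-wise affine map standing in for the real computation.
        return [scale * value + 1.0 for value in x]
    return layer


def run_stack(layers: List[Layer], x: Vector) -> Vector:
    """Feed the output of each layer into the next one, in order."""
    for layer in layers:
        x = layer(x)
    return x


stack = [make_toy_layer(0.5) for _ in range(NUM_LAYERS)]
output = run_stack(stack, [1.0, 2.0])
print(output)
```

Because every layer takes and returns a vector of the same size, layers can be stacked without any glue code between them, which is the property the stacked design relies on.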

Training and performance

The original Transformer was trained on a 4.5 million sentence pair English-German dataset and a 36 million sentence pair English-French dataset.

The datasets come from Workshops on Machine Translation (WMT), which can be found at the following link if you wish to explore the WMT datasets: http://www.statmt.org/wmt14/

Training the original Transformer base models took 12 hours for 100,000 steps on a machine with 8 NVIDIA P100 GPUs. The big models took 3.5 days for 300,000 steps.
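The figures above can be turned into an average time per training step with a little arithmetic. The following sketch uses only the numbers quoted in the text; the helper name `seconds_per_step` is hypothetical, and the GPU-hour totals simply multiply wall-clock hours by the 8 GPUs mentioned.

```python
def seconds_per_step(total_hours: float, steps: int) -> float:
    """Average wall-clock seconds spent on each training step."""
    return total_hours * 3600 / steps


NUM_GPUS = 8  # from the text: one machine with 8 NVIDIA P100 GPUs

base_hours = 12        # base models: 12 hours for 100,000 steps
big_hours = 3.5 * 24   # big models: 3.5 days for 300,000 steps

base_rate = seconds_per_step(base_hours, 100_000)
big_rate = seconds_per_step(big_hours, 300_000)

print(f"base: {base_rate:.3f} s/step, {base_hours * NUM_GPUS:.0f} GPU-hours")
print(f"big:  {big_rate:.3f} s/step, {big_hours * NUM_GPUS:.0f} GPU-hours")
```

Under these figures, a base-model step averages roughly 0.4 seconds and a big-model step roughly 1 second.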

The original Transformer outperformed all the previous machine translation models with a BLEU score of 41.8. The result was obtained on the WMT English-to-French dataset.

BLEU stands for Bilingual Evaluation Understudy. It is an algorithm that evaluates the quality of the results of machine translations.
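To give a feel for how such a score is computed, here is a simplified, sentence-level sketch of the BLEU idea: clipped n-gram precisions combined with a brevity penalty. It assumes a single reference translation, whitespace-tokenized input, and no smoothing, and the function names are hypothetical. Published scores such as 41.8 are normally computed over a whole test corpus with standard tooling and reported on a 0 to 100 scale, so this sketch will not reproduce them.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate: List[str], reference: List[str], n: int) -> float:
    """Share of candidate n-grams found in the reference, with clipped counts."""
    cand_counts = ngrams(candidate, n)
    ref_counts = ngrams(reference, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def simple_bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Geometric mean of 1..max_n precisions times a brevity penalty (0 to 1)."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


reference = "the cat sat on the mat".split()
print(simple_bleu("the cat sat on the mat".split(), reference))
print(simple_bleu("a dog ran in a park".split(), reference))
```

Clipping keeps a candidate from scoring well by repeating one correct word, and the brevity penalty keeps it from scoring well by being very short.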

The Google Research and Google Brain team applied optimization strategies to improve the performance of the Transformer. For example, the Adam optimizer...

Transformer models in Hugging Face

Everything you saw in this chapter can be condensed into a ready-to-use Hugging Face transformer model.

With Hugging Face, you can implement machine translation in three lines of code!

Open Multi_Head_Attention_Sub_Layer.ipynb in Google Colaboratory. Save the notebook in your Google Drive (make sure you have a Gmail account). Go to the last two cells.

We first ensure that Hugging Face transformers are installed:

!pip -q install transformers

The first cell imports the Hugging Face pipeline that contains several transformer usages:

#@title Retrieve pipeline of modules and choose English to French translation
from transformers import pipeline

We then implement the Hugging Face pipeline, which contains ready-to-use functions. In our case, to illustrate the Transformer model of this chapter, we activate the translator model and enter a sentence to translate from English to French:

translator = pipeline("translation_en_to_fr...

Summary

In this chapter, we first got started by examining the mind-blowing long-distance dependencies transformer architectures can uncover. Transformers can perform transductions from written and oral sequences to meaningful representations as never before in the history of Natural Language Understanding (NLU).

These two dimensions, the expansion of transduction and the simplification of implementation, are taking artificial intelligence to a level never seen before.

We explored the bold approach of removing RNNs, LSTMs, and CNNs from transduction problems and sequence modeling to build the Transformer architecture. The symmetrical design of the standardized dimensions of the encoder and decoder makes the flow from one sublayer to another nearly seamless.

We saw that beyond removing recurrent network models, transformers introduce parallelized layers that reduce training time. We discovered other innovations, such as positional encoding and masked multi-headed attention...
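To make the positional encoding idea concrete, here is a minimal sketch of the sinusoidal scheme from the Attention is All You Need paper, in which even dimensions use a sine and odd dimensions use a cosine of the position scaled by a dimension-dependent wavelength. The function name `positional_encoding` is hypothetical, and the default `d_model=512` is taken from the paper's base model rather than from this summary.

```python
import math
from typing import List


def positional_encoding(position: int, d_model: int = 512) -> List[float]:
    """Sinusoidal encoding vector for one position in the sequence."""
    encoding: List[float] = []
    for even_dim in range(0, d_model, 2):
        # The same angle feeds the sine (even index) and cosine (odd index).
        angle = position / (10000 ** (even_dim / d_model))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding[:d_model]


print(positional_encoding(0, d_model=8))
print(len(positional_encoding(5)))
```

Each position receives a distinct vector of the same size as the token embedding, so the two can be added together without changing the model's dimensions.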

Questions

  1. NLP transduction can encode and decode text representations. (True/False)
  2. Natural Language Understanding (NLU) is a subset of Natural Language Processing (NLP). (True/False)
  3. Language modeling algorithms generate probable sequences of words based on input sequences. (True/False)
  4. A transformer is a customized LSTM with a CNN layer. (True/False)
  5. A transformer does not contain LSTM or CNN layers. (True/False)
  6. Attention examines all the tokens in a sequence, not just the last one. (True/False)
  7. A transformer uses a positional vector, not positional encoding. (True/False)
  8. A transformer contains a feedforward network. (True/False)
  9. The masked multi-headed attention component of the decoder of a transformer prevents the algorithm parsing a given position from seeing the rest of a sequence that is being processed. (True/False)
  10. Transformers can analyze long-distance dependencies better than LSTMs. (True/False)

References

Join our book’s Discord space

Join the book...


Key benefits

  • Improve your productivity with OpenAI’s ChatGPT and GPT-4 from prompt engineering to creating and analyzing machine learning models
  • Pretrain a BERT-based model from scratch using Hugging Face
  • Fine-tune powerful transformer models, including OpenAI's GPT-3, to learn the logic of your data

Description

Transformers are...well...transforming the world of AI. There are many platforms and models out there, but which ones best suit your needs? Transformers for Natural Language Processing, 2nd Edition, guides you through the world of transformers, highlighting the strengths of different models and platforms, while teaching you the problem-solving skills you need to tackle model weaknesses. You'll use Hugging Face to pretrain a RoBERTa model from scratch, from building the dataset to defining the data collator to training the model. If you're looking to fine-tune a pretrained model, including GPT-3, then Transformers for Natural Language Processing, 2nd Edition, shows you how with step-by-step guides. The book investigates machine translations, speech-to-text, text-to-speech, question-answering, and many more NLP tasks. It provides techniques to solve hard language problems and may even help with fake news anxiety (read chapter 13 for more details). You'll see how cutting-edge platforms, such as OpenAI, have taken transformers beyond language into computer vision tasks and code creation using DALL-E 2, ChatGPT, and GPT-4. By the end of this book, you'll know how transformers work and how to implement them and resolve issues like an AI detective.

Who is this book for?

If you want to learn about and apply transformers to your natural language (and image) data, this book is for you. You'll need a good understanding of Python and deep learning and a basic understanding of NLP to benefit most from this book. Many platforms covered in this book provide interactive user interfaces, which allow readers with a general interest in NLP and AI to follow several chapters. And don't worry if you get stuck or have questions; this book gives you direct access to our AI/ML community to help guide you on your transformers journey!

What you will learn

  • Discover new techniques to investigate complex language problems
  • Compare and contrast the results of GPT-3 against T5, GPT-2, and BERT-based transformers
  • Carry out sentiment analysis, text summarization, casual speech analysis, machine translations, and more using TensorFlow, PyTorch, and GPT-3
  • Find out how ViT and CLIP label images (including blurry ones!) and create images from a sentence using DALL-E
  • Learn the mechanics of advanced prompt engineering for ChatGPT and GPT-4

Product Details

Publication date: Mar 25, 2022
Length: 602 pages
Edition: 2nd
Language: English
ISBN-13: 9781803243481

Frequently bought together


  • Machine Learning with PyTorch and Scikit-Learn: NZ$80.99
  • Deep Learning with TensorFlow and Keras – 3rd edition: NZ$73.99
  • Transformers for Natural Language Processing: NZ$131.99

Total: NZ$286.97

Table of Contents

19 Chapters
What are Transformers?
Getting Started with the Architecture of the Transformer Model
Fine-Tuning BERT Models
Pretraining a RoBERTa Model from Scratch
Downstream NLP Tasks with Transformers
Machine Translation with the Transformer
The Rise of Suprahuman Transformers with GPT-3 Engines
Applying Transformers to Legal and Financial Documents for AI Text Summarization
Matching Tokenizers and Datasets
Semantic Role Labeling with BERT-Based Transformers
Let Your Data Do the Talking: Story, Questions, and Answers
Detecting Customer Emotions to Make Predictions
Analyzing Fake News with Transformers
Interpreting Black Box Transformer Models
From NLP to Task-Agnostic Transformer Models
The Emergence of Transformer-Driven Copilots
The Consolidation of Suprahuman Transformers with OpenAI’s ChatGPT and GPT-4
Other Books You May Enjoy
Index

Customer reviews

Rating distribution: 3.8 (28 Ratings)
5 star 53.6%
4 star 17.9%
3 star 3.6%
2 star 3.6%
1 star 21.4%

Top Reviews

Gabe Rigall, Apr 05, 2022
Rating: 5/5
BLUF: This intermediate-to-advanced text provides a no-holds-barred introduction to transformer architecture and application for NLP (and other) tasks. If you're looking to up your NLP game, this book is for you.
PROS:
  • Helpful background information for computer scientists and data wizards alike
  • Plenty of graphics and in-depth explanations of the "black boxes" of transformers; especially helpful for the visual learner
  • Good mix of theory and application with a focus on the latter
CONS:
  • Fairly advanced text; assumes the reader has a certain breadth of subject matter knowledge (definitely not for beginners)
Amazon Verified review
dr t, May 04, 2022
Rating: 5/5
Transformer models have powered recent NLP developments and have completely changed the way NLP problems are now approached. Rothman believes that Industry 4.0 professionals need to be aware of multiple approaches and understand that each has its own pros and cons. However, this book is not designed to explain every single transformer model out there. Instead, it tries to explain enough so that readers have enough knowledge to know how to tackle an NLP problem.

There is much to like about this book. The book has 16 chapters and begins by explaining transformers, and exploring interesting ideas such as whether programming is now becoming a sub-domain of NLP. There are also fantastic, and very useful, practical examples on how to work with a BERT tokeniser, conditioning a GPT-2 model, question-answering, pre-training RoBERTa models from scratch, and training a tokeniser. The running NLP tasks online section is also useful. Given this reader's own background, the chapter on detecting customer emotions to make predictions was fascinating, and frankly left this reader wanting more! In today's world where XAI is vitally important, the chapter on interpreting black box transformer models, and the section on using BertViz to show visualisations of the activity of transformer models are key to understanding how models work and to interpret model behaviour.

This is one of those books where reading is fine, and the questions section at the end of each chapter is useful, but one must really do to gain maximum benefit. The book contains lots of Python code to follow but, to be clear, this book is not geared towards Python beginners - there are plenty of great Packt Python beginners books out there already - hence knowledge of NLP and Python is mandatory.

In summary, this book cements Rothman's position as the #1 authority on Transformers, and is absolutely the go-to book for Transformers. Highly recommended and a must-buy for any serious NLP practitioner.
Amazon Verified review
Prateek M., Jul 11, 2022
Rating: 5/5
For NLP practitioners who do not want to skim through the entire internet to look for any concept, this is a very good book, as it is concise and good in both theory and code.
Amazon Verified review
Derek Martin, Apr 06, 2022
Rating: 5/5
I've read anything I can get my hands on re: GPT-3/Huggingface. Prof Rothman has raised the ante for what should be considered acceptable discourse re: GPT-3/Transformers and the new NLP-driven world that we live in.

Buy the book; it has a feel of an experienced mentor giving you the tools but more importantly, the judgement to navigate these new waters.
Amazon Verified review
hawkinflight, Mar 25, 2022
Rating: 5/5
This book answers the questions: What are transformers? Why do we care? What do they help us do? Why and how are they better?

The 16 chapters cover a lot of material - the author explains what transformers are and how they have evolved, discusses tools and methods that can be used to interpret black box transformer models, and presents applications of transformer models to a variety of tasks, mostly in natural language processing/understanding, but also in computer vision. It's very informative to read about the evolution, and nice to learn the areas where there are limitations. The generalization that has occurred is described.

It's great that the book provides interactive ways of learning. There are hands-on coding opportunities via Google Colaboratory notebooks, and questions at the end of each chapter.

It's fascinating to read that OpenAI trained a 175 billion parameter GPT-3 transformer model to be run on a supercomputer with 10k GPUs and 285k CPU cores.

The book mentions Industry 4.0, the Fourth Industrial Revolution, that is, building on top of the digital revolution, and connecting everything to everything, everywhere. Transformers are described as filling a gap left by the limitations of Recurrent Neural Networks (RNNs) for automation in a fast-moving world.

It's also interesting to read that transformers mark a new generation of ready-to-use artificial intelligence models; for example, Hugging Face and Google Brain provide AI with a few lines of code.

It is mentioned that "learning all of the models is impossible, however, by deepening our knowledge of transformer models, we can understand a new model quickly."

The final chapter discusses AI Copilots and machine interconnectivity to speed up transactions and boost productivity. It says, for example, that Microsoft and OpenAI have collaborated and produced a copilot that can write Python code with humans or for humans.
Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the eBook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and the password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Card, or PayPal)

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats do Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are priced lower than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.