Transformers for Natural Language Processing
Build innovative deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, RoBERTa, and more

Author: Denis Rothman
Publisher: Packt, January 2021 (1st edition, 384 pages)
ISBN-13: 9781800565791
Table of Contents

Preface
1. Getting Started with the Model Architecture of the Transformer
2. Fine-Tuning BERT Models
3. Pretraining a RoBERTa Model from Scratch
4. Downstream NLP Tasks with Transformers
5. Machine Translation with the Transformer
6. Text Generation with OpenAI GPT-2 and GPT-3 Models
7. Applying Transformers to Legal and Financial Documents for AI Text Summarization
8. Matching Tokenizers and Datasets
9. Semantic Role Labeling with BERT-Based Transformers
10. Let Your Data Do the Talking: Story, Questions, and Answers
11. Detecting Customer Emotions to Make Predictions
12. Analyzing Fake News with Transformers
Other Books You May Enjoy
Index
Appendix: Answers to the Questions

Training and performance

The original Transformer was trained on a 4.5-million-sentence-pair English-German dataset and a 36-million-sentence English-French dataset.

The datasets come from the Workshop on Machine Translation (WMT). If you wish to explore the WMT datasets, they are available at http://www.statmt.org/wmt14/
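
If you want to inspect these corpora programmatically, one convenient option is the Hugging Face datasets library, which hosts a WMT14 configuration. The following is a minimal sketch under that assumption; the library is not part of the original paper's tooling:

# Quick look at the WMT14 English-German sentence pairs
# (assumes: pip install datasets)
from datasets import load_dataset

# "de-en" is one of the published WMT14 configurations
wmt14 = load_dataset("wmt14", "de-en", split="train")

print(wmt14.num_rows)           # roughly 4.5 million sentence pairs
print(wmt14[0]["translation"])  # {'de': '...', 'en': '...'}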

The original Transformer base models took 12 hours to train for 100,000 steps on a machine with 8 NVIDIA P100 GPUs. The big models took 3.5 days to train for 300,000 steps.

The original Transformer outperformed all previous machine translation models, reaching a BLEU score of 41.8 on the WMT English-to-French dataset.

BLEU stands for Bilingual Evaluation Understudy. It is an algorithm that evaluates the quality of machine translation output by comparing it with one or more human reference translations.
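
To make the metric concrete, here is a minimal sentence-level BLEU sketch using NLTK's scorer. The sample sentences are invented for illustration, and note that sentence-level BLEU with smoothing differs from the corpus-level BLEU score reported in the paper:

# Minimal sentence-level BLEU example
# (assumes: pip install nltk)
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "sat", "on", "the", "mat"]]  # list of tokenized references
candidate = ["the", "cat", "is", "on", "the", "mat"]     # machine translation output

# Smoothing avoids zero scores when a higher-order n-gram has no match
smoothie = SmoothingFunction().method1
score = sentence_bleu(reference, candidate, smoothing_function=smoothie)
print(f"BLEU: {score:.4f}")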

The Google Research and Google Brain teams applied optimization strategies to improve the performance of the Transformer. For example, the Adam optimizer...
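
For reference, the optimization strategy in the original paper pairs Adam (beta1 = 0.9, beta2 = 0.98, epsilon = 10^-9) with a learning rate that grows linearly over the first 4,000 warmup steps and then decays with the inverse square root of the step number. Below is a minimal PyTorch sketch of that schedule; the linear model is only a stand-in for a real Transformer:

# Adam setup and learning-rate schedule from "Attention Is All You Need"
import torch

d_model, warmup_steps = 512, 4000
model = torch.nn.Linear(d_model, d_model)  # placeholder for a Transformer

optimizer = torch.optim.Adam(model.parameters(),
                             lr=1.0,  # rescaled by the schedule below
                             betas=(0.9, 0.98), eps=1e-9)

def transformer_lr(step):
    # lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
    step = max(step, 1)  # avoid division by zero at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=transformer_lr)

# Inside the training loop, call scheduler.step() after each optimizer.step()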
