You're reading from Transformers for Natural Language Processing: Build innovative deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, RoBERTa, and more

Product type: Paperback

Published in: Jan 2021

Publisher: Packt

ISBN-13: 9781800565791

Length: 384 pages

Edition: 1st Edition

Author (1):

Denis Rothman

Summary

In this chapter, we built KantaiBERT, a RoBERTa-like transformer model, from scratch using the building blocks provided by Hugging Face.

We started by loading a customized dataset on a specific topic: the works of Immanuel Kant. You can load an existing dataset or create your own, depending on your goals. We saw that using a customized dataset provides insights into how a transformer model thinks. However, this experimental approach has its limits. It would take a much larger dataset to train a model for anything beyond educational purposes.

We then trained a tokenizer on the kant.txt dataset and saved the resulting merges.txt and vocab.json files. We recreated the tokenizer from these saved files, built the customized dataset, and defined a data collator to process the training batches for backpropagation. We initialized the trainer and explored the parameters of the RoBERTa model in detail. The model was trained...
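To make the role of the data collator more concrete, here is a minimal pure-Python sketch of the kind of masking a masked-language-modeling collator typically performs before a batch is used for backpropagation. It is an illustration only, not the chapter's code: the token ids, the vocabulary size, the `mask_tokens` helper, and the 15% selection rate with an 80/10/10 replacement split are all assumptions, chosen to mirror the commonly described BERT-style recipe.

```python
import random

# All constants below are hypothetical values chosen for illustration.
MASK_ID = 4               # assumed id of the <mask> token
SPECIAL_IDS = {0, 1, 2, 3, 4}  # assumed ids of <s>, <pad>, </s>, <unk>, <mask>
FIRST_REGULAR_ID = 5      # assumed first id of an ordinary vocabulary token
VOCAB_SIZE = 100          # assumed toy vocabulary size
IGNORE_INDEX = -100       # label value that PyTorch's cross-entropy loss skips by default


def mask_tokens(input_ids, mlm_probability=0.15, rng=None):
    """Return (masked_ids, labels) for one sequence of token ids.

    Positions that are not selected get the label IGNORE_INDEX, so only
    the selected positions contribute to the training loss.
    """
    rng = rng or random.Random()
    masked = list(input_ids)
    labels = [IGNORE_INDEX] * len(input_ids)

    for i, token in enumerate(input_ids):
        # Never mask special tokens; select other tokens with mlm_probability.
        if token in SPECIAL_IDS or rng.random() >= mlm_probability:
            continue

        labels[i] = token  # the model must predict the original token here
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_ID  # 80%: replace with <mask>
        elif roll < 0.9:
            # 10%: replace with a random ordinary token
            masked[i] = rng.randrange(FIRST_REGULAR_ID, VOCAB_SIZE)
        # remaining 10%: leave the token unchanged

    return masked, labels


if __name__ == "__main__":
    sequence = [0] + list(range(5, 25)) + [2]  # <s> ... </s> with toy ids
    masked_ids, labels = mask_tokens(sequence, rng=random.Random(42))
    print(masked_ids)
    print(labels)
```

Because the selection is redrawn every time the function is called, the same sentence can be masked differently on each pass over the dataset, which is why the masking is usually done at batch time rather than once during preprocessing.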