You're reading from Mastering Transformers Build state-of-the-art models from scratch with advanced natural language processing techniques

Product type Paperback

Published in Sep 2021

Publisher Packt

ISBN-13 9781801077651

Length 374 pages

Edition 1st Edition

Languages

Python

Tools

TensorFlow

Concepts

Mobile Application Development

Authors (2):

Savaş Yıldırım

Meysam Asgari- Chenaghlu

View More author details

Table of Contents (16) Chapters

Preface

1. Section 1: Introduction – Recent Developments in the Field, Installations, and Hello World Applications

2. Chapter 1: From Bag-of-Words to the Transformer FREE CHAPTER

3. Chapter 2: A Hands-On Introduction to the Subject

4. Section 2: Transformer Models – From Autoencoding to Autoregressive Models

5. Chapter 3: Autoencoding Language Models

6. Chapter 4:Autoregressive and Other Language Models

7. Chapter 5: Fine-Tuning Language Models for Text Classification

8. Chapter 6: Fine-Tuning Language Models for Token Classification

9. Chapter 7: Text Representation

10. Section 3: Advanced Topics

11. Chapter 8: Working with Efficient Transformers

12. Chapter 9:Cross-Lingual and Multilingual Language Modeling

13. Chapter 10: Serving Transformer Models

14. Chapter 11: Attention Visualization and Experiment Tracking

15. Other Books You May Enjoy

Chapter 1: From Bag-of-Words to the Transformer

In this chapter, we will discuss what has changed in Natural Language Processing (NLP) over two decades. We experienced different paradigms and finally entered the era of Transformer architectures. All the paradigms help us to gain a better representation of words and documents for problem-solving. Distributional semantics describes the meaning of a word or a document with vectorial representation, looking at distributional evidence in a collection of articles. Vectors are used to solve many problems in both supervised and unsupervised pipelines. For language-generation problems, n-gram language models have been leveraged as a traditional approach for years. However, these traditional approaches have many weaknesses that we will discuss throughout the chapter.

We will further discuss classical Deep Learning (DL) architectures such as Recurrent Neural Networks (RNNs), Feed-Forward Neural Networks (FFNNs), and Convolutional Neural Networks (CNNs). These have improved the performance of the problems in the field and have overcome the limitation of traditional approaches. However, these models have had their own problems too. Recently, Transformer models have gained immense interest because of their effectiveness in all NLP tasks, from text classification to text generation. However, the main success has been that Transformers effectively improve the performance of multilingual and multi-task NLP problems, as well as monolingual and single tasks. These contributions have made Transfer Learning (TL) more possible in NLP, which aims to make models reusable for different tasks or different languages.

Starting with the attention mechanism, we will briefly discuss the Transformer architecture and the differences between previous NLP models. In parallel with theoretical discussions, we will show practical examples with the popular NLP framework. For the sake of simplicity, we will choose introductory code examples that are as short as possible.

In this chapter, we will cover the following topics:

Evolution of NLP toward Transformers
Understanding distributional semantics
Leveraging DL
Overview of the Transformer architecture
Using TL with Transformers

You're reading from Mastering Transformers Build state-of-the-art models from scratch with advanced natural language processing techniques

Table of Contents (16) Chapters Close

Chapter 1: From Bag-of-Words to the Transformer

Authors (2)

Other recommended products

Table of Contents (16) Chapters