Transformers, reformers, PET, or GPT?
Before using GPT models, we need to pause at this point in our book's journey and look at transformers from a project management perspective. Which model and which method should we choose for a given NLP project? Can we trust any of them? Once cost management enters the picture, accountability follows, and choosing a model and a machine becomes a life-and-death decision for a project. In this section, we will stop and think before entering the world of the recent GPT-2 and the huge GPT-3 models (and more may come).
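One way to ground the cost question is a rough parameter count derived from each model's published architecture hyperparameters. The sketch below is a back-of-the-envelope estimate (ignoring biases and layer norms), not an exact accounting, but it reproduces the well-known figures of roughly 124M parameters for GPT-2 small and roughly 175B for GPT-3:

```python
def estimate_params(n_layers, d_model, vocab_size, n_ctx):
    """Rough parameter estimate for a decoder-only transformer.

    Per layer: ~4*d^2 for attention (Q, K, V, and output projections)
    plus ~8*d^2 for the feed-forward block (d -> 4d -> d),
    i.e. ~12*d^2 per layer, plus token and position embeddings.
    Biases and layer norms are ignored (a small fraction of the total).
    """
    per_layer = 12 * d_model ** 2
    embeddings = (vocab_size + n_ctx) * d_model
    return n_layers * per_layer + embeddings

# Published hyperparameters for GPT-2 small and GPT-3
gpt2 = estimate_params(n_layers=12, d_model=768, vocab_size=50257, n_ctx=1024)
gpt3 = estimate_params(n_layers=96, d_model=12288, vocab_size=50257, n_ctx=2048)

print(f"GPT-2 small ~{gpt2 / 1e6:.0f}M parameters")  # ~124M
print(f"GPT-3       ~{gpt3 / 1e9:.0f}B parameters")  # ~175B
```

A three-orders-of-magnitude gap in parameter count translates directly into training and inference cost, which is why the choice of model is a project management decision, not only a technical one.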
We have successively gone through:
- The original architecture of the Transformer with an encoder and a decoder stack in Chapter 1, Getting Started with the Model Architecture of the Transformer.
- Fine-tuning a pretrained BERT model with only an encoder stack and no decoder stack in Chapter 2, Fine-Tuning BERT models.
- Training a RoBERTa-like model with only an encoder stack and no decoder stack in Chapter 3, Pretraining...