Common pitfalls: dos and don’ts
In this section, we will give five dos and a few don’ts that are typically recommended when dealing with transformers.
Dos
Let’s start with recommended best practices:
- Do use pretrained large models. Today, it is almost always more practical to start from an already available pretrained model, such as T5, than to train your transformer from scratch. With a pretrained model, you stand on the shoulders of giants: you reuse the data and compute that went into training it.
- Do start with few-shot learning. When you begin working with transformers, it is a good idea to take a pretrained model and then perform a lightweight few-shot learning step. This generally improves the quality of the results without high computational costs.
- Do use fine-tuning on your domain data and on your customer data. After experimenting with pretrained models and few-shot learning, you might consider doing a proper fine-tuning on your...