Summary
In this chapter, we learned about various aspects of autoregressive (AR) language models, from pre-training to fine-tuning. We explored the strengths of such models by training generative language models and fine-tuning them on tasks such as machine translation (MT). We covered the basics of more complex models such as T5 and used this kind of model to perform MT, working with the simpletransformers library along the way. We trained GPT-2 on our own corpus and used it to generate text, and we learned how to save the model and load it again with AutoModel.
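As a quick recap, a reload-and-generate round trip might look like the following minimal sketch. It assumes a checkpoint directory (the hypothetical ./gpt2-finetuned) produced earlier with save_pretrained(), and it uses AutoModelForCausalLM, the causal-LM counterpart of AutoModel that exposes generate():

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical directory created earlier with save_pretrained()
model_path = "./gpt2-finetuned"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Encode a prompt and sample a continuation from the fine-tuned model
inputs = tokenizer("The history of NLP", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```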
We also took a deeper look at how a BPE tokenizer can be trained and used, via the tokenizers library.
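To illustrate that step, here is a minimal, self-contained sketch of BPE training with the tokenizers library; the training file corpus.txt is a placeholder for your own plain-text corpus:

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer
from tokenizers.pre_tokenizers import Whitespace

# Build an empty BPE model and split input on whitespace before merging
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# Learn the merges from a plain-text file (hypothetical name)
trainer = BpeTrainer(vocab_size=5000, special_tokens=["[UNK]", "[PAD]"])
tokenizer.train(files=["corpus.txt"], trainer=trainer)

# Encode a sample sentence with the learned vocabulary
encoding = tokenizer.encode("Tokenizers are fun!")
print(encoding.tokens)
```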
In the next chapter, we will see how to fine-tune models for text classification.