Summary
In this chapter, we covered various aspects of GLMs (both AR and seq2seq), from pre-training to fine-tuning. We explored the strengths of such models by training GLMs and fine-tuning them on tasks such as MT, as well as in multi-task settings. We went over the basics of more complex models such as T5, GPT-3, and T0, and used this kind of model to perform MT. We used different Python libraries to prototype T5-based multi-task learning. We trained GPT-2 on our own corpus and generated text with it. We learned how to save the trained model and load it with AutoModel. We also took a deeper look at how BPE can be trained and used with the tokenizers library.
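As a quick refresher on the save-and-reload workflow mentioned above, the following minimal sketch (with an illustrative output directory and prompt, not the chapter's exact code) loads GPT-2, saves it locally, and reloads it through the Auto* classes before generating a short continuation:

from transformers import AutoModelForCausalLM, AutoTokenizer

save_dir = "./my-gpt2-model"  # illustrative output directory

# Stand-in for the fine-tuned model from the chapter: loading the base
# GPT-2 checkpoint keeps the snippet self-contained
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Save the weights, config, and tokenizer files to a local directory
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)

# Reload both with the generic Auto* classes
model = AutoModelForCausalLM.from_pretrained(save_dir)
tokenizer = AutoTokenizer.from_pretrained(save_dir)

# Generate a short continuation from a prompt
inputs = tokenizer("The transformer architecture", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))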
In the next chapter, we will see how to fine-tune models for text classification.