Advanced Regularization in Natural Language Processing
A full book could be written about regularization in natural language processing (NLP). NLP is a broad field, spanning tasks from simple classification, such as review ranking, to complex models and solutions such as ChatGPT. This chapter only scratches the surface of what can reasonably be done with simple NLP solutions such as classification.
In this chapter, we will cover the following recipes:
- Regularization using a word2vec embedding
- Data augmentation using word2vec
- Zero-shot inference with pre-trained models
- Regularization with BERT embeddings
- Data augmentation using GPT-3
By the end of this chapter, you will be able to apply advanced methods such as word embeddings and transformers to NLP tasks, and to use data augmentation to generate synthetic training data.