From traditional methods to LLMs
Traditional machine learning approaches to NLP extract numerical features with methods such as bag-of-words and n-grams, then apply classic algorithms such as neural networks or Support Vector Machines (SVMs). This pipeline can be a powerful way to automate the recognition of patterns and the execution of tasks such as document classification or clustering. However, these baseline methods often can't handle long sequences with variable-range dependencies. Recurrent neural networks (RNNs) were developed to address this, but they are trained with backpropagation through time, which still makes training on long sequences inefficient. The next generation of models was the Transformer, which processes sequences more efficiently and currently serves as the basis for many sequence models and all LLMs on the market.
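To make the first step of this pipeline concrete, here is a minimal sketch of bag-of-words feature extraction with unigrams and bigrams, using only the Python standard library. The function names and the toy documents are illustrative, not from any particular library:

```python
from collections import Counter
from itertools import chain

def ngrams(tokens, n):
    """Return all n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def featurize(text, n_values=(1, 2)):
    """Bag-of-words feature extraction: count unigrams and bigrams."""
    tokens = text.lower().split()
    return Counter(chain.from_iterable(ngrams(tokens, n) for n in n_values))

docs = ["the cat sat on the mat", "the dog sat on the log"]

# Build a shared vocabulary so every document maps to the same feature space.
vocab = sorted(set(chain.from_iterable(featurize(d) for d in docs)))

# Each document becomes a fixed-length numeric vector over that vocabulary,
# ready for a classic model such as an SVM.
vectors = [[featurize(d)[term] for term in vocab] for d in docs]
```

The resulting vectors count how often each unigram or bigram appears, which is exactly the representation a classifier or clustering algorithm would consume. Note the limitation the text describes: word order beyond the n-gram window is lost, so long-range dependencies cannot be captured.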
Transformers
Transformers are a type of model based on the idea of self-attention...