An introduction to GLMs
While Autoencoder (AE) language models, such as Bidirectional Encoder Representations from Transformers (BERT), are based purely on the encoder part of the Transformer and are well suited to classification problems, generative models are either encoder-decoder or decoder-only models. GLMs were originally intended for language generation tasks such as machine translation (MT) or text summarization, but they have since proved successful in a range of other use cases.
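To make the decoder-only generation setting concrete, here is a minimal sketch using the Hugging Face transformers library and the publicly available gpt2 checkpoint; neither is named in this passage, so treat both as illustrative assumptions. The model simply continues a prompt token by token.

```python
# A minimal sketch of decoder-only generation with an assumed gpt2 checkpoint.
from transformers import pipeline

# A decoder-only GLM continues the prompt autoregressively, one token at a time.
generator = pipeline("text-generation", model="gpt2")

prompt = "Transformers have changed natural language processing because"
outputs = generator(prompt, max_length=40, num_return_sequences=1)

print(outputs[0]["generated_text"])
```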
AE models are particularly effective for classification tasks, such as text classification and sentiment analysis, as well as for token-level tasks such as named entity recognition (NER) and part-of-speech (POS) tagging. Here, an AE model performs classification simply by mapping either the special [CLS] token at the beginning of the sequence or the individual token representations at each position to the predefined class labels. GLMs, on the other hand, although widely used for language generation tasks, have also been successfully used...
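The following sketch illustrates how those two mappings work in practice, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is named in this passage); the untrained linear heads stand in for task-specific classifiers that would normally be fine-tuned.

```python
# A minimal sketch of using an AE model's outputs for classification.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("I loved this movie.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state   # shape: (1, seq_len, 768)

# Sequence-level task (e.g., sentiment): map the [CLS] vector to class logits.
cls_vector = hidden[:, 0, :]                     # the [CLS] token sits at position 0
sentence_head = torch.nn.Linear(768, 2)          # 2 illustrative sentiment labels
sentence_logits = sentence_head(cls_vector)      # shape: (1, 2)

# Token-level task (e.g., NER or POS): map every token vector to tag logits.
token_head = torch.nn.Linear(768, 9)             # 9 illustrative NER tags
token_logits = token_head(hidden)                # shape: (1, seq_len, 9)

print(sentence_logits.shape, token_logits.shape)
```

The only difference between the two tasks is which hidden states feed the classification head: a single [CLS] vector for sequence-level labels, or every token's vector for token-level labels.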