Large Language Models
The field of large language models (LLMs) has progressed rapidly in recent years with the development of models such as GPT-3 (175B), PaLM (540B), BLOOM (176B), LLaMA (65B), Falcon (180B), Mistral (7B), and many others, where the figure in parentheses is the parameter count. These models have shown impressive abilities across a wide range of natural language tasks. Covering such an important topic in a short chapter is challenging. However, we have already covered many aspects of it, especially in Chapter 4, and throughout the book we have discussed the paradigm of neural language models and their training process.
In this chapter, we discuss LLMs and run a couple of experiments. We also show that LLMs can be fine-tuned with a process similar to the one used in previous chapters, with only minor differences.
We will discuss the following in detail:
- Why large language models?
- Fine-tuning large language models