Bringing your own data
So far, our exploration of LLMs has focused on a simple scenario in which the app queries the model about the data it was trained on. Models are retrained only periodically, so that knowledge can be out of date. Usually, we want to include more up-to-date information, such as web search results, or data relevant to our solution domain. For example, if I am developing a legal application, I would want to include confidential customer legal documents and casework to make responses more relevant.
There are broadly two approaches to bringing in your own data:
- Fine-tuning, which applies transfer learning to train the pretrained model further on our data. The following diagram summarizes fine-tuning:
Figure 13.4 – LLM fine-tuning
For public LLMs such as the OpenAI models behind ChatGPT, there are APIs to fine-tune a model. We will discuss fine-tuning in the next chapter, but suffice it to say that it brings its own risks, which we already mentioned when we talked about predictive AI, such...