A change in the paradigm
It feels like eons ago in tech years, but let’s rewind just a couple of years, back when, if you were embarking on solving an AI problem, you couldn’t simply default to a pre-trained model accessed through the web or a managed endpoint. The process was meticulous – you’d first have to clearly define the specific use case, identify what data you had available and could collect to train a custom model, select an appropriate algorithm and model architecture, train the model using specialized hardware and software, and validate whether the outputs would actually help solve the task at hand. If all went well, you would end up with a model that took a predefined input and produced a predefined output.
The paradigm shifted profoundly with the advent of LLMs and large multimodal models. Suddenly, you could access a pre-trained model with billions of parameters and start experimenting right off the bat with these versatile foundational models, whose inputs and outputs are dynamic in nature. After tinkering around, you’d then evaluate whether any fine-tuning is necessary to adapt the model to your needs, rather than pre-training an entire model from scratch. And spoiler alert – in most cases, you won’t even need to fine-tune a foundational model.
Another key shift relates to the early belief that a single model would outperform all others and solve every task. However, the model itself is just the engine; you still need an entire ecosystem packaged around it to provide a complete solution. Foundational models have certainly demonstrated incredible capabilities beyond initial expectations, but we also observe that certain models are better suited to certain tasks, and running the same prompt through different models can produce very different outputs depending on the underlying model’s training datasets and architecture.
So, the new experimental path often starts with prompt engineering and response evaluation, and only then moves to fine-tuning the foundational model if gaps remain. This contrasts sharply with the previous flow, where data prep, training, and experimentation were all required before you could get your hands dirty. The bar to start creating with AI has never been lower.
In the following sections, we will explore the difference between the development lifecycle of predictive AI and generative AI use cases. In each section, we have provided a high-level visual representation of a simplified development lifecycle and an explanation of the thought process behind each approach.
Predictive AI use case development – simplified lifecycle
Figure 1.1: Predictive AI use case development simplified lifecycle
Let’s dive into the process of developing a predictive AI model first. Everything starts with a good use case, and ROI (return on investment) is top of mind when evaluating AI use cases. Think about pain points in your business or industry that could be solved by predicting an outcome. It is also very important to always keep an eye on feasibility – for example, whether you can actually procure the data you need.
Once you’ve landed on a compelling, value-driven use case, next up is picking algorithms. You’ve got endless options here – decision trees, neural nets, regressions, random forests, and on and on. It is very important not to be swayed by the latest and greatest, but instead to focus on the core requirements of your data and use case to narrow the options down. You can always switch it up or add experiments as you iterate through your testing.
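To make this concrete, here is a minimal sketch of that kind of algorithm bake-off using scikit-learn on a synthetic dataset. The candidate algorithms, dataset shape, and hyperparameters are purely illustrative, not a recommendation:

```python
# Illustrative sketch: compare several candidate algorithms on the same data
# before committing to one. Dataset and hyperparameters are made up.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for your real, wrangled dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

candidates = {
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=42),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# 5-fold cross-validated accuracy for each candidate
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

Keeping every candidate behind the same evaluation call makes it cheap to swap algorithms in and out as you iterate.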
With a plan in place, now it is time to get your hands dirty with the data. Identifying sources, cleaning things up, and carrying out feature engineering is an art and, more often than not, the key to improving your model’s results. There is no shortcut for rigor here, unfortunately! Garbage in, garbage out, as they say. But once you’ve wrangled datasets you can rely on, then comes the fun part.
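As a toy illustration of that wrangling step, the following pure-Python sketch cleans a few hypothetical raw records and imputes a missing value with a simple mean. The field names, records, and cleaning rules are all made up for illustration:

```python
# Hypothetical raw records with typical dirt: a missing age, a bad income
raw_records = [
    {"age": "34", "income": "55000", "signup_date": "2023-01-15"},
    {"age": "",   "income": "61000", "signup_date": "2023-03-02"},
    {"age": "29", "income": "-1",    "signup_date": "2023-02-20"},
]

def clean(record):
    # Parse, validate, and derive a simple engineered feature (signup month)
    age = int(record["age"]) if record["age"] else None
    income = int(record["income"])
    income = income if income >= 0 else None  # negative income is invalid
    return {
        "age": age,
        "income": income,
        "signup_month": int(record["signup_date"].split("-")[1]),
    }

cleaned = [clean(r) for r in raw_records]

# Impute missing ages with the mean of the observed ones (a simple default)
ages = [r["age"] for r in cleaned if r["age"] is not None]
mean_age = sum(ages) / len(ages)
for r in cleaned:
    if r["age"] is None:
        r["age"] = mean_age
```

Real pipelines would use pandas or a feature store, but the rigor is the same: validate every field, decide deliberately how to handle what’s missing or invalid, and derive features the model can actually use.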
It’s time to work with your model. Define your evaluation process upfront, split data wisely, and start training various configurations. Don’t forget to monitor and tune based on validation performance. Then, once you’ve got your golden model, implement robust serving infrastructure so it scales without a hitch.
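A minimal sketch of that flow, assuming scikit-learn and a synthetic dataset, might look like this; the two training configurations compared are arbitrary illustrations:

```python
# Illustrative sketch: define evaluation upfront, hold out a validation
# split, then compare training configurations against it.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=600, n_features=12, random_state=0)

# Split "wisely": stratify so class balance is preserved in both splits
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Arbitrary example configurations to compare
configs = [
    {"n_estimators": 10, "max_depth": 3},
    {"n_estimators": 100, "max_depth": None},
]

results = []
for cfg in configs:
    model = RandomForestClassifier(random_state=0, **cfg)
    model.fit(X_train, y_train)
    val_acc = accuracy_score(y_val, model.predict(X_val))
    results.append((cfg, val_acc))

# Pick the "golden model" by validation performance, never training performance
best_cfg, best_score = max(results, key=lambda r: r[1])
print(f"best config: {best_cfg} (val accuracy {best_score:.3f})")
```

The key discipline is that the metric and the held-out split are fixed before any training run, so every configuration is judged by the same yardstick.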
But wait, not so fast! Testing doesn’t end when models are in production. Collect performance data continuously, monitor for concept drift, and retrain when needed. A solid predictive model requires ongoing feedback mechanisms, as shown via the arrow connecting Model Enhancement to Testing in Figure 1.1. There is no such thing as set and forget in this space.
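One common drift signal is the Population Stability Index (PSI), which compares the distribution of a feature at training time against its live distribution in production. Here is a minimal, dependency-free sketch; the binning scheme is simplified for illustration:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample (e.g. training
    data) and a live sample. Values above roughly 0.2 are commonly treated
    as a sign of significant drift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range

    def histogram(xs):
        counts = [0] * bins
        for x in xs:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        # A small floor avoids log-of-zero for empty bins
        return [max(c / len(xs), 1e-6) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

In practice you would compute this per feature on a schedule, alert when it crosses a threshold, and use sustained drift as a trigger for the retraining loop described above.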
Generative AI use case development – simplified lifecycle
Figure 1.2: Generative AI use case development simplified lifecycle
The process of generative AI use case development is similar to, but not the same as, that of predictive AI; the two share some common steps, but the order of tasks differs.
The first step is the ideation of potential use cases. This selection needs to be balanced against business needs, since satisfying those needs is our main objective.
With a clear problem definition in place, extensive analysis of published model benchmarks often informs the selection of a robust foundational model best suited for the task. In this step, it is worth asking ourselves: is this use case better suited to a predictive model?
As foundational models provide capabilities out of the box, initial testing comes early in the process. A structured testing methodology helps reveal the innate strengths, weaknesses, and quirks of a specific model. Both quantitative metrics and qualitative human evaluations fuel iterative improvement throughout the full development lifecycle.
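A structured test harness can start as simply as a fixed prompt set scored automatically, with failures routed to human review. In the sketch below, a hard-coded fake model stands in for a real model endpoint, and the test cases and keyword-based scoring are illustrative only:

```python
# Illustrative test cases: each pairs a prompt with keywords we expect
# in an acceptable answer. Both are made up for this sketch.
test_cases = [
    {"prompt": "Translate 'hello' to French.", "expected_keywords": ["bonjour"]},
    {"prompt": "What is 2 + 2?", "expected_keywords": ["4"]},
]

def fake_model(prompt):
    # Placeholder for a call to a real model API endpoint
    canned = {
        "Translate 'hello' to French.": "Bonjour!",
        "What is 2 + 2?": "The answer is 4.",
    }
    return canned[prompt]

def evaluate(model, cases):
    """Run every case through the model and score it; failed cases are the
    ones you would route to qualitative human review."""
    results = []
    for case in cases:
        output = model(case["prompt"])
        passed = all(kw.lower() in output.lower()
                     for kw in case["expected_keywords"])
        results.append({"prompt": case["prompt"],
                        "output": output,
                        "passed": passed})
    pass_rate = sum(r["passed"] for r in results) / len(results)
    return pass_rate, results

pass_rate, results = evaluate(fake_model, test_cases)
```

Re-running the same fixed suite after every prompt or model change is what turns ad hoc tinkering into the iterative improvement loop described above.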
The next step is the art of prompt engineering. Prompting is the mechanism used to interact with LLMs. Techniques like chain-of-thought prompting, skeleton prompts, and retrieval augmentation act as guardrails, enabling more consistent, logical outputs.
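As a small illustration, a chain-of-thought prompt is often built from a template that combines an instruction to reason step by step with a worked few-shot example. The wording and the example question below are purely illustrative:

```python
# Illustrative chain-of-thought template: an instruction to reason step by
# step, plus one worked example that demonstrates the desired answer style.
COT_TEMPLATE = """You are a careful assistant. Think step by step.

Example:
Q: A shop sells pens at 2 dollars each. How much do 3 pens cost?
A: Each pen costs 2 dollars. 3 pens cost 3 * 2 = 6 dollars. The answer is 6.

Q: {question}
A:"""

def build_cot_prompt(question):
    return COT_TEMPLATE.format(question=question)

prompt = build_cot_prompt(
    "A train travels 60 km/h for 2 hours. How far does it go?")
print(prompt)
```

The template itself is the guardrail: the model sees both the instruction and a demonstration of the reasoning format before it ever sees the new question.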
If gaps remain after prompt optimization, model enhancement via fine-tuning and distillation offers a precision tool to adapt models closer to the target task.
In rare cases, pretraining a fully custom model from scratch is warranted, when no existing model can viably serve the use case. However, it is important to keep in mind that pretraining a foundational model requires an extensive amount of data and processing power, which makes it impractical, both financially and technically, for most teams and use cases.
Above all, the interplay between evaluation and model improvement underscores the deeply empirical nature of advancing generative AI responsibly. Testing often reveals that better solutions come from creativity in problem framing rather than pure technological advances alone.
Figure 1.3: Predictive and generative AI development lifecycle side-by-side comparison
As we can see from the preceding figure, the development lifecycle is an iterative process that enables us to realize value from a given use case and technology type. Across the rest of this chapter and this book, we will focus on general generative AI concepts, some of which will be familiar if you are experienced in predictive AI, and others that are specific to this new field.