Planning the MVP of the LLM Twin product
Now that we understand what an LLM Twin is and why we want to build it, we must clearly define the product’s features. In this book, we will focus on the first iteration, often called the minimum viable product (MVP), following the natural life cycle of most products. Here, the main objective is to align our ideas with realistic, achievable business objectives given the resources available to build the product. Even as an engineer, as your responsibilities grow, you must go through these steps to bridge the gap between business needs and what can actually be implemented.
What is an MVP?
An MVP is a version of a product that includes just enough features to draw in early users and test the viability of the product concept in the initial stages of development. Usually, the purpose of the MVP is to gather insights from the market with minimal effort.
An MVP is a powerful strategy because of the following reasons:
- Accelerated time-to-market: Launch a product quickly to gain early traction
- Idea validation: Test it with real users before investing in the full development of the product
- Market research: Gain insights into what resonates with the target audience
- Risk minimization: Avoid wasting time and resources on a product that might not achieve market success
Sticking to the V in MVP is essential, meaning the product must be viable. The product must provide an end-to-end user journey without half-implemented features, even if the product is minimal. It must be a working product with a good user experience that people will love and want to keep using to see how it evolves to its full potential.
Defining the LLM Twin MVP
As a thought experiment, let’s assume that instead of building this project for this book, we want to make a real product. In that case, what are our resources? Well, unfortunately, not many:
- We are a team of three: two ML engineers and one ML researcher
- Our laptops
- Personal funding for compute costs, such as training LLMs
- Our enthusiasm
As you can see, we don’t have many resources. Even if this is just a thought experiment, it reflects the reality for most start-ups at the beginning of their journey. Thus, we must be very strategic in defining our LLM Twin MVP and what features we want to pick. Our goal is simple: we want to maximize the product’s value relative to the effort and resources poured into it.
To keep it simple, we will build the features that can do the following for the LLM Twin:
- Collect data from your LinkedIn, Medium, Substack, and GitHub profiles
- Fine-tune an open-source LLM using the collected data
- Populate a vector database (DB) using our digital data for RAG
- Create LinkedIn posts leveraging the following:
  - User prompts
  - RAG to reuse and reference old content
  - New posts, articles, or papers as additional knowledge to the LLM
- Have a simple web interface to interact with the LLM Twin and be able to do the following:
  - Configure your social media links and trigger the collection step
  - Send prompts or links to external resources
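The feature list above maps naturally onto a four-stage pipeline: collect data, fine-tune the LLM, index the data for RAG, and generate posts. The following Python sketch, using entirely hypothetical names and structure (nothing here is prescribed by the MVP itself), is one way to express that plan before any real implementation work begins:

```python
from dataclasses import dataclass


@dataclass
class TwinConfig:
    """Hypothetical user configuration for the LLM Twin MVP.

    Each field holds the URL of one data source to crawl; empty
    strings mean the source is not configured.
    """

    linkedin_url: str = ""
    medium_url: str = ""
    substack_url: str = ""
    github_url: str = ""


# The MVP's core features expressed as ordered pipeline stages.
PIPELINE_STAGES = [
    "collect",    # crawl the configured social media profiles
    "fine_tune",  # fine-tune an open-source LLM on the collected data
    "index",      # populate the vector DB used for RAG
    "generate",   # create posts from user prompts + retrieved context
]


def plan(config: TwinConfig) -> list[str]:
    """Return the ordered stages the MVP would run for a user config."""
    sources = [
        url
        for url in (
            config.linkedin_url,
            config.medium_url,
            config.substack_url,
            config.github_url,
        )
        if url
    ]
    if not sources:
        raise ValueError("at least one data source must be configured")
    return PIPELINE_STAGES
```

This is only a planning aid: it makes the end-to-end user journey explicit (configure sources, then run every stage in order), which is exactly the "viable" property the MVP must preserve.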
That will be the LLM Twin MVP. Even if it doesn’t sound like much, remember that we must make this system cost-effective, scalable, and modular.
Even if we focus only on the core features of the LLM Twin defined in this section, we will build the product with the latest LLM research and best software engineering and MLOps practices in mind. We aim to show you how to engineer a cost-effective and scalable LLM application.
Until now, we have examined the LLM Twin from the users’ and businesses’ perspectives. The last step is to examine it from an engineering perspective and define a development plan to understand how to solve it technically. From now on, the book’s focus will be on the implementation of the LLM Twin.