Ingesting data via LlamaHub
As we saw in Chapter 3, Kickstarting Your Journey with LlamaIndex, one of the first steps in a RAG workflow is to ingest and process our proprietary data. We have already covered the concepts of documents and nodes, which organize the data and prepare it for indexing, and I briefly introduced the LlamaHub data loaders as an easy way to ingest data into LlamaIndex. It's time to examine these steps in more detail and gradually learn how to infuse LLM applications with our own proprietary knowledge. As a quick refresher, the snippet below shows what the most basic loading step looks like.
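This is only a minimal sketch: it assumes a recent LlamaIndex release where the core package is imported as llama_index.core, and a hypothetical ./data folder holding a few local files. It uses the built-in SimpleDirectoryReader (itself listed on LlamaHub); most other LlamaHub readers follow the same load_data() pattern once installed.

```python
from llama_index.core import SimpleDirectoryReader

# Point the reader at a local folder (hypothetical path) containing our files
reader = SimpleDirectoryReader(input_dir="./data")

# Each file becomes one or more Document objects, ready to be parsed into nodes
documents = reader.load_data()
print(f"Loaded {len(documents)} document(s)")
```

Before we continue, though, I'd like to emphasize some very common challenges encountered at this step: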
- No matter how effective our RAG pipeline is, the quality of the final result will largely depend on the quality of the initial data. To overcome this challenge, make sure you clean up your data first: eliminate potential duplicates and errors. While not exactly duplicates, redundant...