Chunking strategies
In our last discussion, we covered vector databases and RAG. Before a RAG pipeline can retrieve anything, the embedded data has to be stored efficiently. We touched on indexing methods that speed up data fetching, but one crucial step comes even before that: chunking.
What is chunking?
In the context of building LLM applications with embedding models, chunking means dividing a long piece of text into smaller, manageable pieces, or “chunks”, that fit within the model’s token limit before they are sent to the embedding model. As shown in the following image, chunking happens before the embedding process. Different documents have different structures, such as free-flowing text, code, or HTML, so different chunking strategies can be applied to suit each one. Tools such as LangChain provide you with functionalities to chunk your data efficiently based on the nature of the...
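To make the idea concrete, here is a minimal sketch of one simple strategy, fixed-size chunking with overlap. It is an illustration under stated assumptions, not how LangChain or any particular tool implements chunking: the function name `chunk_words` and its `chunk_size` and `overlap` parameters are hypothetical, and whitespace-separated words stand in for model tokens, which a real pipeline would count with the embedding model's own tokenizer.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears with some context in the next chunk.
    ASSUMPTION: words approximate tokens; swap in a real tokenizer if the
    embedding model's token limit must be respected exactly.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is made up purely of overlap words.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = "Chunking splits a long document into smaller pieces before embedding"
    for i, chunk in enumerate(chunk_words(sample, chunk_size=4, overlap=1)):
        print(i, chunk)
```

Each chunk returned here would then be passed to the embedding model on its own. The overlap is a design choice rather than a requirement: it costs some duplicated storage in exchange for keeping context that would otherwise be lost at chunk boundaries. Structured inputs such as code or HTML generally call for splitting on their own boundaries instead of a fixed word count.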