Parsing the documents into nodes
As we saw in Chapter 3, Kickstarting Your Journey with LlamaIndex, the next step is to split the documents into nodes. Documents are often too large to process as a whole, so we break them down into smaller units called nodes. Working at this granular level allows for better handling of our content while maintaining an accurate representation of its internal structure. This is the basic mechanism LlamaIndex uses to manage our proprietary data more easily.
Now is the time to understand how nodes are generated in LlamaIndex and what customization options we have along the way. In the previous chapter, we looked at how to create nodes manually, but that was only a way to simplify the explanation and help you better understand their mechanics. In a real application, we will most likely want to generate them automatically from the ingested documents, so that's what we'll focus on going forward. A minimal sketch of this automatic approach is shown below.
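To make this concrete, here is a minimal sketch of automatic node generation with a node parser. It assumes a recent llama-index release where node parsers live under llama_index.core.node_parser (import paths differ in older versions), and it uses two small inline documents in place of real ingested data; the chunk_size and chunk_overlap values are arbitrary illustrations, not recommendations:

```python
from llama_index.core import Document
from llama_index.core.node_parser import SentenceSplitter

# Two tiny sample documents standing in for our ingested data
documents = [
    Document(text="LlamaIndex breaks documents into nodes. Each node is a chunk of the source text."),
    Document(text="Nodes keep metadata and relationships, so the original structure is preserved."),
]

# Configure a node parser; chunk_size and chunk_overlap are measured in tokens
parser = SentenceSplitter(chunk_size=128, chunk_overlap=16)

# Automatically split the documents into nodes
nodes = parser.get_nodes_from_documents(documents)

for node in nodes:
    # Each node has its own ID and carries a slice of the original content
    print(node.node_id, "->", node.get_content()[:60])
```

Swapping in a different parser class or different chunking parameters is how we customize this step, which is exactly what the rest of this section explores.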