Hands-on – ingesting study materials into our PITS
It’s time for some practice. We now have everything we need to continue building our project. Let’s write the document_uploader.py module.
This module will take care of ingesting and preparing our available study material. The user can upload any available books, technical documentation, or existing articles to provide more context to our tutor.
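To make "ingesting and preparing" more concrete, here is a minimal plain-Python sketch of the general idea: split each text into overlapping chunks, and skip work already done by caching results on a hash of the content. It does not use LlamaIndex and is not the module's code. The function names (`split_into_chunks`, `ingest`), the whitespace-based "tokens", and the sample document are all illustrative assumptions.

```python
import hashlib


def split_into_chunks(text, chunk_size=8, overlap=2):
    """Split text into chunks of `chunk_size` whitespace-separated words,
    with `overlap` words shared between consecutive chunks.

    Assumption: words stand in for tokens; a real tokenizer counts differently.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


def ingest(documents, cache):
    """Return the chunks for each document, reusing cached results when
    exactly the same text has been processed before."""
    results = {}
    for name, text in documents.items():
        # The cache key depends only on the content, not on the file name.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in cache:
            cache[key] = split_into_chunks(text)
        results[name] = cache[key]
    return results


# Hypothetical study material standing in for an uploaded file.
docs = {"notes.txt": "one two three four five six seven eight nine ten eleven twelve"}
cache = {}
first = ingest(docs, cache)
second = ingest(docs, cache)  # served from the cache, nothing is re-split
```

The overlap keeps a little shared context between neighbouring chunks, and the content-hash cache means that re-running ingestion on unchanged material costs almost nothing. Both ideas are simplified here for illustration only.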
- First, we have the imports:
from global_settings import STORAGE_PATH, CACHE_FILE
from logging_functions import log_action
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.ingestion import IngestionPipeline, IngestionCache
from llama_index.text_splitter import TokenTextSplitter
from llama_index.extractors import SummaryExtractor
from llama_index.embeddings import OpenAIEmbedding
- Next, we must define the main function that’s responsible for handling the ingestion process. You’ll notice that it uses an ingestion pipeline...