Applying our learning
It’s time to apply these concepts to our example projects. We will use what we have learned to explore each project dataset, from using Databricks Assistant to AutoML, to creating a vector search index and exploring image data.
Technical requirements
Before you begin, review and prepare the technical requirements necessary for the hands-on work in this chapter:
- We use the missingno library to address missing numbers in our synthetic transactions project data: https://pypi.org/project/missingno/
- For the RAG project, you will need to install the following either on your cluster or in the CH4-01-Creating_VectorDB notebook. If you choose to install them in the notebook, the code is included for you:
  - typing_extensions==4.7.1
  - transformers==4.30.2
  - llama-index==0.9.3
  - langchain==0.0.319
  - unstructured[pdf,docx]==0.10.30
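If you install the RAG libraries on the cluster rather than in the notebook, it can help to confirm that the environment actually has the pinned versions before you run anything. The following is a minimal sketch and is not part of the project notebooks: parse_pin and check_pins are hypothetical helper names introduced here for illustration, and the check relies only on Python's standard importlib.metadata module.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned requirements for the RAG project, copied from the list above.
PINNED = [
    "typing_extensions==4.7.1",
    "transformers==4.30.2",
    "llama-index==0.9.3",
    "langchain==0.0.319",
    "unstructured[pdf,docx]==0.10.30",
]


def parse_pin(requirement: str) -> tuple[str, str]:
    """Split 'name[extras]==version' into (name, version), dropping extras.

    Hypothetical helper: it only understands the simple '==' pins used here.
    """
    name, _, pinned_version = requirement.partition("==")
    return name.split("[", 1)[0].strip(), pinned_version.strip()


def check_pins(requirements: list[str]) -> dict[str, str]:
    """Report 'ok', 'missing', or a version mismatch message per package."""
    report = {}
    for requirement in requirements:
        name, expected = parse_pin(requirement)
        try:
            installed = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            report[name] = "missing"
            continue
        if installed == expected:
            report[name] = "ok"
        else:
            report[name] = f"found {installed}, expected {expected}"
    return report


if __name__ == "__main__":
    for package, status in check_pins(PINNED).items():
        print(f"{package}: {status}")
```

Any package reported as missing or mismatched can then be installed with the pinned version from the list, either on the cluster or in the notebook.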
Project – Favorita Store Sales – time-series forecasting
For the Favorita Store Sales project, we use many simple...