Optimizing data loading activities by controlling the data import method
Of the items in the preceding list, this chapter focuses on using different loading strategies to design efficient loading activities. Later in this book, we will dive deeper into Python modules and cloud resources for automation (Chapters 8 and 9) and into monitoring ETL pipelines (Chapter 14). For now, let’s focus on the two most common methods for data loads: full and incremental.
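To make the distinction concrete before the demo, here is a minimal sketch of both strategies against an in-memory SQLite database. The `sales` table, its columns, and the `full_load` and `incremental_load` helpers are hypothetical names chosen for illustration and are not taken from the chapter's notebook. The incremental version assumes that new records always arrive with a higher `id` than anything already loaded, which is only one of several ways to detect new data.

```python
import sqlite3


def full_load(conn, rows):
    """Full load: clear the target table, then insert every incoming row."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM sales")
        conn.executemany("INSERT INTO sales (id, amount) VALUES (?, ?)", rows)


def incremental_load(conn, rows):
    """Incremental load: insert only rows newer than what is already stored.

    Assumption: ids increase over time, so the current MAX(id) marks
    the boundary between loaded and not-yet-loaded rows.
    """
    last_id = conn.execute("SELECT COALESCE(MAX(id), 0) FROM sales").fetchone()[0]
    new_rows = [row for row in rows if row[0] > last_id]
    with conn:
        conn.executemany("INSERT INTO sales (id, amount) VALUES (?, ?)", new_rows)
    return len(new_rows)


# Hypothetical demo table and data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")

# The first run loads everything
full_load(conn, [(1, 10.0), (2, 20.0)])

# A later run receives the same source plus one new row; only the new row is inserted
added = incremental_load(conn, [(1, 10.0), (2, 20.0), (3, 30.0)])
row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

A full load is simpler to reason about because the target always mirrors the source, but it rewrites every row on each run. An incremental load touches only new rows, at the cost of needing a reliable way to identify them.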
Creating demo data
We will use Python’s built-in sqlite3 module to walk through a demo of how full and incremental data loads can be implemented in Python. From the chapter_06/ directory in your PyCharm environment, start Jupyter from the PyCharm terminal by running the following command, and then open the Loading_Transformed_Data.ipynb file:
(venv) (base) usr@usr-MBP chapter_06 % jupyter notebook
Verify that the following code is in your Jupyter notebook:
# import modules
import sqlite3 # demo...