Technical requirements
To use the resources and code examples in this chapter, make sure that your system meets the following technical requirements:
- Software requirements:
  - Integrated development environment (IDE): We recommend PyCharm for working with Python, and this chapter may make PyCharm-specific references. However, you are free to use any Python-compatible IDE of your choice.
  - Jupyter Notebooks: Jupyter Notebooks should be installed.
  - Python: Python version 3.6 or higher should be installed.
  - Pipenv: Pipenv should be installed for managing dependencies.
- GitHub repository:
The associated code and resources for this chapter can be found in the following GitHub repository: https://github.com/PacktPublishing/Building-ETL-Pipelines-with-Python. We recommend that you fork and clone the repository to your local machine.
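If you would like to confirm these requirements before continuing, the following is a minimal sketch of an environment check. It is not part of the chapter's repository: the function name `check_environment` is hypothetical, and the script assumes that Pipenv and Jupyter expose `pipenv` and `jupyter` commands on your `PATH`, which may not hold if they are installed only inside a virtual environment.

```python
import shutil
import sys

# Minimum Python version stated in the technical requirements.
MIN_PYTHON = (3, 6)


def check_environment(min_python=MIN_PYTHON, tools=("pipenv", "jupyter")):
    """Return a dict mapping each requirement name to True (met) or False."""
    # Compare only (major, minor) against the required minimum.
    results = {"python": sys.version_info[:2] >= tuple(min_python)}
    for tool in tools:
        # shutil.which returns None when the command is not found on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print("{}: {}".format(name, "found" if ok else "MISSING"))
```

A `MISSING` entry only means the command was not found on the current `PATH`; the tool might still be available through your IDE or another environment.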
Exploring data cleansing and transformation
The extraction process is needed to select data that is significant in supporting...