A Primer on Python and the Development Environment
Whether your production environment serves a single data pipeline or many overlapping systems, the core tenets of data environment management remain the same.
This chapter breaks down the foundations of successful applications: the basic principles of the Python programming language and the way package management tools create clean, flexible, and reproducible development environments. We will walk you step by step through installing and setting up a basic Git-tracked development environment, which helps prevent module incompatibilities from derailing the deployment of your data pipelines to production.
By the end of this chapter, you will understand why Python is a powerful tool for developing highly customized data transformation ecosystems. We will cover the following topics:
- Python fundamentals
- Using Python attributes to build an application’s foundation
- Key attributes of an effective development environment
- Downloading and installing a local integrated development environment (IDE)
- Creating and cloning a Git-tracked repository into your IDE
- Managing project packages and circular dependencies with a module management system (MMS)