Introduction
In Chapter 9, Practical Python – Advanced Topics, you learned how to use GitHub to collaborate with team members. You also used conda to document and set up the dependencies for Python programs and docker to create reproducible Python environments to run your code.
We will now shift gears to data science. Demand for data science has grown rapidly, and data scientists are now among the most sought-after practitioners in the world. Most leading corporations employ data scientists to analyze and explain their data.
Data analytics focuses on the analysis of big data. As each day goes by, there is more data than ever before, far too much for any human to analyze by sight. Leading Python developers such as Wes McKinney and Travis Oliphant addressed this gap by creating specialized Python libraries, in particular pandas (McKinney) and NumPy (Oliphant), to handle big data.
Taken together, pandas and NumPy are masterful at handling big data. They are built...