Introduction
In Chapter 9, Practical Python – Advanced Topics, you looked at how to use GitHub to collaborate with team members. You also used conda
to document and set up the dependencies for Python programs and docker
to create reproducible Python environments to run our code.
We now shift gears to data science. Data science is booming like never before. Data scientists have become among the most sought-after practitioners in the world today. Most leading corporations have data scientists to analyze and explain their data.
Data analytics focuses on the analysis of big data. As each day goes by, there is more data than ever before — far too much for any human to analyze by sight. Leading Python developers such as Wes McKinney and Travis Oliphant addressed the gap by creating specialized Python libraries, in particular, pandas and NumPy to handle big data.
Taken together, pandas and NumPy are masterful at handling big data. They are built for speed, efficiency...