Summary
In this chapter, we saw how to take a dataset and analyze what it holds before moving on to cleaning it. We looked at how pandas gives us powerful tools to quickly pull in CSV data, calculate basic statistics, and clean up issues such as missing values using functions such as forward fill and backfill.
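As a quick refresher, the following is a minimal sketch of that workflow. The CSV content and its column names (major, median_pay, students) are made up for illustration and are not the chapter's dataset; the data is read from an in-memory string so that the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical data for illustration only; the column names and values are
# assumptions, not the dataset used in the chapter.
csv_text = """major,median_pay,students
Engineering,65000,1200
Biology,,900
History,38000,
Economics,52000,1100
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Basic statistics (count, mean, std, min, quartiles, max) for numeric columns.
stats = df.describe()
print(stats)

# Count the missing values before cleaning.
missing_before = int(df.isna().sum().sum())

# Forward fill copies the last valid value downward; backfill then covers
# any gaps left at the top of a column.
filled = df.ffill().bfill()
missing_after = int(filled.isna().sum().sum())

print(filled)
print(f"Missing values: {missing_before} before, {missing_after} after")
```

Whether forward fill or backfill is appropriate depends on the data: both assume that neighboring rows are a reasonable stand-in for the missing value, which is usually more defensible for ordered data such as time series than for unrelated rows like the ones above.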
We then looked at how to bring some visual flair to the underlying data by using Matplotlib to create bar charts and scatterplots. This tool is vital for getting a better sense of the data that you have and for conveying the information and analysis to colleagues.
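The two chart types mentioned above can be sketched as follows. The majors and numbers are again hypothetical values chosen for illustration, and the Agg backend is selected only so that the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

# Hypothetical values for illustration only.
majors = ["Engineering", "Biology", "History", "Economics"]
median_pay = [65000, 45000, 38000, 52000]
students = [1200, 900, 700, 1100]

fig, (ax_bar, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: compare one value across categories.
ax_bar.bar(majors, median_pay)
ax_bar.set_title("Median pay by major")
ax_bar.set_ylabel("Median pay")

# Scatterplot: look for a relationship between two numeric columns.
ax_scatter.scatter(median_pay, students)
ax_scatter.set_title("Students vs. median pay")
ax_scatter.set_xlabel("Median pay")
ax_scatter.set_ylabel("Number of students")

fig.tight_layout()
```

In a Jupyter notebook, you can drop the matplotlib.use("Agg") line and the figure will render inline below the cell.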
These two tools, pandas and Matplotlib, are ones you will come back to repeatedly. We are now equipped with Conda, Jupyter notebooks, NumPy, pandas, and Matplotlib. Using just these tools, you will already be able to answer many real-world questions, such as: do more students pick a major with higher pay? Even though we can get these simple answers...