Statistics and Visualization with NumPy and Pandas
One of the great advantages of using libraries such as NumPy and pandas is that they provide a wide range of built-in statistical and visualization methods, so we don't have to search for or write new code ourselves. Furthermore, most of these routines are implemented in pre-compiled C or Fortran code, which makes them extremely fast to execute.
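As a minimal sketch of what these built-in methods look like in practice, the snippet below computes a few summary statistics with NumPy and then with pandas. The `heights_cm` values and the `height_cm` column name are invented for illustration and do not come from any dataset used in this text.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration only
heights_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])

# NumPy: one function call per statistic
mean_height = np.mean(heights_cm)
median_height = np.median(heights_cm)
std_height = np.std(heights_cm)  # population standard deviation (ddof=0)

# pandas: describe() returns count, mean, std, min, quartiles, and max at once
df = pd.DataFrame({"height_cm": heights_cm})
summary = df["height_cm"].describe()

print(mean_height, median_height, std_height)
print(summary)
```

Note that the two libraries use different defaults here: `np.std` divides by n, while the `std` reported by pandas divides by n - 1 (the sample standard deviation), so the two numbers will not match exactly.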
Refresher of Basic Descriptive Statistics (and the Matplotlib Library for Visualization)
For any data wrangling task, it is useful to extract basic descriptive statistics from the data and create some simple plots. These plots are often the first step in identifying fundamental patterns, as well as any oddities, in the data. In any statistical analysis, descriptive statistics come first, followed by inferential statistics, which try to infer the underlying distribution or process that might have generated the data.
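To illustrate how a simple plot can expose both a pattern and an oddity, here is a hedged sketch using Matplotlib. The data is synthetic: 500 normally distributed values plus one deliberately injected outlier. The variable names and the outlier value of 150 are assumptions made for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, generated for illustration only
rng = np.random.default_rng(seed=0)
values = rng.normal(loc=50.0, scale=10.0, size=500)
values_with_outlier = np.append(values, 150.0)  # one injected oddity

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: shows the overall shape of the distribution
counts, bin_edges, _ = ax_hist.hist(values_with_outlier, bins=30)
ax_hist.set_title("Histogram")
ax_hist.set_xlabel("Value")
ax_hist.set_ylabel("Frequency")

# Box plot: draws points far from the quartiles as individual markers
ax_box.boxplot(values_with_outlier)
ax_box.set_title("Box plot")
ax_box.set_ylabel("Value")

fig.tight_layout()
```

In an interactive session, calling `plt.show()` (or simply running the cell in a notebook) displays the figure. The injected value should appear as an isolated bar at the far right of the histogram and as a lone marker above the box plot's upper whisker.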
As the inferential statistics are intimately...