EDA techniques and tools
Data scientists, analysts, and decision-makers have many EDA techniques and tools to choose from.
The following subsections cover some of the most commonly used methods.
Descriptive statistics
The simplest form of EDA involves calculating the summary statistics we covered in the previous chapter, such as the mean, median, mode, standard deviation, and range, to provide an initial understanding of the data’s central tendencies and dispersion.
Code example
The following example calculates the mean, median, mode, standard deviation, and range for a toy dataset of monthly sales figures for one year.
You can copy and paste each code snippet into Google Colab and press Shift + Enter to run it.
Open your code editor and run the following code to calculate the mean value:
import pandas as pd
# Define a toy dataset representing monthly sales figures for a year
sales_data_year1 = pd...
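As a self-contained illustration of all five summary statistics, here is a minimal sketch. The variable name `monthly_sales` and the twelve sales values are hypothetical, chosen only for illustration; they are not the dataset used in this chapter.

```python
import pandas as pd

# Hypothetical monthly sales figures (illustrative values only,
# not the chapter's dataset)
monthly_sales = pd.Series(
    [200, 220, 250, 210, 300, 280, 310, 290, 260, 240, 220, 330],
    index=["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
    name="sales",
)

# Measures of central tendency
mean_sales = monthly_sales.mean()
median_sales = monthly_sales.median()
# mode() returns a Series because a dataset can have more than one mode
mode_sales = monthly_sales.mode()

# Measures of dispersion
# std() uses the sample standard deviation (ddof=1) by default
std_sales = monthly_sales.std()
range_sales = monthly_sales.max() - monthly_sales.min()

print(f"Mean:               {mean_sales:.2f}")
print(f"Median:             {median_sales:.2f}")
print(f"Mode(s):            {mode_sales.tolist()}")
print(f"Standard deviation: {std_sales:.2f}")
print(f"Range:              {range_sales}")
```

Note that pandas computes the sample standard deviation by default. If the twelve months are treated as the full population rather than a sample, passing `ddof=0` to `std()` gives the population standard deviation instead.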