Checking the variance of a dataset
Just as we may want to know where the center of a dataset lies, we may also want to know how widely spread it is, that is, how far apart its values are from each other. The variance helps us achieve this. Unlike the mean, median, and mode, which give us a sense of where the center of the dataset lies, the variance gives us a sense of its spread or variability: it is the average of the squared differences between each value and the mean.
It is a very useful statistic, especially when used alongside a dataset’s mean, median, and mode.
To analyze the variance of a dataset, we will use the var method from the numpy library in Python.
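Before turning to the real data, the idea can be checked on a handful of numbers. The following is a minimal sketch, assuming a small made-up array (the values are hypothetical and are not taken from the COVID-19 dataset). It computes the variance by hand and compares it with np.var, including the ddof argument that switches between dividing by n and by n - 1:

```python
import numpy as np

# Hypothetical sample values, used only for illustration
values = np.array([2, 4, 4, 4, 5, 5, 7, 9])

# Variance by hand: the mean of the squared deviations from the mean
mean = values.mean()
manual_variance = ((values - mean) ** 2).mean()

# np.var divides by n by default (ddof=0, the population variance)
population_variance = np.var(values)

# With ddof=1, np.var divides by n - 1 instead (the sample variance)
sample_variance = np.var(values, ddof=1)

print(manual_variance, population_variance, sample_variance)
```

Note that the var method in pandas uses ddof=1 by default, so np.var(series) and series.var() can give slightly different results on the same column unless ddof is set explicitly.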
Getting ready
We will work with the COVID-19 cases again for this recipe.
How to do it…
We will compute the variance using the numpy library:
- Import the numpy and pandas libraries:
  import numpy as np
  import pandas as pd
- Load the .csv into a dataframe using read_csv. Then subset the dataframe to include only...