Identifying the standard deviation of a dataset
The standard deviation is the square root of the variance. It is typically more intuitive because it is expressed in the same units as the data, for example, kilometers (km). The variance, by contrast, is expressed in squared units, for example, kilometers squared (km²), which makes it harder to interpret.
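As a quick illustration of that relationship, the sketch below computes both statistics for a small, made-up set of distances in kilometers. The array `distances_km` is purely illustrative and is not part of the recipe's dataset.

```python
import numpy as np

# Hypothetical distances in kilometers (illustrative values only)
distances_km = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Variance is the mean squared deviation from the mean, so its unit is km squared
variance_km2 = np.var(distances_km)

# Standard deviation is the square root of the variance, so its unit is km
std_km = np.std(distances_km)

print(variance_km2)  # 4.0
print(std_km)        # 2.0
```

Taking the square root brings the measure of spread back to the unit of the original values, which is why the standard deviation is usually the figure that gets reported.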
To analyze the standard deviation of a dataset, we will use the std method from the numpy library in Python.
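One detail worth knowing before using it: NumPy's function is named `np.std`, and by default it computes the population standard deviation (`ddof=0`), whereas the pandas `std` method defaults to the sample standard deviation (`ddof=1`). The sketch below shows the difference on a small, made-up list of values; the name `values` is an assumption for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical values (illustrative only, not the recipe's dataset)
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# NumPy default: population standard deviation (divides by n)
population_sd = np.std(values)

# Sample standard deviation (divides by n - 1)
sample_sd = np.std(values, ddof=1)

# pandas defaults to ddof=1, so this matches the sample value
pandas_sd = pd.Series(values).std()

print(population_sd, sample_sd, pandas_sd)
```

If the two libraries appear to disagree on the same column, the `ddof` default is the likely reason; passing `ddof` explicitly makes the choice visible.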
Getting ready
We will work with the COVID-19 cases again for this recipe.
How to do it…
We will compute the standard deviation using the numpy library:
- Import the numpy and pandas libraries:
  import numpy as np
  import pandas as pd
- Load the .csv into a dataframe using read_csv. Then subset the dataframe to include only relevant columns:
  covid_data = pd.read_csv("covid-data.csv")
  covid_data = covid_data[[...