Calculating summaries by group with NumPy arrays
We can accomplish much of what we did with itertuples in the previous recipe by using NumPy arrays instead. We can also use NumPy arrays to get summary values for subsets of our data.
Getting ready
We will work again with the COVID-19 daily data and the Brazil land temperature data.
How to do it…
We copy DataFrame values to a NumPy array. We then navigate over the array, calculating totals by group and checking for unexpected changes in values:
- Import pandas and numpy, and load the COVID-19 and land temperature data:
import pandas as pd
import numpy as np
coviddaily = pd.read_csv("data/coviddaily.csv", parse_dates=["casedate"])
ltbrazil = pd.read_csv("data/ltbrazil.csv")
- Create a list of locations:
loclist = coviddaily.location.unique().tolist()
- Use a NumPy array to calculate sums by location.
Create a NumPy array of the location and new cases data...
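As a rough illustration of this step, the following sketch sums new cases by location from a NumPy array. It is one possible way to do it, not necessarily the recipe's exact code. The `toy` DataFrame and its values are made up so the example runs without the data files, and the names `casevalues`, `rowlist`, and `casetotals` are assumptions chosen for the illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for coviddaily; the values are invented for illustration.
toy = pd.DataFrame({
    "location": ["Brazil", "Brazil", "Chile", "Chile", "Chile"],
    "new_cases": [10.0, 5.0, 3.0, 0.0, 4.0],
})

# List of distinct locations, in order of first appearance.
loclist = toy.location.unique().tolist()

# Copy the two columns to a NumPy array. Mixing strings and floats
# yields an array with dtype=object.
casevalues = toy[["location", "new_cases"]].to_numpy()

# For each location, select its rows with a boolean mask and sum new cases.
rowlist = []
for locitem in loclist:
    mask = casevalues[:, 0] == locitem
    rowlist.append(float(casevalues[mask, 1].astype(float).sum()))

# Collect the totals in a DataFrame for inspection.
casetotals = pd.DataFrame(
    list(zip(loclist, rowlist)), columns=["location", "casetotals"]
)
print(casetotals)
```

With the real data, `toy` would be replaced by `coviddaily`. The boolean mask avoids indexing the array row by row in Python, which tends to matter more as the number of rows grows.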