Aggregating weekly crime and traffic accidents separately
The Denver crime dataset has all crime and traffic accidents together in one table, and separates them through the binary columns IS_CRIME and IS_TRAFFIC. The .resample method allows you to group by a period of time and aggregate specific columns separately.
In this recipe, we will use the .resample method to group by each quarter of the year and then sum up the number of crimes and traffic accidents separately.
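As a rough illustration of the idea, the following sketch resamples a tiny made-up table by quarter and sums the two binary columns. The five rows are invented for illustration and are not taken from the Denver data; the names toy and quarterly are hypothetical. The sketch assumes the 'QS' (quarter start) offset alias, chosen only because the spelling of the quarter-end alias differs between pandas versions, so each group here is labeled by the first day of its quarter.

```python
import pandas as pd

# Hypothetical miniature stand-in for the crime table (invented rows,
# not the real Denver data). Each row is either a crime or a traffic accident.
toy = (
    pd.DataFrame(
        {
            "REPORTED_DATE": pd.to_datetime(
                ["2017-01-05", "2017-02-10", "2017-03-20",
                 "2017-04-02", "2017-06-30"]
            ),
            "IS_CRIME": [1, 0, 1, 1, 0],
            "IS_TRAFFIC": [0, 1, 0, 0, 1],
        }
    )
    .set_index("REPORTED_DATE")  # resample needs a datetime-like index
    .sort_index()
)

# Group rows into quarters and sum each binary column within its quarter.
# 'QS' labels every group with the quarter's first day (an assumption made
# for this sketch; see the note above).
quarterly = toy.resample("QS")[["IS_CRIME", "IS_TRAFFIC"]].sum()
print(quarterly)
```

Because both columns hold only 0 and 1, each quarterly sum is a count of the crimes or traffic accidents reported in that quarter.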
How to do it…
- Read in the crime hdf5 dataset, set the index as REPORTED_DATE, and then sort it to increase performance for the rest of the recipe:
  >>> crime = (pd.read_hdf('data/crime.h5', 'crime')
  ...     .set_index('REPORTED_DATE')
  ...     .sort_index()
  ... )
- Use the .resample method to group by each quarter of the year and then sum the IS_CRIME and IS_TRAFFIC columns for each group:
  >>> (crime
  ...     .resample...