Downsampling time series data
Downsampling reduces the number of samples in the data. The data points that fall into each new, larger interval are combined by an aggregation such as a sum or a mean. Let's imagine a busy airport with thousands of people passing through every hour. The airport administration has installed a visitor counter in the main area to get an impression of how busy the airport is.
They receive a reading from the counter device every minute. Here are the hypothetical measurements for a day: 600 one-minute readings beginning at 08:00, so the last reading falls at 17:59, just before 18:00:
>>> import numpy as np
>>> import pandas as pd
>>> rng = pd.date_range('4/29/2015 8:00', periods=600, freq='min')
>>> ts = pd.Series(np.random.randint(0, 100, len(rng)), index=rng)
>>> ts.head()
2015-04-29 08:00:00     9
2015-04-29 08:01:00    60
2015-04-29 08:02:00    65
2015-04-29 08:03:00    25
2015-04-29 08:04:00    19
To get a better picture of the day, we can downsample this time series to larger intervals, for example, 10 minutes. We can choose an aggregation function...
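As a minimal sketch of the 10-minute downsampling described above, the following assumes pandas' `resample` method. The sum and mean are chosen here only for illustration and are not necessarily the aggregation the text goes on to use. The seeded generator `gen` and the result names are hypothetical, introduced so that the example is reproducible.

```python
import numpy as np
import pandas as pd

# Seeded generator so the illustration is reproducible (the seed is arbitrary).
gen = np.random.default_rng(0)

# 600 one-minute readings starting at 08:00, as in the example above.
rng = pd.date_range('2015-04-29 08:00', periods=600, freq='min')
ts = pd.Series(gen.integers(0, 100, len(rng)), index=rng)

# Group the readings into 10-minute bins and aggregate each bin.
per_10min_sum = ts.resample('10min').sum()    # total visitors per bin
per_10min_mean = ts.resample('10min').mean()  # average reading per bin

print(len(per_10min_sum))  # 600 readings / 10 per bin = 60 bins
print(per_10min_sum.head())
```

Each bin is labeled with its left edge by default, so the first bin is stamped 08:00 and covers the readings from 08:00 through 08:09. Summing keeps the total visitor count intact, while the mean is better suited to comparing how busy different intervals were.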