Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Deep Learning for Time Series Cookbook

You're reading from   Deep Learning for Time Series Cookbook Use PyTorch and Python recipes for forecasting, classification, and anomaly detection

Arrow left icon
Product type Paperback
Published in Mar 2024
Publisher Packt
ISBN-13 9781805129233
Length 274 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Authors (2):
Arrow left icon
Luís Roque Luís Roque
Author Profile Icon Luís Roque
Luís Roque
Vitor Cerqueira Vitor Cerqueira
Author Profile Icon Vitor Cerqueira
Vitor Cerqueira
Arrow right icon
View More author details
Toc

Table of Contents (12) Chapters Close

Preface 1. Chapter 1: Getting Started with Time Series 2. Chapter 2: Getting Started with PyTorch FREE CHAPTER 3. Chapter 3: Univariate Time Series Forecasting 4. Chapter 4: Forecasting with PyTorch Lightning 5. Chapter 5: Global Forecasting Models 6. Chapter 6: Advanced Deep Learning Architectures for Time Series Forecasting 7. Chapter 7: Probabilistic Time Series Forecasting 8. Chapter 8: Deep Learning for Time Series Classification 9. Chapter 9: Deep Learning for Time Series Anomaly Detection 10. Index 11. Other Books You May Enjoy

Resampling a time series

Time series resampling is the process of changing the frequency of a time series, for example, from hourly to daily. This task is a common preprocessing step in time series analysis and this recipe shows how to do it with pandas.

Getting ready

Changing the frequency of a time series is a common preprocessing step before analysis. For example, the time series used in the preceding recipes has an hourly granularity. Yet, our goal may be to study daily variations. In such cases, we can resample the data into a different period. Resampling is also an effective way of handling irregular time series – those that are collected in irregularly spaced periods.

How to do it…

We’ll go over two different scenarios where resampling a time series may be useful: when changing the sampling frequency and when dealing with irregular time series.

The following code resamples the time series into a daily granularity:

series_daily = series.resample('D').sum()

The daily granularity is specified with the input D to the resample () method. The values of each corresponding day are summed together using the sum() method.

Most time series analysis methods work under the assumption that the time series is regular; in other words, it is collected in regularly spaced time intervals (for example, every day). But some time series are naturally irregular. For instance, the sales of a retail product occur at arbitrary timestamps as customers arrive at a store.

Let us simulate sale events with the following code:

import numpy as np
import pandas as pd
n_sales = 1000
start = pd.Timestamp('2023-01-01 09:00')
end = pd.Timestamp('2023-04-01')
n_days = (end – start).days + 1
irregular_series = pd.to_timedelta(np.random.rand(n_sales) * n_days,
                                   unit='D') + start

The preceding code creates 1000 sale events from 2023-01-01 09:00 to 2023-04-01. A sample of this series is shown in the following table:

ID

Timestamp

1

2023-01-01 15:18:10

2

2023-01-01 15:28:15

3

2023-01-01 16:31:57

4

2023-01-01 16:52:29

5

2023-01-01 23:01:24

6

2023-01-01 23:44:39

Table 1.2: Sample of an irregular time series

Irregular time series can be transformed into a regular frequency by resampling. In the case of sales, we will count how many sales occurred each day:

ts_sales = pd.Series(0, index=irregular_series)
tot_sales = ts_sales.resample('D').count()

First, we create a time series of zeros based on the irregular timestamps (ts_sales). Then, we resample this dataset into a daily frequency (D) and use the count() method to count how many observations occur each day. The tot_sales reconstructed time series can be used for other tasks, such as forecasting daily sales.

How it works…

A sample of the reconstructed time series concerning solar radiation is shown in the following table:

Datetime

Incoming Solar

2007-10-01

1381.5

2007-10-02

3953.2

2007-10-03

3098.1

2007-10-04

2213.9

Table 1.3: Solar radiation time series after resampling

Resampling is a cornerstone preprocessing step in time series analysis. This technique can be used to change a time series into a different granularity or to convert an irregular time series into a regular one.

The summary statistic is an important input to consider. In the first case, we used sum to add the hourly solar radiation values observed each day. In the case of the irregular time series, we used the count() method to count how many events occurred in each period. Yet, you can use other summary statistics according to your needs. For example, using the mean would take the average value of each period to resample the time series.

There’s more…

We resampled to daily granularity. A list of available options is available here: https://pandas.pydata.org/docs/user_guide/timeseries.html#dateoffset-objects.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime