Dealing with heteroskedasticity
In this recipe, we delve into the variance of time series. The variance of a time series is a measure of how spread out the data is and how this dispersion evolves over time. You’ll learn how to handle data with a changing variance.
Getting ready
The variance of a time series can change over time, which also violates stationarity. In such cases, the time series is referred to as heteroskedastic and usually shows a long-tailed distribution, meaning the data is left- or right-skewed. This condition is problematic because it impacts the training of neural networks and other models.
How to do it…
Dealing with non-constant variance is a two-step process. First, we use statistical tests to check whether a time series is heteroskedastic. Then, we use transformations such as the logarithm to stabilize the variance.
We can detect heteroskedasticity using statistical tests such as the White test or the Breusch-Pagan test. The following code implements these tests based on the statsmodels library:
import statsmodels.stats.api as sms
from statsmodels.formula.api import ols

series_df = series_daily.reset_index(drop=True).reset_index()
series_df.columns = ['time', 'value']
series_df['time'] += 1

olsr = ols('value ~ time', series_df).fit()

_, pval_white, _, _ = sms.het_white(olsr.resid, olsr.model.exog)
_, pval_bp, _, _ = sms.het_breuschpagan(olsr.resid, olsr.model.exog)
The preceding code follows these steps:
- Import the ols and stats modules from statsmodels.
- Create a DataFrame based on the values of the time series and the row at which they were collected (1 for the first observation).
- Create a linear model that relates the values of the time series with the time column.
- Run het_white (White) and het_breuschpagan (Breusch-Pagan) to apply the variance tests.
The output of each test is a p-value, where the null hypothesis posits that the time series has constant variance. So, if the p-value is below the significance level, we reject the null hypothesis and assume the series is heteroskedastic.
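For example, here's a minimal sketch of how the p-values can be interpreted, assuming a significance level of 0.05:

alpha = 0.05  # assumed significance level

# reject the null hypothesis of constant variance if either test is significant
is_heteroskedastic = (pval_white < alpha) or (pval_bp < alpha)
print(f'Heteroskedastic: {is_heteroskedastic}')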
The simplest way to deal with non-constant variance is by transforming the data using the logarithm. This operation can be implemented as follows:
import numpy as np

class LogTransformation:
    @staticmethod
    def transform(x):
        xt = np.sign(x) * np.log(np.abs(x) + 1)
        return xt

    @staticmethod
    def inverse_transform(xt):
        x = np.sign(xt) * (np.exp(np.abs(xt)) - 1)
        return x
The preceding code is a Python class called LogTransformation. It contains two methods: transform() and inverse_transform(). The first transforms the data using the logarithm, and the second reverts that operation.
We apply the transform() method to the time series as follows:
series_log = LogTransformation.transform(series_daily)
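As a quick sanity check (a sketch assuming series_daily is a numeric pandas Series), the operation can be reverted with inverse_transform():

# revert the log transformation; the result should match series_daily up to floating-point error
series_back = LogTransformation.inverse_transform(series_log)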
The log is a particular case of the Box-Cox transformation, which is available in the scipy library. Note that stats.boxcox() requires strictly positive data. You can apply this method as follows:
from scipy import stats

series_transformed, lmbda = stats.boxcox(series_daily)
The stats.boxcox() method estimates a transformation parameter, lmbda, which can be used to revert the operation.
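For instance, here's a minimal sketch of how to revert the transformation with scipy.special.inv_boxcox, assuming series_transformed and lmbda come from the previous step:

from scipy.special import inv_boxcox

# invert the Box-Cox transformation using the estimated lmbda parameter
series_restored = inv_boxcox(series_transformed, lmbda)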
How it works…
The transformations outlined in this recipe stabilize the variance of a time series. They also bring the data distribution closer to the Normal distribution. These transformations are especially useful for neural networks as they help avoid saturation areas. In neural networks, saturation occurs when the model becomes insensitive to different inputs, thus compromising the training process.
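As an optional check (a sketch assuming series_daily is available from earlier in the recipe), you can compare the skewness of the data before and after the log transformation; a value closer to zero indicates a more symmetric, more Normal-like distribution:

from scipy import stats

print(stats.skew(series_daily))
print(stats.skew(LogTransformation.transform(series_daily)))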
There’s more…
The Yeo-Johnson power transformation is similar to the Box-Cox transformation but allows for negative values in the time series. You can learn more about this method at the following link: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.yeojohnson.html.
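As a brief sketch, the Yeo-Johnson transformation can be applied in much the same way as Box-Cox; when no parameter is supplied, the estimated lmbda is returned alongside the transformed series:

from scipy import stats

# apply the Yeo-Johnson transformation, which also works with negative values
series_yj, lmbda_yj = stats.yeojohnson(series_daily)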
See also
You can learn more about the importance of the logarithm transformation in the following reference:
Bandara, Kasun, Christoph Bergmeir, and Slawek Smyl. “Forecasting across time series databases using recurrent neural networks on groups of similar series: A clustering approach.” Expert Systems with Applications 140 (2020): 112896.