Probabilistic forecasting with an LSTM
This recipe will walk you through building an LSTM neural network for probabilistic forecasting using PyTorch Lightning.
Getting ready
In this recipe, we’ll introduce probabilistic forecasting with LSTM networks. This approach combines the strengths of LSTM models in capturing long-term dependencies within sequential data with the nuanced perspective of probabilistic forecasting. This method goes beyond traditional point estimates by predicting a range of possible future outcomes, each accompanied by a probability. This means that we are incorporating uncertainty into forecasts.
This recipe uses the same dataset that we used in Chapter 4, in the Feedforward neural networks for multivariate time series forecasting recipe. We’ll also use the same data module we created in that recipe, which is called MultivariateSeriesDataModule.
Let’s explore how to use this data module to build an LSTM model for probabilistic forecasting.
How to do it…
In this subsection, we’ll define a probabilistic LSTM model that outputs the predictive mean and standard deviation for each forecasted point of the time series. This technique involves designing the LSTM model to predict parameters that define a probability distribution for future outcomes rather than outputting a single value. The model is usually configured to output parameters of a specific distribution, such as the mean and variance for a Gaussian distribution. These describe the expected value and the spread of future values, respectively:
- Let’s start by defining a callback:
class LossTrackingCallback(Callback):
    def __init__(self):
        self.train_losses = []
        self.val_losses = []

    def on_train_epoch_end(self, trainer, pl_module):
        if trainer.logged_metrics.get("train_loss_epoch"):
            self.train_losses.append(
                trainer.logged_metrics["train_loss_epoch"].item())

    def on_validation_epoch_end(self, trainer, pl_module):
        if trainer.logged_metrics.get("val_loss_epoch"):
            self.val_losses.append(
                trainer.logged_metrics["val_loss_epoch"].item())
The LossTrackingCallback class is used to monitor the training and validation losses throughout the epochs. This is important for diagnosing the learning process of the model, identifying overfitting, and deciding when to stop training.
- Then, we must build the LSTM model based on PyTorch Lightning’s LightningModule class:
class ProbabilisticLSTM(LightningModule):
    def __init__(self, input_size, hidden_size, seq_len,
                 num_layers=2):
        super().__init__()
        self.save_hyperparameters()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers,
                            batch_first=True)
        self.fc_mu = nn.Linear(hidden_size, 1)
        self.fc_sigma = nn.Linear(hidden_size, 1)
        self.hidden_size = hidden_size
        self.softplus = nn.Softplus()

    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        lstm_out = lstm_out[:, -1, :]
        mu = self.fc_mu(lstm_out)
        sigma = self.softplus(self.fc_sigma(lstm_out))
        return mu, sigma
The ProbabilisticLSTM class defines the LSTM architecture for our probabilistic forecasts. The class includes layers to compute the predictive mean (fc_mu) and standard deviation (fc_sigma) of the forecast distribution. The standard deviation is passed through a Softplus() activation function to ensure it is always positive, reflecting the nature of standard deviation.
- The following code implements the training and validation steps, along with the network configuration parameters:
    def training_step(self, batch, batch_idx):
        x, y = batch[0]["encoder_cont"], batch[1][0]
        mu, sigma = self.forward(x)
        dist = torch.distributions.Normal(mu, sigma)
        loss = -dist.log_prob(y).mean()
        self.log("train_loss", loss, on_step=True, on_epoch=True,
                 prog_bar=True, logger=True)
        return {"loss": loss, "log": {"train_loss": loss}}

    def validation_step(self, batch, batch_idx):
        x, y = batch[0]["encoder_cont"], batch[1][0]
        mu, sigma = self.forward(x)
        dist = torch.distributions.Normal(mu, sigma)
        loss = -dist.log_prob(y).mean()
        self.log("val_loss", loss, on_step=True, on_epoch=True,
                 prog_bar=True, logger=True)
        return {"val_loss": loss}

    def configure_optimizers(self):
        optimizer = optim.Adam(self.parameters(), lr=0.0001)
        scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, "min")
        return {
            "optimizer": optimizer,
            "lr_scheduler": scheduler,
            "monitor": "val_loss",
        }
- After defining the model architecture, we initialize the data module and set up training callbacks. As we saw previously, the EarlyStopping callback is a valuable tool for preventing overfitting by halting the training process once the model ceases to improve on the validation set. The ModelCheckpoint callback ensures that we capture and save the best version of the model based on its validation performance. Together, these callbacks optimize the training process, aiding in developing a robust and well-tuned model:

datamodule = MultivariateSeriesDataModule(data=mvtseries)
datamodule.setup()

model = ProbabilisticLSTM(
    input_size=input_size,
    hidden_size=hidden_size,
    seq_len=seq_len,
)

early_stop_callback = EarlyStopping(monitor="val_loss", patience=5)
checkpoint_callback = ModelCheckpoint(
    dirpath="./model_checkpoint/",
    save_top_k=1,
    monitor="val_loss",
)
loss_tracking_callback = LossTrackingCallback()

trainer = Trainer(
    max_epochs=100,
    callbacks=[
        early_stop_callback,
        checkpoint_callback,
        loss_tracking_callback,
    ],
)
trainer.fit(model, datamodule)
Using the Trainer class from PyTorch Lightning simplifies the training process, handling the complex training loops internally and allowing us to focus on defining the model and its behavior. It also increases the code’s readability and maintainability, making it easier to experiment with different model configurations.
- After training, it is important to assess the model’s performance and visualize its probabilistic forecasts. Plotting the forecasted means, along with their uncertainty intervals, against the actual values offers a clear depiction of the model’s predictive power and the inherent uncertainty in its predictions. We built a visualization framework to plot the forecasts. You can check the functions at the following link: https://github.com/PacktPublishing/Deep-Learning-for-Time-Series-Data-Cookbook.
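For illustration, a minimal version of such a plotting utility might look like the following sketch. The function name plot_probabilistic_forecast and the NumPy arrays y_true, mu, and sigma are hypothetical placeholders rather than code taken from the repository:

import matplotlib.pyplot as plt
import numpy as np

def plot_probabilistic_forecast(y_true, mu, sigma):
    # one position on the x-axis per forecasted point
    steps = np.arange(len(y_true))
    plt.plot(steps, y_true, color="blue", label="True values")
    plt.plot(steps, mu, "r--", label="Forecasted mean")
    # shaded band: forecasted mean +/- one standard deviation
    plt.fill_between(steps, mu - sigma, mu + sigma,
                     color="red", alpha=0.2,
                     label="Uncertainty interval")
    plt.legend()
    plt.show()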
The following figure illustrates the true values of our time series in blue, with the forecasted means depicted by the dashed red line:
Figure 7.4: Probabilistic forecasts with uncertainty intervals and true values
The shaded area represents the uncertainty interval, calculated as one standard deviation around the forecasted mean. This probabilistic approach to forecasting provides a more comprehensive picture than point estimates as it accounts for the variability and uncertainty inherent in the time series data. The overlap between the uncertainty intervals and the actual values indicates areas where the model has higher confidence in its predictions. Conversely, wider intervals may suggest periods of more significant uncertainty, potentially due to inherent noise in the data or complex underlying dynamics that the model finds more challenging to capture.
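One simple way to quantify this overlap, assuming the true values, forecasted means, and standard deviations are available as NumPy arrays, is to compute the empirical coverage of the ±1 standard deviation band, that is, the fraction of actual values that fall inside it. The helper below is an illustrative sketch, not code from the book’s repository:

import numpy as np

def empirical_coverage(y_true, mu, sigma):
    # fraction of observations that fall inside mu +/- sigma
    inside = (y_true >= mu - sigma) & (y_true <= mu + sigma)
    return inside.mean()

# For a well-calibrated Gaussian forecast, roughly 68% of the
# observations should fall within one standard deviation.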
Moreover, the following figure provides insights into the training dynamics of our probabilistic LSTM model:
Figure 7.5: Training and validation loss over epochs, demonstrating the learning progress of the probabilistic LSTM model
The relatively stable and low validation loss suggests that our model generalizes well without overfitting the training data.
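A figure such as Figure 7.5 can be drawn directly from the losses collected by our LossTrackingCallback. The following sketch assumes the loss_tracking_callback object created earlier is still in scope; the exact styling used in the book’s repository may differ:

import matplotlib.pyplot as plt

plt.plot(loss_tracking_callback.train_losses, label="Training loss")
plt.plot(loss_tracking_callback.val_losses, label="Validation loss")
plt.xlabel("Epoch")
plt.ylabel("Negative log-likelihood")
plt.legend()
plt.show()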
How it works…
The probabilistic LSTM model extends beyond traditional point prediction models. Unlike point forecasts, which output a single expected value, this model predicts a full distribution characterized by mean and standard deviation parameters.
This probabilistic approach provides a richer representation by capturing the uncertainty inherent in the data. The mean of the distribution gives the expected value of the forecast, while the standard deviation quantifies the confidence in the prediction, expressing the expected variability around the mean.
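For example, once the model returns mu and sigma for a forecast point, a symmetric prediction interval can be built from the Gaussian quantiles. The sketch below assumes a trained ProbabilisticLSTM instance called model and an input tensor x; neither these variable names nor the 95% level are prescribed by the recipe:

import torch

model.eval()
with torch.no_grad():
    mu, sigma = model(x)

# 95% prediction interval under the predicted Gaussian:
# mean +/- 1.96 standard deviations
lower = mu - 1.96 * sigma
upper = mu + 1.96 * sigma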
To train this model, we use a loss function that differs from those used in point prediction models. Instead of using MSE or MAE, which minimize the difference between predicted and actual values, the probabilistic LSTM employs a negative log-likelihood loss function. This loss function, often called the probabilistic loss, maximizes the likelihood of the observed data under the predicted distribution.
This probabilistic loss function is particularly suited for uncertainty estimation as it directly penalizes the divergence between the predicted probability distribution and the observed values. When the predicted distribution assigns a high probability to the actual observed values, the negative log-likelihood is low, and thus the loss is low.
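To make this concrete, here is a small, self-contained example with made-up numbers showing that a predicted distribution centered close to the observation yields a much lower negative log-likelihood than one centered far away:

import torch

y = torch.tensor([10.0])  # observed value

# distribution centered near the observation
good = torch.distributions.Normal(torch.tensor([9.8]), torch.tensor([1.0]))
# distribution centered far from the observation
bad = torch.distributions.Normal(torch.tensor([5.0]), torch.tensor([1.0]))

print(-good.log_prob(y).mean())  # small loss
print(-bad.log_prob(y).mean())   # much larger loss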