Practical Time Series Analysis

You're reading from Practical Time Series Analysis: Master Time Series Data Processing, Visualization, and Modeling using Python

Product type: Paperback
Published in: Sep 2017
Publisher: Packt
ISBN-13: 9781788290227
Length: 244 pages
Edition: 1st Edition
Authors (2): Avishek Pal, PKS Prakash

Models for time series analysis

The purpose of time series analysis is to develop a mathematical model that can explain the observed behavior of a time series and possibly forecast the future state of the series. The chosen model should be able to account for one or more of the internal structures that might be present. To this end, we will give an overview of the following general models that are often used as building blocks of time series analysis:

  • Zero mean models
  • Random walk
  • Trend models
  • Seasonality models

Zero mean models

A zero mean model has a constant mean and a constant variance and shows no predictable trend or seasonality. Observations from a zero mean model are assumed to be independent and identically distributed (iid) and represent the random noise around a fixed mean, which has been subtracted from the time series as a constant term.

Let X_1, X_2, ..., X_n be the random variables corresponding to n observations of a zero mean model. If x_1, x_2, ..., x_n are n observations from the zero mean time series, then, because the observations are independent, their joint distribution is the product of the probability mass functions at every time index:

P(X_1 = x_1, X_2 = x_2, ..., X_n = x_n) = f(X_1 = x_1) f(X_2 = x_2) ... f(X_n = x_n)

Most commonly, f(X_t = x_t) is modeled by a normal distribution of mean zero and variance σ^2, which is assumed to be the irreducible error of the model and is hence treated as random noise. The following figure shows a zero-mean series of normally distributed random noise of unit variance:

Figure 1.12: Zero-mean time series

The preceding plot is generated by the following code:

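import os
import numpy as np
%matplotlib inline
from matplotlib import pyplot as plt
import seaborn as sns
# Working directory used in the book; adjust the path to your own setup
os.chdir('D:/Practical Time Series/')
# Draw 100 iid samples from a normal distribution with zero mean and unit variance
zero_mean_series = np.random.normal(loc=0.0, scale=1., size=100)
# Plot the samples against their time index
plt.figure(figsize=(5.5, 5.5))
plt.plot(zero_mean_series)
plt.title('Zero mean model')
plt.xlabel('Time index')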

The zero mean model with constant variance represents random noise that can take infinitely many possible real values, and is suited to representing irregular variations in the time series of a continuous variable. However, in many cases the observable state of the system or process is discrete in nature and confined to a finite number of possible values s_1, s_2, ..., s_m. In such cases, the observed variable X is assumed to obey a multinomial distribution, P(X = s_1) = p_1, P(X = s_2) = p_2, ..., P(X = s_m) = p_m, such that p_1 + p_2 + ... + p_m = 1. Such a time series is a discrete stochastic process.

Multiple throws of a die over time are an example of a discrete stochastic process with six possible outcomes for any throw. A simpler discrete stochastic process is a binary process, such as tossing a coin, which has only two outcomes, namely head and tail. The following figure shows 100 runs from a simulated process of throwing a biased die for which the probability of turning up an even face is higher than that of showing an odd face. Note the higher number of occurrences of even faces, on average, compared to the number of occurrences of odd faces.
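A biased die of this kind could be simulated along the following lines. This is only a minimal sketch: the face probabilities in probs, the seed, and all variable names are assumptions chosen for illustration, not values taken from the book.

```python
import numpy as np

# Hypothetical probabilities for faces 1..6; even faces are made more likely
faces = np.arange(1, 7)
probs = np.array([0.1, 0.25, 0.1, 0.25, 0.05, 0.25])

rng = np.random.default_rng(seed=0)  # fixed seed so the run is reproducible
throws = rng.choice(faces, size=100, p=probs)

# Count how often each face occurred (index 0 of bincount is unused)
counts = np.bincount(throws, minlength=7)[1:]
even_count = counts[1::2].sum()  # faces 2, 4, 6
odd_count = counts[0::2].sum()   # faces 1, 3, 5

print("Counts per face:", dict(zip(faces.tolist(), counts.tolist())))
print("Even faces:", even_count, "Odd faces:", odd_count)
```

Because the assumed probabilities put a total weight of 0.75 on the even faces, the even count should clearly exceed the odd count over 100 throws.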

Random walk

A random walk is given as a sum of iid random variables, each with zero mean and constant variance. Based on this definition, the realization of a random walk at time index t is given by the sum S_t = x_1 + x_2 + ... + x_t. The following figure shows the random walk obtained from iids that vary according to a normal distribution of zero mean and unit variance.

The random walk is important because if such behavior is found in a time series, it can easily be reduced to a zero mean model by taking differences of the observations at two consecutive time indices: S_t - S_{t-1} = x_t, which is iid with zero mean and constant variance.

Figure 1.13: Random walk time series

The random walk in the preceding figure can be generated by taking the cumulative sum of the zero mean model discussed in the previous section. The following code implements this:

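# The cumulative sum of the iid noise gives the random walk
random_walk = np.cumsum(zero_mean_series)
plt.figure(figsize=(5.5, 5.5))
# sns.tsplot is no longer available in current seaborn releases,
# so the series is drawn on the current matplotlib axes instead
g = plt.gca()
g.plot(random_walk)
g.set_title('Random Walk')
g.set_xlabel('Time index')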
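The differencing property of the random walk can be checked numerically. The sketch below is self-contained and regenerates its own noise; the seed and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
noise = rng.normal(loc=0.0, scale=1.0, size=100)  # iid zero mean series
walk = np.cumsum(noise)                           # random walk S_t

# First differences S_t - S_{t-1}; the first observation has no predecessor,
# so the differenced series is one element shorter than the walk
recovered = np.diff(walk)

print("Differences match the noise:", np.allclose(recovered, noise[1:]))
print("Mean:", recovered.mean(), "Variance:", recovered.var())
```

Up to floating point error, the differences reproduce the original noise from the second observation onward, which is why differencing reduces a random walk to a zero mean model.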

Trend models

This type of model aims to capture the long run trend in the time series, which can be fitted by a linear regression on the time index. When the time series does not exhibit any periodic or seasonal fluctuations, it can be expressed simply as the sum of the trend and the zero mean model, x_t = μ(t) + y_t, where μ(t) is the time-dependent long run trend of the series.

The choice of the trend model μ(t) is critical to correctly capturing the behavior of the time series. Exploratory data analysis often provides hints for hypothesizing whether the model should be linear or non-linear in t. A linear model is simply μ(t) = wt + b, whereas a quadratic model is μ(t) = w_1 t + w_2 t^2 + b. Sometimes the trend can be hypothesized as a more complex relationship in terms of the time index, such as μ(t) = w_0 t^p + b.

The weights and biases in trend models such as the ones discussed previously are obtained by running a regression with t as the explanatory variable and the observed series as the explained variable, so that the fitted values estimate μ(t). The residuals x_t - μ(t) of the trend model are considered to be the irreducible noise and a realization of the zero mean component y_t.
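A linear trend of the form μ(t) = wt + b could be fitted as sketched below. The synthetic series, its parameters (true_w, true_b), and the use of np.polyfit are assumptions made for illustration; the book may fit its trend models with a different tool.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
t = np.arange(200)

# Synthetic series: assumed linear trend plus unit-variance zero mean noise
true_w, true_b = 0.5, 10.0
x = true_w * t + true_b + rng.normal(0.0, 1.0, size=t.size)

# Least-squares fit of a degree-1 polynomial; coefficients come back as [w, b]
w_hat, b_hat = np.polyfit(t, x, deg=1)

# The fitted trend and the residuals x_t - mu(t)
trend = w_hat * t + b_hat
residuals = x - trend

print("Estimated w:", w_hat, "Estimated b:", b_hat)
print("Residual mean:", residuals.mean(), "Residual std:", residuals.std())
```

Passing deg=2 instead would fit the quadratic form, with np.polyfit returning the coefficients from the highest power down.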

Seasonality models

Seasonality manifests as periodic and repetitive fluctuations in a time series and is hence modeled as a weighted sum of sine waves of known periodicity. Assuming that the long run trend has been removed by a trend line, the seasonality model can be expressed as x_t = s_t + y_t, where s_t is the seasonal variation with known periodicity α.

Seasonality models are also known as harmonic regression models, as they attempt to fit a sum of multiple sine waves.
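One way to fit a single harmonic by least squares is sketched below. The period, amplitude, phase, and noise level of the synthetic series are assumptions chosen for illustration, and the sine and cosine design matrix is one common formulation of harmonic regression rather than necessarily the book's own.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
t = np.arange(240)
period = 12.0  # assumed known, for example monthly data with a yearly cycle
omega = 2 * np.pi / period

# Synthetic detrended series: one sine wave plus zero mean noise
x = 3.0 * np.sin(omega * t + 0.7) + rng.normal(0.0, 0.5, size=t.size)

# Design matrix: sine and cosine at the known frequency, plus an intercept.
# A sine wave with unknown phase is a weighted sum of these two columns.
design = np.column_stack([
    np.sin(omega * t),
    np.cos(omega * t),
    np.ones_like(t, dtype=float),
])

coef, *_ = np.linalg.lstsq(design, x, rcond=None)
seasonal = design @ coef
residuals = x - seasonal

# The amplitude of the fitted wave follows from the two harmonic weights
amplitude = np.hypot(coef[0], coef[1])
print("Estimated amplitude:", amplitude)
print("Residual std:", residuals.std())
```

Further harmonics can be added as extra sine and cosine column pairs at integer multiples of omega.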

The four models described here are the building blocks of a fully-fledged time series model. As you might have gathered by now, the zero mean model represents the irreducible error of the system, and the other three models aim to transform a given time series into a zero mean model through suitable mathematical transformations. To get forecasts in terms of the original time series, the relevant inverse transformations are applied.

The upcoming chapters detail the four models discussed here. However, we have reached a point where we can summarize the generic approach of a time series analysis in the following four steps:

  • Visualize the data at different granularities of the time index to reveal long run trends and seasonal fluctuations
  • Fit a trend line to capture long run trends and plot the residuals to check for seasonality or irreducible error
  • Fit a harmonic regression model to capture seasonality
  • Plot the residuals left by the seasonality model to check for irreducible error

These steps are usually sufficient to develop mathematical models for most time series. The individual trend and seasonality models can be simple or complex depending on the original time series and the application.
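The fitting steps above can be chained on a synthetic series, as in the following rough sketch. All series parameters, the known period, and the helper harmonics are assumptions made for illustration. The plotting steps are replaced here by printed summaries, and the final lines apply the inverse transformations by adding the fitted trend and seasonality back to obtain a forecast.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
t = np.arange(240)
period = 12.0  # assumed known periodicity
omega = 2 * np.pi / period

# Synthetic series (all parameters assumed): trend + seasonality + noise
x = 0.2 * t + 5.0 + 2.0 * np.sin(omega * t) + rng.normal(0.0, 0.5, size=t.size)

# Fit a trend line and keep the residuals
w_hat, b_hat = np.polyfit(t, x, deg=1)
detrended = x - (w_hat * t + b_hat)

def harmonics(time_index):
    """Hypothetical helper: sine and cosine regressors at the known frequency."""
    return np.column_stack([np.sin(omega * time_index), np.cos(omega * time_index)])

# Fit a harmonic regression model to the detrended series
coef, *_ = np.linalg.lstsq(harmonics(t), detrended, rcond=None)

# What remains should resemble zero mean noise
noise_est = detrended - harmonics(t) @ coef
print("Residual mean:", noise_est.mean(), "Residual std:", noise_est.std())

# Forecast 12 steps ahead by adding the trend and seasonality back
t_future = np.arange(240, 252)
forecast = w_hat * t_future + b_hat + harmonics(t_future) @ coef
print("Forecast:", np.round(forecast, 2))
```

In practice the residuals at each stage would also be plotted, as the steps describe, to judge visually whether any structure is left.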

The code written in this section can be found in the Chapter_1_Models_for_Time_Series_Analysis.ipynb IPython notebook located in the code folder of this book's GitHub repository.