Transforming data into the time series format
We will start by learning how to convert a sequence of observations into time series data and visualize it. We will use the pandas library to analyze time series data, so make sure that you install it before you proceed. You can find the installation instructions at http://pandas.pydata.org/pandas-docs/stable/install.html.
How to do it…
Create a new Python file, and import the following packages:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Let's define a function that reads an input file and converts sequential observations into time-indexed data:
def convert_data_to_timeseries(input_file, column, verbose=False):
We will use a text file consisting of four columns. The first column denotes the year, the second column denotes the month, and the third and fourth columns denote data. Let's load this into a NumPy array:
    # Load the input file
    data = np.loadtxt(input_file, delimiter=',')
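To see what this call returns, here is a minimal standalone sketch. The sample values are made up for illustration and are not taken from the recipe's input file, and an in-memory buffer stands in for the file on disk:

```python
import io
import numpy as np

# Hypothetical sample in the layout described above:
# year, month, first data column, second data column
sample_text = """1940,1,10.5,3.2
1940,2,11.0,2.9
1940,3,9.8,3.5
"""

# np.loadtxt accepts a file path or any file-like object
data = np.loadtxt(io.StringIO(sample_text), delimiter=',')

print(data.shape)   # (number of rows, number of columns)
print(data[:, 0])   # year column, stored as floats
print(data[:, 2])   # first data column
```

Note that np.loadtxt returns a two-dimensional array of floats by default, so the year and month values need to be cast to integers before they can be used to build dates.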
As this is arranged...