Time-Series Analysis with Python
Time-series analysis revolves around getting familiar with a dataset and coming up with ideas and hypotheses. It can be thought of as "storytelling for data scientists" and is a critical step in machine learning, because it helps shape the tentative conclusions that we then test while training a machine learning model. Roughly speaking, the main difference between time-series analysis and machine learning is that time-series analysis does not include formal statistical modeling and inference.
While it can seem daunting and complex, time-series analysis is generally a very structured process. In this chapter, we will go through the fundamentals of dealing with time-series patterns in Python, where we can analyze time-series by interactively querying our data with a number of tools that we have at our fingertips. The process ranges from creating and loading time-series datasets to identifying trend and seasonality. We'll outline the structure of time-series analysis and its constituents, in terms of both theory and practice, by going through examples in Python.
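To give a first flavor of this workflow, here is a minimal sketch that creates a time-series, estimates its trend, and summarizes its seasonality. It assumes pandas and NumPy are installed and uses synthetic data invented purely for illustration (a linear trend plus a yearly cycle plus noise), not the chapter's air pollution dataset. The rolling mean and the monthly averages are just one simple way of looking at trend and seasonality.

```python
import numpy as np
import pandas as pd

# Synthetic daily data (an assumption for illustration only):
# a linear upward trend, a yearly cycle, and random noise.
rng = np.random.default_rng(0)
index = pd.date_range("2018-01-01", periods=3 * 365, freq="D")
days = np.arange(len(index))
values = (
    0.01 * days
    + 2.0 * np.sin(2 * np.pi * days / 365.25)
    + rng.normal(0, 0.5, len(index))
)
series = pd.Series(values, index=index, name="value")

# Trend: a centered 365-day rolling mean averages out the yearly cycle.
# The first and last half-year are NaN because the window is incomplete.
trend = series.rolling(window=365, center=True).mean()

# Seasonality: average the detrended values by calendar month.
seasonal = (series - trend).groupby(series.index.month).mean()

print(series.head())
print(seasonal.round(2))
```

In an interactive session such as a Jupyter notebook, we would typically inspect or plot `series`, `trend`, and `seasonal` at each step rather than print them.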
The main example will use a dataset of air pollution in London and Delhi. You can find this example as a Jupyter notebook in the book's GitHub repository.
We're going to cover the following topics:
- What is time-series analysis?
- Working with time-series in Python
- Understanding the variables
- Uncovering relationships between variables
- Identifying trend and seasonality
We'll start with a characterization and an attempt at a definition of time-series analysis.