Time Series Indexing

You're reading from Time Series Indexing: Implement iSAX in Python to index time series with confidence

Product type Paperback
Published in Jun 2023
Publisher Packt
ISBN-13 9781838821951
Length 248 pages
Edition 1st Edition
Author (1):
Mihalis Tsoukalos

Table of Contents (11 chapters)

Preface
1. Chapter 1: An Introduction to Time Series and the Required Python Knowledge
2. Chapter 2: Implementing SAX
3. Chapter 3: iSAX – The Required Theory
4. Chapter 4: iSAX – The Implementation
5. Chapter 5: Joining and Comparing iSAX Indexes
6. Chapter 6: Visualizing iSAX Indexes
7. Chapter 7: Using iSAX to Approximate MPdist
8. Chapter 8: Conclusions and Next Steps
9. Index
10. Other Books You May Enjoy

The tsfresh Python package

This bonus section is not directly related to the subject of the book, but it is helpful nonetheless. It covers a handy Python package called tsfresh, which can give you a good statistical overview of your time series. We are not going to present all the capabilities of tsfresh, just the ones that you can easily use to get information about your time series data. At this point, you might need to install tsfresh on your machine; keep in mind that it has lots of package dependencies.

So, we are going to compute the following properties of a dataset – in this case, a time series:

  • Mean value: The mean value of a dataset is the sum of all the values divided by the number of values.
  • Standard deviation: The standard deviation of a dataset measures the amount of variation in it. In its population form, it is the square root of the average squared distance of the values from the mean, but we usually compute it using a function from a Python package.
  • Skewness: The skewness of a dataset is a measure of the asymmetry in it. The value of skewness can be positive, negative, zero, or undefined.
  • Kurtosis: The kurtosis of a dataset is a measure of the tailedness of a dataset. In more mathematical terms, kurtosis measures the heaviness of the tail of a distribution compared to a normal distribution.
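
To make the four definitions concrete, here is a minimal sketch that computes them from central moments using only the Python standard library. The describe() function and the sample list are hypothetical names introduced for illustration. The sketch uses the plain population formulas; statistical packages may apply sample-size corrections, so their results can differ slightly from these values, especially for short series.

```python
import math


def describe(values):
    """Return mean, population standard deviation,
    moment-based skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments: the average of (x - mean) ** k
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    std = math.sqrt(m2)
    # Both ratios are undefined (division by zero) for a constant series
    skewness = m3 / m2 ** 1.5
    # Subtracting 3 makes a normal distribution score 0
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, std, skewness, excess_kurtosis


sample = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, std, skew, kurt = describe(sample)
print("Mean value:", mean)
print("Standard deviation:", std)
print("Skewness:", skew)
print("Kurtosis:", kurt)
```

A symmetric sample such as this one has zero skewness, and a negative kurtosis indicates lighter tails than a normal distribution.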

All these quantities will make much more sense once you plot your data, which is left as an exercise for you; otherwise, they will be just numbers. So, now that we know some basic statistical terms, let us present a Python script that calculates all these quantities for a time series.

The Python code for using_tsfresh.py is as follows:

#!/usr/bin/env python3
import sys

import pandas as pd
from tsfresh.feature_extraction import feature_calculators as fc


def main():
    # Expect exactly one argument: the path to a gzip-compressed time series
    if len(sys.argv) != 2:
        print("Usage:", sys.argv[0], "TS")
        sys.exit(1)

    TS1 = sys.argv[1]
    # read_csv() treats the first row as a header by default;
    # pass header=None if the file has no header row
    ts1Temp = pd.read_csv(TS1, compression='gzip')
    # Flatten the single-column DataFrame into a one-dimensional array
    ta = ts1Temp.to_numpy()
    ta = ta.reshape(len(ta))

    # Mean value
    meanValue = fc.mean(ta)
    print("Mean value:\t\t", meanValue)

    # Standard deviation
    stdDev = fc.standard_deviation(ta)
    print("Standard deviation:\t", stdDev)

    # Skewness
    skewness = fc.skewness(ta)
    print("Skewness:\t\t", skewness)

    # Kurtosis
    kurtosis = fc.kurtosis(ta)
    print("Kurtosis:\t\t", kurtosis)


if __name__ == '__main__':
    main()

The output of using_tsfresh.py when processing ts1.gz should look similar to the following:

$ ./using_tsfresh.py ts1.gz
Mean value:  15.706410001204729
Standard deviation:  8.325017802111901
Skewness:     0.008971113265160474
Kurtosis:    -1.2750042973761417
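
If you do not have a data file at hand, the following sketch creates a small gzip-compressed, single-column text file that a script like the one above can read. Everything here is an assumption made for illustration: the write_sample_series() helper, the sample_ts.gz filename, the value range, and the "value" header line, which is included because pandas.read_csv() treats the first row as column names by default. The values are random, so the statistics you get will not match the output shown above.

```python
import gzip
import os
import random
import tempfile


def write_sample_series(path, n=1000, seed=42):
    """Write n uniform random values, one per line, to a gzip text file."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    with gzip.open(path, "wt", encoding="utf-8") as f:
        # Header line (assumption): read_csv() uses it as the column name
        f.write("value\n")
        for _ in range(n):
            f.write(f"{rng.uniform(0, 30):.6f}\n")


# Hypothetical output location inside the system's temporary directory
sample_path = os.path.join(tempfile.gettempdir(), "sample_ts.gz")
write_sample_series(sample_path)
print("Wrote", sample_path)
```

You can then pass the printed path as the single command-line argument of the script.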

The tsfresh package can do many more things; we have presented just the tip of the iceberg.

The next section is about creating a histogram of a time series.
