Detecting outliers using KNN
The KNN (k-nearest neighbors) algorithm is typically used in a supervised learning setting, where prior results or outcomes (labels) are known.
It can be used to solve classification or regression problems. The idea is simple: a new data point, Y, is classified based on its nearest neighbors. If k=5, the algorithm finds the five data points (neighbors) closest to Y by distance and assigns Y the majority class among them. If three of the nearest neighbors are blue and two are red, Y is classified as blue. The k in KNN is a hyperparameter you can tune to find the optimal value.
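To make the majority-vote idea concrete, here is a minimal sketch of the supervised case using scikit-learn's KNeighborsClassifier; the toy data, labels, and the point Y are made up for illustration and are not from the text:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy training data: three points labeled "blue" and three labeled "red"
X_train = np.array([[1, 1], [1, 2], [2, 1], [6, 6], [6, 7], [7, 6]])
y_train = np.array(["blue", "blue", "blue", "red", "red", "red"])

# k=5: the new point gets the majority class of its 5 nearest neighbors
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)

Y = np.array([[2, 2]])   # the new data point to classify
print(clf.predict(Y))    # -> ['blue']: 3 blue vs. 2 red among the 5 neighbors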
In the case of outlier detection, the algorithm is used differently. Since we do not know the outliers (labels) in advance, KNN is used in an unsupervised manner. Here, the algorithm finds the k nearest neighbors of every data point and measures the average distance to them. The points with the largest average distance from the rest of the population are flagged as outliers.
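The following sketch implements this scoring directly with scikit-learn's NearestNeighbors: each point is scored by its average distance to its k nearest neighbors, and the highest-scoring points are flagged. The synthetic data, the injected outliers, and the top-1% cutoff are assumptions chosen for illustration, not values from the text:

import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(42)
X = rng.normal(0, 1, size=(300, 2))      # inliers drawn from N(0, 1)
X = np.vstack([X, [[6, 6], [-7, 5]]])    # two injected outliers

k = 5
# Query k+1 neighbors because each point is its own nearest neighbor
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
distances, _ = nn.kneighbors(X)

# Average distance to the k nearest neighbors (drop the self-distance column)
scores = distances[:, 1:].mean(axis=1)

# Flag the top 1% of scores as outliers; the cutoff is an illustrative choice
threshold = np.quantile(scores, 0.99)
outliers = np.where(scores > threshold)[0]
print(outliers)   # the injected points (indices 300 and 301) should rank highest

Libraries such as PyOD package this same scoring as a ready-made KNN detector, so in practice you rarely need to compute the distances by hand.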