Prediction-based anomaly detection using DL
We continue to explore prediction-based methods in this recipe. This time, we’ll build a forecasting model based on deep learning (DL) and use the error of its point forecasts as a reference for detecting anomalies.
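To make the idea concrete, the following is a minimal sketch of how point-forecast errors can be turned into anomaly flags. It is not the recipe's implementation: the function name `flag_anomalies`, the quantile-based threshold, and the toy numbers are all assumptions chosen for illustration, and the predicted values stand in for the output of any forecasting model.

```python
import numpy as np


def flag_anomalies(actual, predicted, quantile=0.9):
    """Flag points whose absolute forecast error exceeds a quantile threshold.

    Hypothetical helper: the quantile rule and its default value are
    illustrative assumptions, not taken from the recipe.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Absolute error of each point forecast
    errors = np.abs(actual - predicted)
    # Errors above this empirical quantile are treated as anomalous
    threshold = np.quantile(errors, quantile)
    return errors, errors > threshold


# Toy data (made up): the forecast tracks the series closely except at
# index 3, where the observed value spikes.
actual = [10.0, 11.0, 10.5, 30.0, 10.8, 11.2]
predicted = [10.2, 10.9, 10.6, 10.7, 10.9, 11.0]

errors, is_anomaly = flag_anomalies(actual, predicted, quantile=0.9)
print(is_anomaly)
```

In this toy example, only the observation at index 3 has an error above the threshold. In practice, the threshold would typically be estimated from errors on data assumed to be free of anomalies, and the right rule depends on the application.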
Getting ready
We’ll use a time series dataset about the number of taxi trips in New York City. This dataset is considered a benchmark problem for time series anomaly detection tasks. You can check the source at the following link: https://databank.illinois.edu/datasets/IDB-9610843.
Let’s start by loading the time series using pandas:
from datetime import datetime
import pandas as pd

dataset = pd.read_csv('assets/datasets/taxi/taxi_data.csv')
labels = pd.read_csv('assets/datasets/taxi/taxi_labels.csv')
dataset['ds'] = pd.Series([datetime.fromtimestamp(x) for x in dataset['timestamp']])
dataset = dataset.drop('timestamp'...