Embedding feature creation in a scikit-learn pipeline
Throughout this chapter, we discussed how to automatically create and select features from time series data with tsfresh. We then used these features to train a classification model that predicts whether an office was occupied at any given hour.
The tsfresh library offers wrapper classes around its main functions, extract_features and extract_relevant_features, to make features that have been created from time series compatible with the scikit-learn pipeline.
In this recipe, we will line up the process of creating features with tsfresh and training a logistic regression model in a scikit-learn pipeline.
How to do it...
Let’s begin by importing the necessary libraries and getting the dataset ready:
- Let’s import the required libraries and functions:
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import...