Extracting hundreds of features automatically from a time series
Time series are data points indexed in time order. Analyzing time-series sequences allows us to make various predictions. For example, sensor data can be used to predict pipeline failures, sound data can help identify music genres, health history or personal measurements such as glucose levels can indicate whether a person is sick, and, as we will show in this recipe, patterns of light usage, humidity, and CO2 levels can determine whether an office is occupied.
To train regression and classification models using traditional machine learning algorithms, such as linear regression or random forests, we require a dataset of size M x N, where M is the number of rows and N is the number of features or columns. However, with time-series data, what we have is a collection of M time series, and each time series has multiple rows indexed chronologically. To use time series in supervised learning models, each time series needs to...
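As a rough illustration of the shape problem described above, the following sketch turns a small collection of time series of different lengths into a table with one row per series and one column per feature. The toy data, the `summarize` helper, and the choice of summary statistics are all assumptions made for this example; they are not the feature set used in the recipe.

```python
from statistics import mean, stdev

# Hypothetical toy data: three short time series keyed by an id.
# Each list holds readings in chronological order; lengths differ.
series_by_id = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10.0, 10.0, 10.0],
    "c": [5.0, 3.0, 8.0, 6.0, 3.0],
}


def summarize(values):
    """Collapse one time series into a fixed-length dict of features."""
    return {
        "mean": mean(values),
        "std": stdev(values),
        "min": min(values),
        "max": max(values),
        "length": len(values),
    }


# One row per time series (M rows), one column per feature (N columns).
feature_table = {sid: summarize(vals) for sid, vals in series_by_id.items()}

for sid, row in feature_table.items():
    print(sid, row)
```

Whatever the length of the input series, every row ends up with the same N columns, which is the M x N layout that algorithms such as linear regression or random forests expect.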