Resampling and granularity
Time series data has its own set of common cleansing and preprocessing steps, and these are especially important when working with IoT data. Sensors often produce data with gaps, outliers, or missing values. That isn’t necessarily because sensors are less reliable than other data sources; the sheer frequency at which we receive data points simply gives these errors more opportunities to occur.
In the next few sections, we’ll recap some of the most common techniques we apply when preparing time series data for analysis and modeling: aligning timestamps, correcting missing values, and aggregating.
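As a rough preview of how these three steps fit together, here is a minimal sketch, assuming the readings live in a pandas DataFrame. The column names (ts, temp_c), the sample values, the 1-minute and 5-minute intervals, and the choice of linear interpolation are all hypothetical, picked only for illustration; the right choices depend on the sensor and the analysis.

```python
import pandas as pd

# Hypothetical raw sensor readings: irregular timestamps, and nothing
# at all arrives during minutes 00:01 and 00:03.
raw = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            [
                "2024-01-01 00:00:03",
                "2024-01-01 00:00:58",
                "2024-01-01 00:02:01",
                "2024-01-01 00:04:02",
                "2024-01-01 00:04:40",
            ]
        ),
        "temp_c": [20.0, 20.4, 21.0, 22.0, 22.4],
    }
).set_index("ts")

# 1. Align timestamps: bucket readings onto a regular 1-minute grid,
#    averaging any readings that land in the same bucket.
#    Empty buckets show up as NaN.
aligned = raw["temp_c"].resample("1min").mean()

# 2. Correct missing values: fill short gaps by interpolating linearly
#    in time. limit=2 leaves longer outages as NaN instead of inventing data.
filled = aligned.interpolate(method="time", limit=2)

# 3. Aggregate: roll the 1-minute series up to a coarser 5-minute granularity.
coarse = filled.resample("5min").agg(["mean", "min", "max"])

print(aligned)
print(filled)
print(coarse)
```

Printing the intermediate series makes it easy to see which values were measured and which were filled in, which is worth checking before any modeling.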
Aligning data timestamps
The most common issue I’ve run into when analyzing IoT data, specifically when plugging directly into a sensor, is irregularly spaced timestamps. For some types of analysis, this may not be a problem. Some of the methods in this chapter (mean value forecast, naïve forecast, and linear regression...