Handling longer periods of missing data
We saw some techniques for handling missing data earlier – forward and backward filling, interpolation, and so on. Those techniques usually work if there are one or two missing data points. But if a large section of data is missing, then these simple techniques fall short.
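To make the limitation concrete, here is a minimal sketch on synthetic data (not the London Smart Meters dataset). It assumes a half-hourly series with a clean daily cycle, removes two full days, and then applies forward filling and linear interpolation. The series, the gap location, and all variable names are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical half-hourly series with a daily cycle (48 readings per day, 7 days)
idx = pd.date_range("2021-01-01", periods=48 * 7, freq="30min")
y = pd.Series(np.sin(2 * np.pi * np.arange(len(idx)) / 48), index=idx)

# Simulate a long gap: two full days (96 consecutive readings) go missing
gap = slice(48 * 3, 48 * 5)
y_missing = y.copy()
y_missing.iloc[gap] = np.nan

# The simple techniques discussed earlier
ffilled = y_missing.ffill()                            # repeats the last observed value
interpolated = y_missing.interpolate(method="linear")  # straight line across the gap

# Mean absolute error inside the gap, measured against the known true values
mae_ffill = (ffilled - y).iloc[gap].abs().mean()
mae_interp = (interpolated - y).iloc[gap].abs().mean()
print(f"Forward fill MAE: {mae_ffill:.2f}")
print(f"Linear interpolation MAE: {mae_interp:.2f}")
```

In this sketch, forward filling turns the gap into a flat line and linear interpolation turns it into a straight ramp, so both discard the daily pattern that the surrounding data clearly shows. Real consumption data is noisier than a sine wave, but the same effect applies whenever the gap spans one or more seasonal cycles.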
Notebook alert
To follow along with the complete code for missing data imputation, use the 03 - Handling Missing Data (Long Gaps).ipynb notebook in the chapter02 folder.
Let’s read the parquet file for blocks 0-7 from disk:
block_df = pd.read_parquet("data/london_smart_meters/preprocessed/london_smart_meters_merged_block_0-7.parquet")
The data that we have saved is in compact form. We need to convert it into expanded form because time series data is easier to work with in that form. Since we only need a subset of the time series (to keep the demonstration fast), we will extract just one of the blocks in this file. To convert compact form...