Substituting missing values using the mice package
Finding and removing missing values in your dataset is not always a viable option, for either operational or methodological reasons. It is often preferable to simulate plausible values for the missing data and integrate those values with the observed data.
This recipe is based on the mice package by Stef van Buuren. It provides an efficient algorithm for missing value substitution based on the multiple imputation technique.
Note
Multiple imputation technique
The multiple imputation technique is a statistical solution to the problem of missing values.
The main idea behind this technique is to draw several plausible alternative values for each missing value and then, after a proper analysis of the simulated values, to populate the original dataset with synthetic data.
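As a rough illustration of this idea, the following minimal sketch draws five candidate values for each missing cell and then fills the dataset with one set of them. It assumes the nhanes example data frame that ships with the mice package, not the data used in this recipe. The choice of predictive mean matching ("pmm"), the number of imputations, and the seed are illustrative assumptions, and picking the first set of draws skips the analysis step described above.

```r
library(mice)

# nhanes is a small example data frame bundled with mice (assumption:
# used here only for illustration, it is not the recipe's own data)
data(nhanes)

# Draw m = 5 candidate values for every missing cell.
# "pmm" (predictive mean matching) and the seed are illustrative choices.
imputed <- mice(nhanes, m = 5, method = "pmm", seed = 123, printFlag = FALSE)

# Simulated values for the bmi column:
# one row per missing cell, one column per imputation
head(imputed$imp$bmi)

# Populate the original dataset with the first set of simulated values
completed <- complete(imputed, 1)

# The completed data frame should contain no missing values
anyNA(completed)
```

In practice, you would compare the five sets of draws before choosing how to fill the data, rather than simply taking the first one.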
Getting ready
This recipe requires that you install and load the mice package:
install.packages("mice")
library(mice)
For illustrative purposes, we will use the tidy_gdp data frame created...