To demonstrate how to visualize patterns of missing data, we first have to create some missing data. This is also the dataset that we will analyze later in the chapter. To showcase how to use multiple imputation in a semi-realistic scenario, we are going to create a version of the mtcars dataset with a few missing values.
Okay, let's set the seed (so that the random draws are reproducible) and create a variable to hold our new marred dataset, using the following code:
set.seed(2)
miss_mtcars <- mtcars
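As a quick illustration of what setting the seed buys us, the following minimal sketch draws the same random sample twice after resetting the seed. The names first_draw and second_draw are made up for this example and are not used elsewhere in the chapter:

```r
# Draw 7 row indices out of 32 (mtcars has 32 rows), twice,
# resetting the seed before each draw.
set.seed(2)
first_draw <- sample(1:32, 7)

set.seed(2)
second_draw <- sample(1:32, 7)

# With the same seed, both draws select exactly the same rows.
identical(first_draw, second_draw)  # TRUE
```

Without the calls to set.seed, the two draws would almost certainly differ, and the missing values created below would land on different rows each time the code is run.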
First, we are going to create seven missing values in drat (about 20 percent), five missing values in the mpg column (about 15 percent), five missing values in the cyl column, three missing values in wt (about 10 percent), and three missing values in vs:
some_rows <- sample(1:nrow(miss_mtcars), 7)
miss_mtcars$drat[some_rows] <...
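One way to sanity-check a step like this is to count the missing values in each column afterward. The sketch below is only an illustration under stated assumptions: it works on a separate copy called demo_mtcars, uses a hypothetical helper named knock_out, and assumes that missing values are coded as NA. It is not a reconstruction of the listing above.

```r
# Hypothetical helper: set n randomly chosen entries of one column to NA.
# Assumption: missing values are represented by NA.
knock_out <- function(df, column, n) {
  rows <- sample(seq_len(nrow(df)), n)  # n distinct row indices
  df[rows, column] <- NA
  df
}

set.seed(2)
demo_mtcars <- mtcars                   # separate copy for this sketch
demo_mtcars <- knock_out(demo_mtcars, "drat", 7)

# Number of missing values per column; only drat should be non-zero.
na_counts <- colSums(is.na(demo_mtcars))
na_counts["drat"]                       # 7

# Proportion of missing values in drat: 7 of 32 rows.
mean(is.na(demo_mtcars$drat))
```

Because sample draws without replacement by default, the helper always blanks out exactly n distinct rows, so the count in na_counts matches the number requested.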