The EM algorithm for the imputation of missing values
The EM algorithm is extensively used for the imputation of missing values. Implementations include (van Buuren and Groothuis-Oudshoorn 2011), (Schafer 1997), (Templ, Alfons, and Filzmoser 2011), (Raghunathan et al. 2001), and (Gelman and Hill 2011). In the following, we show how an EM algorithm works in general for this kind of problem.
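To make the general idea concrete before turning to real data, here is a minimal sketch in the spirit of an EM-type imputation. It is an illustration under stated assumptions, not the method of any of the packages cited above: the data frame `toy`, the index vector `miss`, and the linear model `y ~ x` are all hypothetical, and the scheme is simplified in that only the regression fit is updated (a full EM for a normal model would also update the variance terms).

```r
# Toy data (hypothetical, not a real data set): y depends linearly on x.
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)
toy <- data.frame(x = x, y = y)
miss <- sample(n, 10)            # rows whose y value we treat as missing
toy$y[miss] <- NA

# Initialisation: fill the missing cells with the mean of the observed values.
toy$y[miss] <- mean(toy$y, na.rm = TRUE)

# Iterate: fit the model on the completed data, then replace the
# formerly missing cells by their fitted values, until they stabilise.
for (i in 1:100) {
  old <- toy$y[miss]
  fit <- lm(y ~ x, data = toy)                        # model-fitting step
  toy$y[miss] <- predict(fit, newdata = toy[miss, ])  # update step
  if (max(abs(toy$y[miss] - old)) < 1e-8) break       # convergence check
}

coef(fit)   # coefficients at convergence
```

In this simplified setting, the iteration settles on the regression line that would be obtained from the fully observed rows alone, and the imputed values lie exactly on that line. The point of the sketch is only the alternation between fitting a model on the completed data and updating the missing cells.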
First, we need a data set to impute. We again select the sleep data:
library("MASS")
library("robustbase")
library("VIM")
data("sleep")
str(sleep)
## 'data.frame': 62 obs. of 10 variables:
##  $ BodyWgt : num 6654 1 3.38 0.92 2547 ...
##  $ BrainWgt: num 5712 6.6 44.5 5.7 4603 ...
##  $ NonD    : num NA 6.3 NA NA 2.1 9.1 15.8 5.2 10.9 8.3 ...
##  $ Dream   : num NA 2 NA NA 1.8 0.7 3.9 1 3.6 1.4 ...
##  $ Sleep   : num 3.3 8.3 12.5 16.5 3.9 9.8 19.7 6.2 14.5 9.7 ...
##  $ Span    : num 38.6 4.5 14 NA 69 27 19 30.4 28 50 ...
##  $ Gest    : num 645 42 60 25 624 180 35 392 63 230 ...
## ...