Just as you should assess outliers and extreme values in the variables you analyze, you should also assess their missing responses. For a given variable, what number or fraction of responses is missing? By what mechanism or mechanisms do values go missing? Is the missingness in a variable related to values of another variable, or perhaps to values of that same variable? Fully addressing these questions in the context of your data can be hard work, and a full discussion is beyond the scope of this book. Here, we briefly address why missing data matters and show some analyses that you can do.
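The first of these questions, the number and fraction of missing responses per variable, is easy to answer with a simple tabulation. The following is a minimal Python sketch, not a prescribed procedure. The `responses` data and the `missing_summary` helper are hypothetical, and the sketch assumes missing responses are coded as `None`.

```python
# Hypothetical survey responses; None marks a missing response (an assumption).
responses = {
    "age": [34, None, 51, 28, None, 45],
    "income": [52000, 61000, None, 48000, 75000, 58000],
    "region": ["N", "S", "S", "E", "W", "N"],
}


def missing_summary(data):
    """Return {variable: (missing_count, missing_fraction)} for each variable."""
    summary = {}
    for name, values in data.items():
        n_missing = sum(1 for v in values if v is None)
        # Guard against an empty variable to avoid dividing by zero.
        fraction = n_missing / len(values) if values else 0.0
        summary[name] = (n_missing, fraction)
    return summary


summary = missing_summary(responses)
for name, (count, fraction) in summary.items():
    print(f"{name}: {count} missing ({fraction:.1%})")
```

A tabulation like this answers only how much is missing. It says nothing about the mechanism, or about whether missingness in one variable is related to values of another.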
Why should you be concerned about missing data?
There are two reasons:
- Statistical efficiency
- Bias
Statistical efficiency has to do with the relationship between sample size and precision. If your data is a random sample from a population, then along...