Imputing in-stream mean or median
Filling missing values with the mean or median is a common approach to handling missing data. Modeler can compute and fill missing values using either the Set Globals node or the Data Audit node. Unfortunately, both of these are terminal nodes, so the user must run them as a separate step or from a script. Moreover, the values available for imputation are limited to the mean, mid-point, or (in the case of the Data Audit node) a constant.
In this recipe we will impute missing values with the median of a variable in-stream, without the use of @GLOBAL variables.
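For readers who want to see the underlying arithmetic outside Modeler, here is a minimal Python sketch of median imputation. It is not the stream built in this recipe and uses no Modeler API. The function name impute_median and the toy values are hypothetical, and None stands in for a missing value.

```python
from statistics import median
from typing import List, Optional, Sequence


def impute_median(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the median of the observed entries."""
    # Only non-missing values contribute to the median.
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = median(observed)
    # Keep observed values as they are; substitute the median for each gap.
    return [fill if v is None else v for v in values]


# Toy data, purely illustrative (not taken from the recipe's datafile).
ages = [30.0, None, 50.0, 70.0, None]
filled = impute_median(ages)
```

The median is computed once from the observed values and then applied to every gap, which is the same two-part logic the recipe carries out in-stream.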
Getting ready
This recipe uses the following files:
- Datafile: cup98lrn_reduced_vars3.sav
- Stream file: Recipe - impute missing with fixed value.str
How to do it...
To impute missing values with the median of a variable:
- Open the stream (Recipe - impute missing with fixed value.str) by going to File | Open Stream.
- Make sure the datafile points to the correct path to...