Selecting variables using the Means node
In this recipe we will identify and select variables to include as model inputs using the Means node.
Getting ready
This recipe uses the datafile cup98lrn_reduced_vars3.sav and the stream recipe_variableselection_means.str.
You will need a copy of Microsoft Excel to visualize the list of rules (optional).
How to do it...
To identify and select variables to include as model inputs using the Means node:
- Open the stream variableselection_means.str by navigating to File | Open Stream.
- Make sure the datafile path points to the correct location of cup98lrn_reduced_vars3.sav.
- Open the Means node to look at the options. Note that the grouping variable is our target variable, TARGET_B, and the test fields are all the continuous variables of interest, as shown in the following figure.
- Run the Means node by clicking on Run.
- Inside the output window, click on the Importance column twice so that the variables are sorted in descending order of Importance as shown in the following...
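
To build some intuition for what the sorted Importance column represents, the following is a minimal Python sketch, not the Means node's actual implementation. It assumes a two-valued target such as TARGET_B and ranks each continuous field by the absolute value of a pooled-variance two-sample t statistic. The field names FIELD_X and FIELD_Y, the toy records, and the helper functions are hypothetical, and the choice of a pooled-variance test is an assumption. When every field has the same number of records, a larger absolute t statistic corresponds to a smaller p-value, so the ordering is comparable to sorting by an importance score derived from the p-value.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(group_a, group_b):
    """Pooled-variance two-sample t statistic (an assumed test, for illustration)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))


def rank_fields(records, target, fields):
    """Return fields sorted by |t| between the two target groups, largest first."""
    scores = {}
    for field in fields:
        groups = {}
        for row in records:
            groups.setdefault(row[target], []).append(row[field])
        group_a, group_b = groups.values()  # assumes exactly two target values
        scores[field] = abs(t_statistic(group_a, group_b))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: FIELD_X separates the two groups, FIELD_Y does not.
records = [
    {"TARGET_B": 0, "FIELD_X": 1.0, "FIELD_Y": 5.0},
    {"TARGET_B": 0, "FIELD_X": 2.0, "FIELD_Y": 7.0},
    {"TARGET_B": 0, "FIELD_X": 3.0, "FIELD_Y": 6.0},
    {"TARGET_B": 1, "FIELD_X": 8.0, "FIELD_Y": 6.0},
    {"TARGET_B": 1, "FIELD_X": 9.0, "FIELD_Y": 5.0},
    {"TARGET_B": 1, "FIELD_X": 10.0, "FIELD_Y": 7.0},
]

ranking = rank_fields(records, "TARGET_B", ["FIELD_Y", "FIELD_X"])
print(ranking)
```

In this toy example the group means of FIELD_X differ sharply while those of FIELD_Y are identical, so FIELD_X ranks first, which mirrors how a field whose mean differs strongly between the TARGET_B groups rises to the top of the sorted output.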