Selecting variables using single-antecedent Association Rules
In this recipe, we will identify and select variables to include as model inputs using the Apriori Association Rules node, keeping the top 24 predictors that this method ranks highest. We will use the same KDD Cup 1998 data set, but this version of the data was prepared with the stream Recipe - variable selection apriori data prep.str to create quintile versions of continuous variables. The target variable is the top quintile of donation amounts: TARGET_D between $20 and $200.
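To make the idea of a single-antecedent rule concrete, the sketch below scores every rule of the form "field = value, therefore target = 1" on a toy table and then keeps the fields with the strongest rules. It is only an illustration of the concept in plain Python, not what the Apriori node does internally. The field names (last_gift_q, age_q, top_donor), the toy records, and the helper names are all hypothetical, and the measures are the usual textbook definitions.

```python
from collections import Counter


def single_antecedent_rules(rows, target, min_support=0.0):
    """Score every rule (field == value) -> (target == 1).

    rows: list of dicts mapping a field name to a categorical value,
          with the target stored as 0 or 1.
    Returns (field, value, support, confidence, lift) tuples sorted by
    confidence, highest first. Support here is antecedent support:
    the share of records in which the antecedent holds.
    """
    n = len(rows)
    base_rate = sum(r[target] for r in rows) / n
    antecedent_counts = Counter()
    hit_counts = Counter()
    for r in rows:
        for field, value in r.items():
            if field == target:
                continue
            antecedent_counts[(field, value)] += 1
            if r[target] == 1:
                hit_counts[(field, value)] += 1

    rules = []
    for (field, value), count in antecedent_counts.items():
        support = count / n
        if support < min_support:
            continue
        confidence = hit_counts[(field, value)] / count
        lift = confidence / base_rate if base_rate else float("nan")
        rules.append((field, value, support, confidence, lift))
    rules.sort(key=lambda rule: rule[3], reverse=True)
    return rules


def top_fields(rules, k):
    """Return the first k distinct fields in rule-rank order."""
    selected = []
    for field, *_ in rules:
        if field not in selected:
            selected.append(field)
        if len(selected) == k:
            break
    return selected


# Hypothetical toy records: quintile-binned inputs plus a 0/1 target flag.
rows = [
    {"last_gift_q": 5, "age_q": 1, "top_donor": 1},
    {"last_gift_q": 5, "age_q": 2, "top_donor": 1},
    {"last_gift_q": 5, "age_q": 3, "top_donor": 0},
    {"last_gift_q": 1, "age_q": 1, "top_donor": 0},
    {"last_gift_q": 1, "age_q": 2, "top_donor": 0},
    {"last_gift_q": 2, "age_q": 3, "top_donor": 0},
]

rules = single_antecedent_rules(rows, target="top_donor")
selected = top_fields(rules, k=2)
```

Ranking by confidence is one reasonable choice for this sketch; ranking by lift, or requiring a minimum support through min_support, would be equally valid ways to decide which fields to keep.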
Getting ready
This recipe uses the datafile cup98lrn_reduced_vars3_apriori.sav and the stream Recipe - variable selection apriori.str.
You will need a copy of Microsoft Excel to visualize the list of rules.
How to do it...
To identify and select variables to include as model inputs using the Apriori Association Rules node:
- Open the stream Recipe - variable selection apriori.str by navigating to File | Open Stream.
- Make sure the datafile points...