Modeling and evaluation
We will start by mining the data for the overall association rules before moving on to the rules for beer specifically. Throughout the modeling process, we will use the apriori algorithm, implemented in the appropriately named apriori() function in the arules package. The two main things that we need to specify in the function are the dataset and the parameters. For the parameters, you will need to apply judgment in specifying the minimum support, the minimum confidence, and the minimum and/or maximum number of items in an itemset. Using the item frequency plots along with trial and error, let's set the minimum support at 1 in 1,000 transactions and the minimum confidence at 90 percent. Additionally, let's set the maximum number of items in a rule to four. The following is the code to create the object that we will call rules:
> rules = apriori(Groceries, parameter = list(supp = 0.001, conf = 0.9, maxlen=4))
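To make the thresholds more concrete, the following is a minimal sketch that converts the minimum support into a transaction count and then checks the range of support and confidence among the mined rules. It assumes that the arules package is installed and that the Groceries data bundled with it is the dataset in use; the names n_trans, min_count, rules_check, and q are hypothetical and introduced only for illustration.

```r
# Sketch only: assumes arules is installed and uses its bundled Groceries data.
library(arules)
data(Groceries)

# Express the minimum support of 0.001 as a number of transactions.
n_trans   <- length(Groceries)           # number of transactions in the data
min_count <- ceiling(0.001 * n_trans)    # smallest count that meets the threshold

# Same settings as the call above, with the parameter names spelled out.
# rules_check is a hypothetical name, used so the rules object is left untouched.
rules_check <- apriori(Groceries,
                       parameter = list(support = 0.001,
                                        confidence = 0.9,
                                        maxlen = 4))

# quality() returns the interest measures for each rule as a data frame.
q <- quality(rules_check)
range(q$support)       # lowest and highest support among the rules
range(q$confidence)    # lowest and highest confidence among the rules
```

Lowering the support or confidence thresholds will generally produce more rules, so a quick look at these ranges can help when adjusting the parameters by trial and error.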
Calling the object shows how many rules the algorithm...