Summary
In this chapter, we learned how to generate frequent itemsets from a dataset using the Apriori algorithm. We then derived candidate association rules from these itemsets and described each rule by its support and confidence. We applied one additional check, an added value measure, to judge whether the proposed rules were interesting. We implemented all of these concepts using a freely available dataset of Freecode open source projects and their tags. We calculated support for single tags, then generated doubletons and tripletons that met a minimum support threshold. For each rule with one item on the right-hand side, we calculated confidence and added value. Finally, we examined the generated rules closely and used the metrics we had calculated to decide which ones were interesting.
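As a compact reminder of how these pieces fit together, here is a minimal sketch of the workflow in Python. It is not the chapter's implementation: the baskets are hypothetical toy data rather than the Freecode dataset, the function names and the `MIN_SUPPORT` threshold are invented for illustration, and added value is assumed to be the rule's confidence minus the support of its right-hand side.

```python
from itertools import combinations

# Hypothetical toy baskets: each set holds the tags of one project.
# These are NOT taken from the Freecode dataset.
BASKETS = [
    {"python", "linux", "web"},
    {"python", "linux"},
    {"python", "web"},
    {"linux", "web"},
    {"python", "linux", "web"},
]
MIN_SUPPORT = 0.4  # assumed threshold, chosen only for this example


def support(itemset):
    """Fraction of baskets that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for basket in BASKETS if itemset <= basket) / len(BASKETS)


def frequent_itemsets(items, size):
    """Return {itemset: support} for size-item combinations meeting MIN_SUPPORT."""
    result = {}
    for combo in combinations(sorted(items), size):
        s = support(combo)
        if s >= MIN_SUPPORT:
            result[frozenset(combo)] = s
    return result


def rules_from(itemset):
    """Yield (lhs, rhs, support, confidence, added_value) with one item on the right."""
    itemset = frozenset(itemset)
    for rhs in sorted(itemset):
        lhs = itemset - {rhs}
        confidence = support(itemset) / support(lhs)
        # Assumed definition: added value = confidence - support of the right-hand side.
        added_value = confidence - support({rhs})
        yield lhs, rhs, support(itemset), confidence, added_value


def apriori_up_to_three():
    """Build singletons, doubletons, and tripletons, pruning at each level."""
    all_tags = set().union(*BASKETS)
    singles = frequent_itemsets(all_tags, 1)
    # Apriori pruning: only items that survived the previous level are reused.
    kept = set().union(*singles) if singles else set()
    doubles = frequent_itemsets(kept, 2)
    kept = set().union(*doubles) if doubles else set()
    triples = frequent_itemsets(kept, 3)
    return singles, doubles, triples


if __name__ == "__main__":
    singles, doubles, triples = apriori_up_to_three()
    for itemset in list(doubles) + list(triples):
        for lhs, rhs, s, c, av in rules_from(itemset):
            print(f"{sorted(lhs)} -> {rhs}: support={s:.2f}, "
                  f"confidence={c:.2f}, added value={av:+.2f}")
```

Under this definition, a rule whose added value is near zero or negative says little beyond how common the right-hand tag already is, which is why confidence alone is not enough to call a rule interesting.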
In the next chapter, we will continue our quest to make connections between items in a dataset. However, unlike in this chapter, where we were trying to find groups of two or three items that are already...