We'll develop a simple decision tree classifier using the weka.classifiers package. For the tree itself, we'll use J48, which is Weka's implementation of the widely used C4.5 algorithm. To develop the classifier, we'll set two flags, as follows:
- -C: Sets the confidence threshold for pruning. Its default value is 0.25.
- -M: Sets the minimum number of instances per leaf. Its default value is 2.
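To make the two flags concrete, here is a minimal sketch of how they could be packed into an option array, with each flag followed by its value. The class J48OptionsSketch, its buildOptions method, and the sanity checks inside it are hypothetical and are not part of Weka; the sketch uses only the standard library. The commented-out lines show how such an array would typically be handed to a J48 instance through setOptions, assuming Weka is on the classpath.

```java
/** Hypothetical helper (not part of Weka): builds an option array for J48's -C and -M flags. */
final class J48OptionsSketch {

    /**
     * Returns the two flags as an array in which each flag is followed by its value.
     * The range checks are illustrative assumptions, not Weka's own validation.
     */
    static String[] buildOptions(float confidence, int minInstancesPerLeaf) {
        if (!(confidence > 0f)) {
            throw new IllegalArgumentException("confidence must be positive");
        }
        if (minInstancesPerLeaf < 1) {
            throw new IllegalArgumentException("minInstancesPerLeaf must be at least 1");
        }
        return new String[] {
            "-C", Float.toString(confidence),            // pruning confidence threshold
            "-M", Integer.toString(minInstancesPerLeaf)  // minimum instances per leaf
        };
    }

    public static void main(String[] args) {
        // The defaults described above: -C 0.25 and -M 2.
        String[] options = buildOptions(0.25f, 2);
        System.out.println(String.join(" ", options)); // prints: -C 0.25 -M 2

        // With Weka on the classpath, the array could then be applied like this:
        //   J48 tree = new J48();      // weka.classifiers.trees.J48
        //   tree.setOptions(options);  // declared to throw Exception
    }
}
```

Passing the flags as an array keeps the configuration in one place, which is convenient when trying several pruning settings on the same dataset.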
The same approach works for the other classifiers in Weka. To show this, we'll also develop a second classifier, Naive Bayes, by following the same steps that we use for the decision tree.
Let's get to the code and see how to do it. We'll start by importing the following...