Working with k-means clustering
Let's look at how to build a clustering model. We'll be building an unsupervised model using k-means clustering.
We will use the Instances class and the DataSource class, just as we did in previous chapters. Since we are working with clustering, we will also import the SimpleKMeans class from the weka.clusterers package, as follows:
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.clusterers.SimpleKMeans;
First, we'll read our ARFF file into a dataset and assign it to an Instances object. That is all the data preparation we have to do: in classification we also had to assign the target variable (the class attribute), but clustering is unsupervised, so we do not have to tell Weka what the class attribute is. Next, we will create an object for our k-means clustering and tell Weka how many clusters we want to create. Let's suppose that we want to create three clusters. We'll take our k-means object and call setNumClusters with 3; then, we'll build...
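To make concrete what asking for three clusters means, here is a minimal plain-Java sketch of the standard k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is not Weka's SimpleKMeans implementation. The class name KMeansSketch, the sample points, and the choice to seed the centroids with the first k points are all assumptions made for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only: a hypothetical class, not part of Weka.
public class KMeansSketch {

    /** Returns the index of the centroid closest to p (squared Euclidean distance). */
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double d = 0.0;
            for (int j = 0; j < p.length; j++) {
                double diff = p[j] - centroids[c][j];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    /** Runs k-means and returns the cluster index assigned to each point. */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        int dims = points[0].length;

        // Assumption: seed the centroids with the first k points.
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }

        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int c = nearest(points[i], centroids);
                if (c != assignment[i]) {
                    assignment[i] = c;
                    changed = true;
                }
            }
            if (!changed) {
                break; // no point moved, so the clustering has converged
            }

            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignment[i]]++;
                for (int j = 0; j < dims; j++) {
                    sums[assignment[i]][j] += points[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] > 0) {
                    for (int j = 0; j < dims; j++) {
                        centroids[c][j] = sums[c][j] / counts[c];
                    }
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Made-up sample data: three loose groups of two points each.
        double[][] points = {
            {1.0, 1.0}, {5.0, 5.0}, {9.0, 1.0},
            {1.2, 0.8}, {5.1, 4.9}, {8.8, 1.1}
        };
        int[] labels = cluster(points, 3, 100);
        System.out.println(Arrays.toString(labels)); // prints [0, 1, 2, 0, 1, 2]
    }
}
```

In this sketch the parameter k plays the role of the value passed to setNumClusters. No class attribute appears anywhere: the algorithm only ever looks at distances between points, which is why a clustering model needs no target variable.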