Here is a problem we haven't found too easy: the Pima diabetes classification. As with caret, you can build ensemble models here, so let's give that a try. I will also show how to incorporate SMOTE into the learning process instead of creating a separate dataset.
First, make sure you run the code from the beginning of this chapter to create the train and test sets. I'll pause here and let you take care of that.
Great, now let's create the training task as before:
> pima.task <- makeClassifTask(id = "pima", data = train, target = "type")
The smote() function here works a little differently from what we did before. You just specify the oversampling rate for the minority class and the number of nearest neighbors to use. We will double our minority class (Yes) based on the three nearest neighbors:
> pima.smote <- smote(pima.task, rate = 2, nn = 3)
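If you want to confirm what the rate argument did, a quick check of the class counts before and after is enough. The following is a minimal sketch, not the chapter's own code: it assumes the MASS and mlr packages are installed, and it uses Pima.tr from MASS as a stand-in for the train set built earlier. The names demo.task and demo.smote are made up for this illustration.

```r
# Sketch only: Pima.tr from MASS stands in for the chapter's train set
library(MASS)  # provides the Pima.tr data frame
library(mlr)   # makeClassifTask(), smote(), getTaskTargets()

set.seed(123)  # SMOTE picks neighbors at random, so fix the seed

# Hypothetical task and object names, used only for this check
demo.task <- makeClassifTask(id = "pima.demo", data = Pima.tr, target = "type")
demo.smote <- smote(demo.task, rate = 2, nn = 3)

# Class counts before and after oversampling
before <- table(getTaskTargets(demo.task))
after <- table(getTaskTargets(demo.smote))
print(before)
print(after)
```

With rate = 2, the Yes count should come out at twice its original value while the No count stays put. The same two table() calls work on pima.task and pima.smote.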
> str...