Chapter 7: Model Improvements
Activity 12: Perform Repeated K-Fold Cross Validation and Grid Search Optimization
Load the required packages mlbench, caret, and dplyr for the exercise:
library(mlbench)
library(dplyr)
library(caret)
Load the PimaIndiansDiabetes dataset into memory from the mlbench package:
data(PimaIndiansDiabetes)
df <- PimaIndiansDiabetes
Set the seed to 2019 for reproducibility:
set.seed(2019)
Define the k-fold validation object using the trainControl function from the caret package, setting method = "repeatedcv" instead of "cv". Add the repeats = 10 argument to trainControl to specify how many times the cross-validation is repeated:
train_control <- trainControl(method = "repeatedcv", number = 5, repeats = 10, savePredictions = TRUE, verboseIter = TRUE)
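Note that with number = 5 and repeats = 10, caret evaluates the model on 5 × 10 = 50 resamples for each candidate hyperparameter value. A quick sketch of the underlying fold structure, using caret's createMultiFolds on the diabetes outcome (assuming df and the seed are already set as above), is:

```r
# Each of the 10 repeats partitions the data into 5 folds,
# so every mtry value is assessed on 5 * 10 = 50 resamples.
folds <- createMultiFolds(df$diabetes, k = 5, times = 10)
length(folds)  # 50 resample index sets
```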
Define the grid for the random forest hyperparameter mtry as (3, 4, 5):
parameter_values <- expand.grid(mtry = c(3, 4, 5))
Fit the model using the grid values, the cross-validation object, and the random forest classifier:
model_rf_kfold <- train(diabetes ~ ., data = df, trControl = train_control, method = "rf", metric = "Accuracy", tuneGrid = parameter_values)
Study the model performance by printing the average accuracy and standard deviation of accuracy:
print(paste("Average Accuracy :", mean(model_rf_kfold$resample$Accuracy)))
print(paste("Std. Dev Accuracy :", sd(model_rf_kfold$resample$Accuracy)))
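Since dplyr is loaded for this exercise, the same summary can also be written as a pipeline over the resample data frame that caret stores on the fitted model:

```r
# Summarise the 50 per-resample accuracy values in one pipeline
model_rf_kfold$resample %>%
  summarise(mean_acc = mean(Accuracy), sd_acc = sd(Accuracy))
```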
Study the model performance by plotting the accuracy across different values of the hyperparameter:
plot(model_rf_kfold)
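Beyond the plot, you can query the fitted caret object directly: bestTune holds the winning mtry value, and results holds the mean accuracy (and its standard deviation) for each grid value:

```r
# Best hyperparameter chosen by the grid search
print(model_rf_kfold$bestTune)
# Mean accuracy and its standard deviation per mtry value
print(model_rf_kfold$results[, c("mtry", "Accuracy", "AccuracySD")])
```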
The final output is as follows:
The plot shows the cross-validated accuracy for each candidate value of mtry, demonstrating that repeated k-fold cross-validation and grid search optimization were performed on the same model.