You're reading from Applied Supervised Learning with R Use machine learning libraries of R to build models that solve business problems and predict future trends

Product type Paperback

Published in May 2019

Publisher Packt

ISBN-13 9781838556334

Length 502 pages

Edition 1st Edition

Languages

Concepts Machine Learning

Authors (2):

Jojo Moolayil

Karthik Ramasubramanian

Table of Contents (12) Chapters

Applied Supervised Learning with R

Preface

1. R for Advanced Analytics

2. Exploratory Analysis of Data

3. Introduction to Supervised Learning

4. Regression

5. Classification

6. Feature Selection and Dimensionality Reduction

7. Model Improvements

8. Model Deployment

9. Capstone Project - Based on Research Papers

Appendix

Chapter 6: Feature Selection and Dimensionality Reduction

Activity 11: Converting the CBWD Feature of the Beijing PM2.5 Dataset into One-Hot Encoded Columns

Read the Beijing PM2.5 dataset into the DataFrame PM25:
```
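# stringsAsFactors = TRUE keeps cbwd a factor (the read.csv() default before R 4.0)
PM25 <- read.csv("PRSA_data_2010.1.1-2014.12.31.csv", stringsAsFactors = TRUE)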
```
Create a variable cbwd_one_hot for storing the result of the dummyVars function with ~ cbwd as its first argument:
```
library(caret)  # provides dummyVars()
cbwd_one_hot <- dummyVars(~ cbwd, data = PM25)
```
Use the output of the predict() function on cbwd_one_hot and cast it as a DataFrame:
```
cbwd_one_hot <- data.frame(predict(cbwd_one_hot, newdata = PM25))
```
Remove the original cbwd variable from the PM25 DataFrame:
```
PM25$cbwd <- NULL
```
Using the cbind() function, add cbwd_one_hot to the PM25 DataFrame:
```
PM25 <- cbind(PM25, cbwd_one_hot)
```
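
The steps above can be tried without the CSV file or caret. The following is a minimal sketch on a toy data frame; it swaps in base R's model.matrix() for dummyVars(), and the pm2.5 and cbwd values in toy are hypothetical, made up only for illustration:
```r
# Toy stand-in for the dataset (hypothetical values, not the real Beijing data)
toy <- data.frame(
  pm2.5 = c(129, 148, 159, 181),
  cbwd  = factor(c("SE", "cv", "NW", "SE"))
)

# "- 1" drops the intercept so every factor level gets its own 0/1 column
one_hot <- as.data.frame(model.matrix(~ cbwd - 1, data = toy))

# Replace the original column with its one-hot columns, as in the steps above
toy$cbwd <- NULL
toy <- cbind(toy, one_hot)
print(toy)
```
Note that model.matrix() names the columns cbwdcv, cbwdNW, and cbwdSE, with no separator, whereas dummyVars() inserts a "." between the variable name and the level by default.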

Print the top 6 rows of PM25:
```
head(PM25)
```
The output of the previous command is as follows:
```
##   No year month day hour pm2.5 DEWP TEMP PRES   Iws Is Ir cbwd.cv cbwd.NE
## 1  1 2010     1   1    0    NA  -21  -11 1021  1.79  0  0       0       0
## 2  2 2010     1   1    1    NA  -21  -12 1020  4.92  0  0       0       0
## 3  3 2010     1   1    2    NA  -21  -11 1019  6.71  0  0       0       0
## 4  4 2010     1   1    3    NA  -21  -14 1019  9.84  0  0       0       0
## 5  5 2010     1   1    4    NA  -20  -12 1018 12.97  0  0       0       0
## 6  6 2010     1   1    5    NA  -19  -10 1017 16.10  0  0       0       0
##   cbwd.NW cbwd.SE
## 1       1       0
## 2       1       0
## 3       1       0
## 4       1       0
## 5       1       0
## 6       1       0
```

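Observe the output of the head(PM25) command: the cbwd variable has been replaced by four one-hot encoded columns, cbwd.cv, cbwd.NE, cbwd.NW, and cbwd.SE, one for each level of cbwd. In the six rows shown, only cbwd.NW is 1, so the original cbwd value for each of them was NW.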
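
One way to sanity-check a one-hot encoding is to confirm that every row contains exactly one 1, and that the original label can be recovered from it. The following is a minimal sketch, assuming only the six rows of one-hot columns printed above; head_one_hot and recovered are names introduced here for illustration:
```r
# First six rows of the one-hot columns, copied from the head(PM25) output
head_one_hot <- data.frame(
  cbwd.cv = rep(0, 6),
  cbwd.NE = rep(0, 6),
  cbwd.NW = rep(1, 6),
  cbwd.SE = rep(0, 6)
)

# In a valid one-hot encoding, each row sums to exactly 1
stopifnot(all(rowSums(head_one_hot) == 1))

# Recover the label: find the column holding the 1, then strip the "cbwd." prefix
hot_col   <- max.col(as.matrix(head_one_hot), ties.method = "first")
recovered <- sub("^cbwd\\.", "", names(head_one_hot)[hot_col])
print(recovered)
```
The same row-sum check could be run on the full PM25 DataFrame by selecting the columns whose names start with cbwd.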
