Hands-On Ensemble Learning with R

Product type: Book
Published: Jul 2018
Publisher: Packt
ISBN-13: 9781788624145
Pages: 376
Edition: 1st Edition
Author: Prabhanjan Narayanachar Tattar
Table of Contents

Contributors
Preface
1. Introduction to Ensemble Techniques
2. Bootstrapping
3. Bagging
4. Random Forests
5. The Bare Bones Boosting Algorithms
6. Boosting Refinements
7. The General Ensemble Technique
8. Ensemble Diagnostics
9. Ensembling Regression Models
10. Ensembling Survival Models
11. Ensembling Time Series Models
12. What's Next?
Bibliography
Index

Proximity plots


According to Hastie et al. (2009), "one of the advertised outputs of a random forest is a proximity plot" (see page 595). But what are proximity plots? If we have n observations in the training dataset, a proximity matrix of order n × n is created, with every entry initialized to 0. Whenever a pair of observations, such as a pair of out-of-bag (OOB) observations, falls in the same terminal node of a tree, the proximity count for that pair is increased by 1. The proximity matrix is then represented in two dimensions using multidimensional scaling, a concept beyond the scope of this chapter, and the resulting points are plotted. The proximity plot indicates which observations are close to each other from the perspective of the random forest.
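The counting rule described above can be sketched in base R. This is only an illustration under stated assumptions: the `nodes` matrix of terminal-node ids is made up (4 observations, 3 trees), every observation is counted in every tree rather than only OOB ones, and dividing the counts by the number of trees before scaling is a common normalization that the text does not spell out.

```r
# Hypothetical terminal-node assignments (not from the book's data):
# rows are observations, columns are trees, and entry [i, t] is the id of
# the terminal node that observation i reaches in tree t.
nodes <- matrix(c(1, 1, 2,
                  1, 2, 2,
                  2, 2, 1,
                  2, 1, 1),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("obs", 1:4), paste0("tree", 1:3)))

n <- nrow(nodes)

# Start from an n x n matrix of zeros.
prox_count <- matrix(0, n, n,
                     dimnames = list(rownames(nodes), rownames(nodes)))

# For each tree, add 1 to every pair that shares a terminal node.
for (t in seq_len(ncol(nodes))) {
  same_node <- outer(nodes[, t], nodes[, t], "==")
  prox_count <- prox_count + same_node
}

# Assumed normalization: turn counts into proportions of trees.
prox <- prox_count / ncol(nodes)

# Classical multidimensional scaling on the dissimilarity 1 - proximity
# gives two-dimensional coordinates that could then be plotted.
coords <- cmdscale(1 - prox, k = 2)
```

Calling `plot(coords)` on the result would place observations that often share terminal nodes near each other, which is the idea behind the proximity plot.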

When we built the random forests earlier, we did not request a proximity matrix. We therefore first refit the random forest with the proximity option, as follows:

> GC2_RF3 <- randomForest(GC2_Formula,data=GC2_Train,
+   ...
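Since the call above is cut off, here is a separate, self-contained sketch of the same idea on the built-in `iris` data. It is not the book's elided code: the dataset, the object name `rf_iris`, the seed, and `ntree = 200` are all assumptions made for illustration. It relies on the `proximity` argument of `randomForest()` and on `MDSplot()` from the same package; by default the package ties its `oob.prox` argument to `proximity`, so the counts here should come from OOB observations only.

```r
library(randomForest)  # CRAN package, assumed to be installed

set.seed(123)  # arbitrary seed, only for reproducibility of this sketch

# Fit a forest and ask for the proximity matrix to be stored.
rf_iris <- randomForest(Species ~ ., data = iris,
                        ntree = 200,       # illustrative choice
                        proximity = TRUE)

# The stored proximity matrix has one row and one column per observation.
dim(rf_iris$proximity)

# Two-dimensional multidimensional scaling plot of 1 - proximity,
# with points colored by the class labels.
MDSplot(rf_iris, fac = iris$Species, k = 2)
```

Observations of the same species would be expected to cluster together in the resulting plot, although the exact picture depends on the seed and the number of trees.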