Hands-On Ensemble Learning with R

A beginner's guide to combining the power of machine learning algorithms using ensemble techniques

Product type: Paperback
Published: Jul 2018
Publisher: Packt
ISBN-13: 9781788624145
Length: 376 pages
Edition: 1st Edition
Author: Prabhanjan Narayanachar Tattar
Table of Contents

Preface
1. Introduction to Ensemble Techniques
2. Bootstrapping
3. Bagging
4. Random Forests
5. The Bare Bones Boosting Algorithms
6. Boosting Refinements
7. The General Ensemble Technique
8. Ensemble Diagnostics
9. Ensembling Regression Models
10. Ensembling Survival Models
11. Ensembling Time Series Models
12. What's Next?
A. Bibliography
Index

What this book covers

Chapter 1, Introduction to Ensemble Techniques, will explain the need for ensemble learning and introduce important datasets, essential statistical and machine learning models, and important statistical tests. This chapter displays the spirit of the book.

Chapter 2, Bootstrapping, will introduce the two important concepts of the jackknife and the bootstrap. The chapter will help you carry out statistical inference related to unknown complex parameters. Bootstrapping of essential statistical models, such as linear regression, survival, and time series, is illustrated through R programs. More importantly, it lays the basis for the resampling techniques that form the core of ensemble methods.
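As a rough illustration of the two resampling ideas named above, the following sketch estimates the standard error of a statistic by the bootstrap (resampling with replacement) and by the jackknife (leaving one observation out at a time). It is written in Python using only the standard library so that it is self-contained; the book itself works in R. The data values, function names, and the number of resamples are all assumptions made for this example and do not come from the book.

```python
import random
import statistics

def bootstrap_se(data, statistic, n_resamples=2000, seed=0):
    """Standard error of `statistic`, estimated by resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)

def jackknife_se(data, statistic):
    """Standard error of `statistic`, estimated from leave-one-out replicates."""
    n = len(data)
    replicates = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(replicates) / n
    return ((n - 1) / n * sum((r - center) ** 2 for r in replicates)) ** 0.5

# Hypothetical sample, invented for this illustration
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]
se_boot = bootstrap_se(sample, statistics.mean)
se_jack = jackknife_se(sample, statistics.mean)
print(se_boot, se_jack)
```

For the sample mean, the jackknife estimate coincides with the textbook formula s / sqrt(n), which makes it a convenient sanity check. The same two functions accept any statistic, such as `statistics.median`, for which no simple formula is available.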

Chapter 3, Bagging, will present the first ensemble method, which uses a decision tree as the base model. The word bagging is a contraction of bootstrap aggregation. Pruning of decision trees is illustrated, and it will lay down the required foundation for later chapters. Bagging of decision trees and k-NN classifiers is illustrated in this chapter.
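The idea of bootstrap aggregation can be sketched in a few lines: fit the same base learner on many bootstrap samples and let the fitted models vote. The example below is a minimal Python sketch, not the book's R code. It assumes a one-split decision tree (a stump) on one-dimensional data as the base learner, and the toy data, function names, and ensemble size are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a one-split tree on 1-D data: predict class 1 when x > threshold."""
    candidates = [min(xs) - 1] + sorted(set(xs))

    def errors(threshold):
        return sum((x > threshold) != bool(y) for x, y in zip(xs, ys))

    return min(candidates, key=errors)

def bagging_fit(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap sample and return the list of thresholds."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds

def bagging_predict(thresholds, x):
    """Aggregate the stumps by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: class 0 for x < 5, class 1 otherwise
xs = list(range(10))
ys = [0] * 5 + [1] * 5
model = bagging_fit(xs, ys)
print(bagging_predict(model, 0), bagging_predict(model, 9))
```

An odd number of models is used so that a two-class vote cannot tie. Swapping `fit_stump` for another base learner, such as a k-NN classifier, leaves the resampling and voting steps unchanged.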

Chapter 4, Random Forests, will discuss the important ensemble extension of decision trees. Variable importance and proximity plots are two important components of random forests, and we carry out the related computations. The nuances of random forests are explained in depth. Comparison with the bagging method, missing data imputation, and clustering with random forests are also dealt with in this chapter.

Chapter 5, The Bare-Bones Boosting Algorithms, will first state the boosting algorithm. Using toy data, the chapter will then explain the detailed computations of the adaptive boosting algorithm. The gradient boosting algorithm is then illustrated for the regression problem. The use of the gbm and adabag packages shows implementations of other boosting algorithms. The chapter concludes with a comparison of the bagging, random forest, and boosting methods.
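To give a flavor of the kind of computation involved in adaptive boosting, here is a minimal Python sketch of a single reweighting round. It assumes the standard discrete AdaBoost update with labels coded as -1 and +1 and weights that sum to 1; the chapter's own toy data and exact variant are not shown in this overview, so the numbers and the function name `adaboost_round` are hypothetical.

```python
from math import exp, log

def adaboost_round(weights, y_true, y_pred):
    """One discrete AdaBoost update; labels are -1 or +1, weights sum to 1."""
    # Weighted error of the current weak learner
    err = sum(w for w, yt, yp in zip(weights, y_true, y_pred) if yt != yp)
    if not 0 < err < 0.5:
        raise ValueError("weak learner must have weighted error in (0, 0.5)")
    # Learner weight: larger when the weighted error is smaller
    alpha = 0.5 * log((1 - err) / err)
    # Misclassified points are scaled up, correct ones are scaled down
    raw = [w * exp(-alpha * yt * yp) for w, yt, yp in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return alpha, [w / total for w in raw]

# Hypothetical round: five equally weighted points, the third one misclassified
weights = [0.2] * 5
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, 1, -1, 1]
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(alpha, new_weights)
```

In this example the weighted error is 0.2, so the single misclassified point ends up carrying half of the total weight after normalization. That concentration of weight on hard cases is what forces the next weak learner to focus on them.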

Chapter 6, Boosting Refinements, will begin with an explanation of how the boosting technique works. The gradient boosting algorithm is then extended to count and survival datasets. The details of extreme gradient boosting, an implementation of the popular gradient boosting algorithm, are exhibited with clear programs. The chapter concludes with an outline of the important h2o package.

Chapter 7, The General Ensemble Technique, will study the probabilistic reasons for the success of the ensemble technique. The success of the ensemble is explained for classification and regression problems.
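One common probabilistic argument for classification ensembles is the majority-vote calculation: if the classifiers err independently and each is correct with the same probability p, the number of correct votes is binomial. The Python sketch below computes the resulting ensemble accuracy. The independence and equal-accuracy assumptions are idealizations made for this example, and the function name is hypothetical; the chapter's own treatment may differ.

```python
from math import comb

def majority_vote_accuracy(n_classifiers, p):
    """P(majority is correct) for an odd number of independent classifiers,
    each correct with probability p (binomial tail sum)."""
    if n_classifiers % 2 == 0:
        raise ValueError("use an odd number of classifiers to avoid ties")
    majority = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k) * p ** k * (1 - p) ** (n_classifiers - k)
        for k in range(majority, n_classifiers + 1)
    )

# Hypothetical accuracies, chosen for illustration
for n in (1, 3, 11, 51):
    print(n, majority_vote_accuracy(n, 0.6), majority_vote_accuracy(n, 0.4))
```

With p = 0.6, three voters already reach an accuracy of 0.648, and the figure keeps rising as voters are added. With p = 0.4 the same calculation gives 0.352 for three voters, so combining learners that are worse than chance makes matters worse.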

Chapter 8, Ensemble Diagnostics, will examine the conditions for the diversity of an ensemble. Pairwise comparisons of classifiers and overall interrater agreement measures are illustrated here.
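A pairwise comparison of two classifiers can be summarized by an agreement statistic computed from their predictions on the same observations. The Python sketch below uses Cohen's kappa, a standard chance-corrected agreement measure, as one plausible example; this overview does not say which measures the chapter uses, so the choice of kappa, the function name, and the prediction vectors are assumptions.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two classifiers' predictions."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed proportion of cases on which the two classifiers agree
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from the marginal label frequencies
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both classifiers always predict the same single label
    return (observed - expected) / (1 - expected)

# Hypothetical predictions from two classifiers on eight cases
preds_a = [1, 1, 0, 0, 1, 0, 1, 1]
preds_b = [1, 0, 0, 0, 1, 1, 1, 1]
kappa = cohens_kappa(preds_a, preds_b)
print(kappa)
```

Here the classifiers agree on 6 of 8 cases, but about 53% agreement would be expected by chance, which gives a kappa of 7/15. Lower kappa between accurate classifiers indicates more diversity, which is what an ensemble needs in order to improve on its members.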

Chapter 9, Ensembling Regression Models, will discuss in detail the use of ensemble methods in regression problems. A complex housing dataset from Kaggle is used here. The regression data is modeled with multiple base learners. Bagging, random forest, boosting, and stacking are all illustrated for the regression data.

Chapter 10, Ensembling Survival Models, is where survival data is taken up. Survival analysis concepts are developed in considerable detail, and the traditional techniques are illustrated. The machine learning method of a survival tree is introduced, and then we build the ensemble method of random survival forests for this data structure.

Chapter 11, Ensembling Time Series Models, deals with another specialized data structure in which observations are dependent on each other. The core concepts of time series and the essential related models are developed. Bagging of a specialized time series model is presented, and we conclude the chapter with an ensemble of heterogeneous time series models.

Chapter 12, What's Next?, will discuss some of the unresolved topics in ensemble learning and the scope for future work.
