Mastering Predictive Analytics with R, Second Edition: Machine learning techniques for advanced models

Product type Paperback

Published in Aug 2017

Publisher Packt

ISBN-13 9781787121393

Length 448 pages

Edition 2nd Edition

Languages

Concepts

Data Analysis

Authors (2):

James D. Miller

Rui Miguel Forte


Table of Contents (16 chapters)

Preface

1. Gearing Up for Predictive Modeling

2. Tidying Data and Measuring Performance

3. Linear Regression

4. Generalized Linear Models

5. Neural Networks

6. Support Vector Machines

7. Tree-Based Methods

8. Dimensionality Reduction

9. Ensemble Methods

10. Probabilistic Graphical Models

11. Topic Modeling

12. Recommendation Systems

13. Scaling Up

14. Deep Learning

Index

What this book covers

Chapter 1, Gearing Up for Predictive Modeling, helps you get set up before looking at individual models and case studies. It then describes the process of predictive modeling as a series of steps and introduces several fundamental distinctions.

Chapter 2, Tidying Data and Measuring Performance, covers performance metrics, learning curves, and a process for tidying data.

Chapter 3, Linear Regression, explains the classic starting point for predictive modeling. It starts from the simplest single-variable model and moves on to multiple regression, overfitting, and regularized extensions of linear regression.

Chapter 4, Generalized Linear Models, follows on from linear regression. It introduces logistic regression as a method for binary classification, extends this to multinomial logistic regression, and uses these models as a platform to present the concepts of sensitivity and specificity.

Chapter 5, Neural Networks, explains that the logistic regression model can be seen as a single-layer perceptron. This chapter discusses neural networks as an extension of this idea, covers their origins, and explores their power.

Chapter 6, Support Vector Machines, covers a method that transforms data into a different space using a kernel function and attempts to find a decision boundary that maximizes the margin between the classes.

Chapter 7, Tree-Based Methods, presents several widely used tree-based methods, such as decision trees and the well-known C5.0 algorithm. Regression trees are also covered, as are random forests, which are linked to bagging (covered in Chapter 9, Ensemble Methods). Cross-validation methods for evaluating predictors are presented in the context of these tree-based methods.

Chapter 8, Dimensionality Reduction, covers principal component analysis (PCA), independent component analysis (ICA), factor analysis, and non-negative matrix factorization.

Chapter 9, Ensemble Methods, discusses methods for combining either many predictors or multiple trained versions of the same predictor. This chapter introduces the important notions of bagging and boosting, and shows how to use the AdaBoost algorithm to improve performance on a dataset that was previously analyzed with a single classifier.

Chapter 10, Probabilistic Graphical Models, introduces the Naive Bayes classifier as the simplest graphical model, following a discussion of conditional probability and Bayes' rule. The Naive Bayes classifier is showcased in the context of sentiment analysis. Hidden Markov models are also introduced and demonstrated through the task of next-word prediction.

Chapter 11, Topic Modeling, provides step-by-step instructions for making predictions with topic models. It also demonstrates methods of dimensionality reduction to summarize and simplify the data.

Chapter 12, Recommendation Systems, explores different approaches to building recommender systems in R, using nearest-neighbor approaches, clustering, and algorithms such as collaborative filtering.

Chapter 13, Scaling Up, explains how to work with very large datasets and includes worked examples of training some of the models seen so far on such data.

Chapter 14, Deep Learning, tackles the important topic of deep learning, using examples such as word embeddings and recurrent neural networks (RNNs).