Regression Analysis with R

Product type Book
Published in Jan 2018
Publisher Packt
ISBN-13 9781788627306
Pages 422 pages
Edition 1st Edition
Author (1):
Giuseppe Ciaburro

Table of Contents (15) Chapters

Title Page
Packt Upsell
Contributors
Preface
1. Getting Started with Regression
2. Basic Concepts – Simple Linear Regression
3. More Than Just One Predictor – MLR
4. When the Response Falls into Two Categories – Logistic Regression
5. Data Preparation Using R Tools
6. Avoiding Overfitting Problems – Achieving Generalization
7. Going Further with Regression Models
8. Beyond Linearity – When Curving Is Much Better
9. Regression Analysis in Practice
Other Books You May Enjoy
Index

Index

A

  • accuracy / Customer satisfaction analysis with the multiple logistic regression
  • action of regressing / Going back to the origin of regression
  • Akaike Information Criterion (AIC) / Simple logistic regression
  • apodixis / Going back to the origin of regression
  • area under curve (AUC) / Model fitting
  • artificial neural networks (ANN) / Regression with neural networks

B

  • BAS package
    • used, for creating Bayesian linear regression / Bayesian model using BAS package
  • batch GD / Stochastic Gradient Descent
  • Bayes' theorem / Bayes' theorem
  • Bayesian Information Criterion (BIC) / Bayesian model using BAS package
  • Bayesian linear regression
    • about / Bayesian linear regression
    • probability, basic concepts / Basic concepts of probability
    • Bayes' theorem / Bayes' theorem
    • creating, with BAS package / Bayesian model using BAS package
  • binning
    • using, for discretization / Data discretization by binning
  • BLR package
    • about / The BLR package
    • BLR / The BLR package
    • sets / The BLR package
  • Boston dataset
    • random forest regression / Random forest regression with the Boston dataset
  • Box-Cox power transformation / The MASS package
  • breast cancer
    • classifying, with logistic regression / Classifying breast cancer using logistic regression

C

  • caret package / The caret package
  • caret package, functions
    • train / The caret package
    • trainControl / The caret package
    • varImp / The caret package
    • defaultSummary / The caret package
    • knnreg / The caret package
    • plotObsVsPred / The caret package
    • predict.knnreg / The caret package
  • car package
    • ANOVA / The car package
    • linear.hypothesis / The car package
    • cookd / The car package
    • outlier.test / The car package
    • durbin.watson / The car package
    • levene.test / The car package
    • ncv.test / The car package
  • categorical data
    • multiple logistic regression / Multiple logistic regression with categorical data
  • categorical variables
    • about / Categorical variables
    • nominal variables / Categorical variables
    • dichotomous variables / Categorical variables
    • ordinal variables / Categorical variables
  • certain event / Basic concepts of probability
  • Classification and Regression Tree (CART) / Regression trees
  • classification tree / Regression trees
  • Comprehensive R Archive Network (CRAN)
    • URL / Installing R
  • conditional probability / Basic concepts of probability
  • Cook's distance / Diagnostic plots
  • correlation
    • versus regression / Regression versus correlation
    • about / Association between variables – covariance and correlation
  • count data model
    • about / Count data model
    • Poisson distributions / Poisson distributions
    • Poisson regression model / Poisson regression model
    • warp breaks per loom, modeling / Modeling the number of warp breaks per loom
  • covariance / Association between variables – covariance and correlation
  • cross-validation
    • used, for overfitting detection / Overfitting detection – cross-validation
    • k-fold / Overfitting detection – cross-validation
    • Leave-one-out cross-validation (LOOCV) / Overfitting detection – cross-validation
  • Cumulative Distribution / Multivariate Adaptive Regression Splines
  • customer satisfaction analysis
    • with multiple logistic regression / Customer satisfaction analysis with the multiple logistic regression

D

  • data scaling / Scale of features
  • data wrangling
    • about / Data wrangling
    • data, viewing / A first look at data
    • datatype, modifying / Change datatype
    • empty cells, removing / Removing empty cells
    • incorrect value, replacing / Replace incorrect value
    • missing values / Missing values
    • NaN values / Treatment of NaN values
  • decision trees / Regression trees
  • deduction / Understanding regression concepts
  • Department of Transportation (DOT) / Creating a linear regression model
  • dependent / Regression versus correlation
  • diagnostic plots / Diagnostic plots
  • dimensionality reduction
    • about / Dimensionality reduction
    • principal component analysis / Principal Component Analysis
  • discretization
    • about / Discretization in R
    • by binning / Data discretization by binning
    • by histogram analysis / Data discretization by histogram analysis
  • dummy coding / Discovering different types of regression

E

  • Earth / Multivariate Adaptive Regression Splines
  • ElasticNet regression / ElasticNet regression
  • exponential family / Generalized Linear Model

F

  • feature scaling
    • about / Scale of features, Exploratory analysis
    • min-max normalization / Min–max normalization
    • z score standardization / z score standardization
  • feature selection
    • about / Feature selection
    • stepwise regression / Stepwise regression
    • regression subset selection / Regression subset selection
  • Federal Highway Administration (FHWA) / Creating a linear regression model
  • File Transfer Protocol (FTP) / Installing R
  • Fitted values / Diagnostic plots
  • Free Software Foundation (FSF) / The R environment
  • Froude number / Stepwise regression

G

  • Galton universal regression law / Going back to the origin of regression
  • Generalized Additive Model (GAM) / Generalized Additive Model
  • Generalized Cross Validation (GCV) / Multivariate Adaptive Regression Splines
  • generalized least squares (GLS) / The MASS package, Robust linear regression
  • Generalized Linear Model (GLM)
    • about / The R stats package, Generalized Linear Model, Ridge regression, Modeling the number of warp breaks per loom
    • simple logistic regression / Simple logistic regression
  • Generalized Ridge Regression (GRR) / Generalized Additive Model
  • General Public License (GPL) / The R environment
  • glmnet package, functions
    • glmnet / The glmnet package
    • glmnet.control / The glmnet package
    • predict.glmnet / The glmnet package
    • print.glmnet / The glmnet package
    • plot.glmnet / The glmnet package
    • deviance.glmnet / The glmnet package
  • globally convergent version (GRPROP) / Neural network model
  • Gradient Descent (GD) / Gradient Descent and linear regression, Gradient Descent

H

  • histogram analysis
    • using, for discretization / Data discretization by histogram analysis

I

  • impossible event / Basic concepts of probability
  • independent / Regression versus correlation
  • indicator variables / Discovering different types of regression
  • induction / Understanding regression concepts
  • inference / Understanding regression concepts
  • Integrated Development Environment (IDE) / RStudio
  • interquartile range (IQR) / Finding outliers in data
  • iteratively reweighted least squares (IRLS) / Robust linear regression

K

  • K-Nearest Neighbor (KNN) regression / The caret package

L

  • Lars package
    • about / The Lars package
    • lars / The Lars package
    • summary.lars / The Lars package
    • plot.lars / The Lars package
    • predict.lars / The Lars package
  • lasso regression / Lasso regression
  • least absolute deviations / Lasso regression
  • least absolute errors / Lasso regression
  • least squares / Lasso regression
  • least squares regression / Least squares regression
  • Leverage / Diagnostic plots
  • linear regression
    • with SGD / Linear regression with SGD
  • linear regression model
    • creating / Creating a linear regression model
    • statistical significance test / Statistical significance test
    • model results, exploring / Exploring model results
    • diagnostic plots / Diagnostic plots
    • about / Gradient Descent and linear regression, Robust linear regression
    • with SGD / Linear regression with SGD
  • linear relationships
    • searching / Searching linear relationships
  • lobules / Classifying breast cancer using logistic regression
  • log-Iteration / Linear regression with SGD
  • log-linear model / Count data model
  • log-odds / The logit model
  • logistic regression
    • about / Understanding logistic regression
    • logit model / The logit model
    • used, for classifying breast cancer / Classifying breast cancer using logistic regression
    • exploratory analysis / Exploratory analysis
    • model, fitting / Model fitting
  • logit / The logit model
  • Log Posterior Odds / Bayesian model using BAS package
  • Lowess curve / Multivariate Adaptive Regression Splines

M

  • marginal likelihood / Bayes' theorem
  • MARS model equation / Multivariate Adaptive Regression Splines
  • Mean Square Error (MSE) / Linear regression with SGD, Overfitting detection – cross-validation, Multivariate Adaptive Regression Splines, Multiple linear model fitting
  • milk ducts / Classifying breast cancer using logistic regression
  • min-max normalization / Min–max normalization
  • model
    • building / Building a model
  • model results
    • exploring / Exploring model results
  • Model Selection / Multivariate Adaptive Regression Splines
  • multicollinearity / Ridge regression
  • multinomial logistic regression / Multinomial logistic regression
  • multiple linear model
    • fitting / Multiple linear model fitting
  • Multiple Linear Regression (MLR) model
    • concepts / Multiple linear regression concepts
    • building / Building a multiple linear regression model
    • with categorical predictor / Multiple linear regression with categorical predictor
    • categorical variables / Categorical variables
    • model, building / Building a model
    • about / Bayesian linear regression
  • multiple logistic regression
    • about / Multiple logistic regression
    • customer satisfaction analysis / Customer satisfaction analysis with the multiple logistic regression
    • with customer satisfaction analysis / Customer satisfaction analysis with the multiple logistic regression
    • with categorical data / Multiple logistic regression with categorical data
  • Multivariate Adaptive Regression Splines (MARS) / Multivariate Adaptive Regression Splines
  • multivariate models / Discovering different types of regression
  • multivariate multiple regression / Discovering different types of regression

N

  • National Weather Service (NWS) / Min–max normalization
  • neural networks
    • using, for regression / Regression with neural networks
    • exploratory analysis / Exploratory analysis
    • about / Neural network model
  • nonlinear least squares / Nonlinear least squares
  • nonlinear least squares, arguments
    • about / Nonlinear least squares
    • formula / Nonlinear least squares
    • data / Nonlinear least squares
    • start / Nonlinear least squares
    • control / Nonlinear least squares
    • algorithm / Nonlinear least squares
    • trace / Nonlinear least squares
    • subset / Nonlinear least squares
    • weights / Nonlinear least squares
    • na.action / Nonlinear least squares
    • model / Nonlinear least squares
    • upper bounds / Nonlinear least squares
    • lower bounds / Nonlinear least squares
  • Normal Q-Q plot / Building a multiple linear regression model
  • Not a Number (NaN) / A first look at data
  • null hypothesis / Statistical significance test

O

  • Ordinary Least Squares (OLS) / Ridge regression, Robust linear regression, Nonlinear least squares
  • outliers
    • searching, in data / Finding outliers in data
    • about / Robust linear regression
  • overfitting
    • about / Understanding overfitting
    • detection, with cross-validation / Overfitting detection – cross-validation

P

  • partial / Multiple logistic regression
  • penalized quasi-likelihood (PQL) / The MASS package
  • perfect linear association
    • modeling / Modeling a perfect linear association
  • Poisson distributions / Poisson distributions
  • Poisson regression model / Poisson regression model
  • polynomial regression / Polynomial regression
  • Posterior Inclusion Probabilities (pip) / Bayesian model using BAS package
  • posterior probability / Bayes' theorem
  • precision / Customer satisfaction analysis with the multiple logistic regression
  • Principal Component Analysis (PCA) / Principal Component Analysis
  • principal components / Principal Component Analysis
  • prior probability / Bayes' theorem
  • probability
    • basic concepts / Basic concepts of probability

R

  • R
    • installing / Installing R
    • precompiled binary distribution, using / Using precompiled binary distribution
    • installing, on Windows / Installing on Windows
    • installing, on macOS / Installing on macOS
    • installing, on Linux / Installing on Linux
    • source code, installation / Installation from source code
  • random forest regression
    • with Boston dataset / Random forest regression with the Boston dataset
    • exploratory analysis / Exploratory analysis
    • multiple linear model, fitting / Multiple linear model fitting
    • about / Random forest regression model
  • Receiver Operator Characteristic (ROC) / Model fitting
  • regression
    • origin / Going back to the origin of regression
    • applications / Regression in the real world
    • about / Understanding regression concepts
    • versus correlation / Regression versus correlation
    • types / Discovering different types of regression
    • R packages, using / R packages for regression
    • with neural networks / Regression with neural networks
  • regression subset selection / Regression subset selection
  • regression towards mediocrity / Going back to the origin of regression
  • regression tree
    • about / Regression trees
    • splitting / Regression trees
    • pruning / Regression trees
    • tree selection / Regression trees
  • regularization
    • about / Regularization
    • ridge regression / Ridge regression
    • lasso regression / Lasso regression
    • ElasticNet regression / ElasticNet regression
  • R environment / The R environment
  • resampling
    • bootstrap resampling / Overfitting detection – cross-validation
  • Residual QQ / Multivariate Adaptive Regression Splines
  • Residuals / Diagnostic plots
  • Residual Sum of Squares (RSS) / Ridge regression, Multivariate Adaptive Regression Splines
  • Residuals vs Fitted values / Multivariate Adaptive Regression Splines
  • resilient backpropagation (RPROP) / Neural network model
  • ridge regression / Ridge regression
  • Road Casualties Great Britain (RCGB) / Ridge regression
  • Road Traffic Accidents (RTA) / Ridge regression
  • robust linear regression / Robust linear regression
  • Robust Regression Model / Robust linear regression
  • R packages
    • using, for regression / R packages for regression
    • R stats package / The R stats package
    • car package / The car package
    • caret package / The caret package
    • glmnet package / The glmnet package
    • sgd package / The sgd package
    • BLR package / The BLR package
    • Lars package / The Lars package
  • RStudio
    • about / RStudio
    • URL / RStudio

S

  • sensitivity / Model fitting
  • sgd package / The sgd package
  • sgd package, functions
    • sgd / The sgd package
    • print.sgd / The sgd package
    • predict.sgd / The sgd package
    • plot.sgd / The sgd package
  • simple logistic regression / Simple logistic regression
  • smallest absolute gradient (sag) / Neural network model
  • smallest learning rate (slr) / Neural network model
  • specificity / Customer satisfaction analysis with the multiple logistic regression, Model fitting
  • Standardized residuals / Diagnostic plots
  • standard score / z score standardization, Neural network model
  • statistical significance test / Statistical significance test
  • stepwise regression
    • about / Stepwise regression
    • forward method / Stepwise regression
    • backward method / Stepwise regression
    • stepwise method / Stepwise regression
  • stochastic / Stochastic Gradient Descent
  • Stochastic Gradient Descent (SGD) / The sgd package, Stochastic Gradient Descent
  • Support Vector Machine (SVM) / Support Vector Regression

T

  • Theoretical Quantiles / Diagnostic plots
  • TIOBE
    • URL / The R environment
  • True Negative Rate (TNR) / Customer satisfaction analysis with the multiple logistic regression, Model fitting
  • true positive rate (TPR) / Model fitting

U

  • UCI Machine Learning Repository
    • URL / Regression with neural networks

V

  • variables
    • relationships / Association between variables – covariance and correlation

Z

  • z score standardization / z score standardization, Neural network model