Measuring the impact of a feature on the outcome
For this exercise, we are fitting the training data to six different model classes: decision trees, gradient boosting trees, random forest, logistic regression, multi-layer perceptron, and Linear Discriminant Analysis (LDA). We learned about the first five in Chapter 3, Interpretation Challenges, so we will take a moment to familiarize ourselves with the last one, detailed here:
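The fitting step described above can be sketched as follows. This is a minimal illustration, not the book's actual code: the synthetic dataset, hyperparameters, and variable names are assumptions chosen purely to show the six scikit-learn estimators side by side.

```python
# Illustrative sketch: fitting the six model classes on a synthetic
# binary classification dataset (all names/settings are assumptions).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "mlp": MLPClassifier(max_iter=1000, random_state=0),
    "lda": LinearDiscriminantAnalysis(),
}

# Fit each model and record its held-out accuracy
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

Keeping the fitted estimators in a single dictionary makes it easy to loop over them later when comparing how each model class attributes importance to a feature.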
LDA: LDA is a very versatile method. It makes some of the same assumptions as linear regression about normality and homoscedasticity; however, it stems from dimensionality reduction and is closely related to the Principal Component Analysis (PCA) unsupervised method. What it does is compute the distance between the means of the different classes, called the between-class variance, and the variance within each class, called the within-class variance. It then projects the data onto a lower-dimensional space in such a way that it maximizes the distance between the class means while minimizing the within-class variance.