Summary
In this final chapter, we explored multiple linear regression, logistic regression, random forest regression, and neural network techniques applied to real-world datasets. We started with a random forest regression on the Boston dataset to predict the median value of owner-occupied homes in the test data. The random forest algorithm is based on the construction of many regression trees. Each case is passed through every tree in the forest, and each tree provides a prediction. The final forecast is then obtained by averaging the predictions of the individual regression trees. Each tree's response is therefore an estimate of the dependent variable given the predictors.
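The averaging step described above can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes a single predictor, uses depth-1 trees (stumps) fitted on bootstrap samples, and runs on synthetic data. The names `fit_stump`, `fit_forest`, and `predict` are hypothetical, and a full random forest would also grow deeper trees and sample a subset of predictors at each split.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold and two leaf means."""
    best = None
    # Skipping the smallest value keeps both sides of every split non-empty.
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # every x is identical: fall back to the overall mean
        overall = mean(ys)
        return lambda x: overall
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm


def fit_forest(xs, ys, n_trees=50, seed=0):
    """Fit each tree on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees


def predict(trees, x):
    """Final forecast: the average of the individual tree predictions."""
    return mean([tree(x) for tree in trees])


# Synthetic data (an assumption for illustration): a noisy step at x = 2.
noise = random.Random(1)
xs = [i / 10 for i in range(40)]
ys = [(1.0 if x < 2 else 3.0) + noise.uniform(-0.2, 0.2) for x in xs]

trees = fit_forest(xs, ys)
low, high = predict(trees, 0.5), predict(trees, 3.5)
```

Here `low` and `high` are the forest's forecasts on either side of the step. Each is the mean of the 50 individual tree responses for that input.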
Then, we used logistic regression to classify breast cancer cases. Logistic regression is a special case of the generalized linear model in which the link function is the logit function. This is a regression model applied in cases where the dependent variable...
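To make the role of the logit link concrete, the following sketch shows the link function, its inverse, and how a linear predictor is mapped to a probability. The function names and the coefficient values are hypothetical, chosen only for illustration. They are not taken from the chapter's fitted model.

```python
import math


def logit(p):
    """Link function: map a probability in (0, 1) to the real line (log-odds)."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Inverse link: map a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predict_proba(coefs, intercept, features):
    """Probability implied by a linear predictor under the logit link."""
    eta = intercept + sum(b * x for b, x in zip(coefs, features))
    return inverse_logit(eta)


# Hypothetical coefficients and feature values, for illustration only.
p = predict_proba(coefs=[0.8, -0.5], intercept=-1.0, features=[2.0, 1.0])
```

Because the inverse link always returns a value strictly between 0 and 1, the result can be read as a class probability and thresholded (for example at 0.5) to obtain a class label.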