Using decision trees for regression
As we have discussed in the preceding chapters, regression is a supervised learning technique. As discussed in Chapter 11, Classification Analysis, the goal of supervised learning is to take a labeled dataset (for example, a dataset that has features of houses and their sales price – the dependent variable) and distill the knowledge in this data into an artifact known as a trained model. This trained model can then be used to predict the sales prices of houses that the model has not previously seen. When the dependent variable that we are trying to predict is a continuous variable, as opposed to a discrete variable, which is the domain of classification, we are dealing with regression.
Regression – the task of distilling the information presented in real-world observations or data – is a field of machine learning that encompasses techniques far broader than the decision tree technique that is used in Elasticsearch's...