Decision trees
Decision trees are a commonly used technique in data mining for building a model that predicts the value of a target (or dependent) variable from the values of several input (or independent) variables. Several decision tree algorithms are available, differing from one another in relatively small ways. We will be using a very popular one called Classification and Regression Trees (CART). The term was introduced in 1984 by Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone as an umbrella term for the classification and regression types of decision trees. Using decision trees, we can predict either a categorical variable or a continuous variable. Depending on the type of the dependent variable, we use a regression tree (for a continuous outcome) or a classification tree (for a categorical outcome). Internally, the CART algorithm works slightly differently in the two cases. For our current exercise, we will be using regression trees. Later, we'll look...