With a solid understanding of partitioning evaluation metrics, let's practice the CART tree algorithm by hand on a toy dataset:
To begin, we decide on the first splitting point, the root, by trying out all possible values for each of the two features. We utilize the weighted_impurity function we just defined to calculate the weighted Gini Impurity for each possible combination as follows:
Gini(interest, tech) = weighted_impurity([[1, 1, 0], [0, 0, 0, 1]]) = 0.405
Gini(interest, Fashion) = weighted_impurity([[0, 0], [1, 0, 1, 0, 1]]) = 0.343
Gini(interest, Sports) = weighted_impurity([[0, 1], [1, 0, 0, 1, 0]]) = 0.486
Gini(occupation, professional) = weighted_impurity([[0, 0, 1, 0], [1, 0, 1]]) = 0.405
Gini(occupation, student) = weighted_impurity([[0, 0, 1, 0], [1, 0, 1]]) = 0.405
Gini(occupation, retired) = weighted_impurity([[1...