Decision trees are one of the oldest and more widely used methods of machine learning in commerce. What makes them popular is not only their ability to deal with more complex partitioning and segmentation (they are more flexible than linear models) but also their ability to explain how we arrived at a solution and as to "why" the outcome is predicated or classified as a class/label.
Apache Spark provides a good mix of decision tree based algorithms fully capable of taking advantage of parallelism in Spark. The implementation ranges from the straight forward Single Decision Tree (the CART type algorithm) to Ensemble Trees, such as Random Forest Trees and GBT (Gradient Boosted Tree). They all have both the variant flavors to facilitate classification (for example, categorical, such as height = short/tall) or regression (for example, continuous, such as height...