Evaluation
Unfortunately, as of version 1.5.2, the functionality for evaluating model quality in the pipeline API remains limited. Logistic regression does output a summary containing several evaluation metrics (available through the summary attribute on the trained model), but these are calculated on the training set. In general, we want to evaluate the performance of the model both on the training set and on a separate test set. We will therefore dive down to the underlying MLlib layer to access evaluation metrics.
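For reference, the training-set metrics can be read off the fitted model roughly as follows. This is a sketch only: the name trainedModel stands in for the logistic regression model fitted in the earlier sections.

```scala
import org.apache.spark.ml.classification.BinaryLogisticRegressionSummary

// `trainedModel` is assumed to be the LogisticRegressionModel fitted earlier.
val trainingSummary = trainedModel.summary

// For a binary problem, the summary can be narrowed to the binary variant,
// which exposes metrics such as the area under the ROC curve.
val binarySummary =
  trainingSummary.asInstanceOf[BinaryLogisticRegressionSummary]

// Note: this metric is computed on the *training* set.
println(binarySummary.areaUnderROC)
```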
MLlib provides a module, org.apache.spark.mllib.evaluation, with a set of classes for assessing the quality of a model. We will use the BinaryClassificationMetrics class here, since spam classification is a binary classification problem. Other evaluation classes provide metrics for multi-class models, regression models, and ranking models.
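As a sketch of how this fits together (the DataFrame and column names below are assumptions following the pipeline built in the previous section), we can extract (score, label) pairs from the transformed test set and pass them to BinaryClassificationMetrics:

```scala
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.mllib.linalg.Vector

// `testTransformed` is assumed to be the result of running the fitted
// pipeline on the held-out test set; it contains the "probability" and
// "label" columns produced by the logistic regression stage.
val scoresAndLabels = testTransformed.select("probability", "label").rdd.map {
  row =>
    // probability assigned to the positive class, paired with the true label
    (row.getAs[Vector]("probability")(1), row.getDouble(1))
}

val metrics = new BinaryClassificationMetrics(scoresAndLabels)

println(metrics.areaUnderROC()) // area under the ROC curve on the test set
val rocCurve = metrics.roc()    // RDD of (false positive rate, true positive rate)
```

The same metrics object also exposes the precision-recall curve (metrics.pr()) and the area under it, which is often more informative when the classes are imbalanced.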
As in the previous section, we will illustrate the concepts in the shell, but you will find analogous code in the ROC.scala script in the code...