Model evaluation
As mentioned before, model evaluation is built into Apache SparkML, and you'll find everything you need in the org.apache.spark.ml.evaluation package. Let's continue with our binary classification, which means we'll use org.apache.spark.ml.evaluation.BinaryClassificationEvaluator:
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.param.ParamMap

val evaluator = new BinaryClassificationEvaluator()
var evaluatorParamMap = ParamMap(evaluator.metricName -> "areaUnderROC")
var aucTraining = evaluator.evaluate(result, evaluatorParamMap)
The previous code initializes a BinaryClassificationEvaluator and tells it to calculate the areaUnderROC, one of the many possible metrics to assess the prediction performance of a machine learning algorithm.
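To get a feel for what areaUnderROC measures, here is a minimal plain-Scala sketch that does not use Spark at all. It relies on the pairwise (Mann-Whitney) reading of the metric: the fraction of positive/negative pairs in which the positive example gets the higher score, with ties counted as half. The AucSketch object, the auc function, and the toy data are hypothetical names and values made up for illustration; this is not how Spark computes the metric internally.

```scala
object AucSketch {
  // Pairwise formulation of the area under the ROC curve: the share of
  // (positive, negative) pairs in which the positive example receives the
  // higher score. Ties count as half a win.
  // Input: (score, label) pairs, with label 1.0 for positive and 0.0 for negative.
  def auc(scoresAndLabels: Seq[(Double, Double)]): Double = {
    val positives = scoresAndLabels.collect { case (score, label) if label == 1.0 => score }
    val negatives = scoresAndLabels.collect { case (score, label) if label == 0.0 => score }
    require(positives.nonEmpty && negatives.nonEmpty, "need at least one example of each class")

    // Compare every positive score with every negative score.
    val wins = for {
      p <- positives
      n <- negatives
    } yield if (p > n) 1.0 else if (p == n) 0.5 else 0.0

    wins.sum / (positives.size.toDouble * negatives.size)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical (score, label) pairs, invented purely for illustration.
    val toy = Seq((0.9, 1.0), (0.8, 0.0), (0.7, 1.0), (0.2, 0.0))
    println(auc(toy)) // 3 of the 4 pairs are ranked correctly, so this prints 0.75
  }
}
```

A value of 1.0 means every positive example is scored above every negative one, while 0.5 is what a random ranking would achieve on average. The pairwise comparison is quadratic in the number of examples, so it only suits small illustrations like this one.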
Because the actual label and the prediction are both present in a DataFrame called result, calculating this score is simple and takes just the following line of code:
var aucTraining...