Model performance
Let’s use Redshift SQL to compute a confusion matrix to evaluate the performance of the classification model. Using a confusion matrix, you can identify true positives, true negatives, false positives, and false negatives, based on which various statistical measures such as accuracy, precision, recall, sensitivity, specificity, and finally, F1 score are calculated. You can read more about the concept of the confusion matrix here: https://en.wikipedia.org/wiki/Confusion_matrix.
The following query uses a WITH
clause, which implements a common table expression in Redshift. This query has the following three parts:
- The first part is about the
SELECT
statement within theWITH
clause, where we predict customer churn and save it in memory. This dataset is namedinfer_data
. - The second part, which is below the first
SELECT
statement, readsinfer_data
and builds the confusion matrix, and these details are stored in memory in a dataset calledconfusionmatrix...