Introduction to binary classifiers
Without getting too deep into machine learning terminology, our previous example of a cancer test is what is known as a binary classifier, which means that it is trying to choose between only two options: having cancer or not having cancer. When we are dealing with binary classifiers, we can draw what are called confusion matrices, which are 2 x 2 matrices that house all four possible outcomes of our experiment.
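To make the four outcomes concrete, here is a minimal sketch of how such a matrix could be tallied by hand in Python. The function name confusion_matrix, the row and column layout (rows are the actual condition, columns are the prediction), and the six toy records are all assumptions made for illustration; they are not taken from the study.

```python
def confusion_matrix(actual, predicted):
    """Tally the four outcomes of a binary classifier.

    Both arguments are equal-length sequences of booleans, where True
    means "has the condition". The layout is an assumption of this sketch:
        [[actual no  & predicted no,  actual no  & predicted yes],
         [actual yes & predicted no,  actual yes & predicted yes]]
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        # int(False) == 0 and int(True) == 1, so the booleans index the cells
        matrix[int(a)][int(p)] += 1
    return matrix


# Hypothetical toy data (six made-up people), not the study's results
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

matrix = confusion_matrix(actual, predicted)
print(matrix)  # [[2, 1], [1, 2]]
```

Every person lands in exactly one of the four cells, so the cells always sum to the sample size.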
Let’s try some different numbers. Let’s say 165 people walked in for the study. So, our n (sample size) is 165 people. All 165 people are given the test, and whether each person actually has cancer is determined through various other means. The following confusion matrix shows us the results of this experiment:
Figure 5.8 – Confusion matrix
The matrix shows that 50 people were predicted to not have cancer and did not have it, 100 people were predicted to have cancer and actually did have it, and so on for the two cells where the prediction and the actual result disagree.
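As a quick sanity check, the two counts quoted above already tell us how many predictions were wrong in total. The short sketch below uses only the numbers stated in the text (165, 50, and 100); the variable names are assumptions chosen for readability, and the split of the wrong predictions between the two remaining cells is deliberately left out because it has to be read from the figure.

```python
n = 165                 # sample size stated in the text
correct_negative = 50   # predicted no cancer, and did not have it
correct_positive = 100  # predicted cancer, and actually had it

# Every person falls into exactly one of the four cells, so whatever is
# not a correct prediction must sit in the two "disagreement" cells.
correct = correct_negative + correct_positive
incorrect = n - correct

print(f"correct predictions:   {correct}")    # 150
print(f"incorrect predictions: {incorrect}")  # 15
```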