The Seventh of the Seven C’s - Chance
Measuring chance really means measuring how predictable the data is. While this one may sound the scariest, it is actually one of the best ways to know if the data is “good enough”. The reason is pretty simple: each of the other six C’s is measured one attribute at a time, whereas chance gauges how valuable the data is as a whole. Thankfully, it is pretty easy to measure using something called a confusion matrix, which compares what you predicted against what actually happened.
| | Event happens | Event doesn’t happen |
|---|---|---|
| You predict an event | You got it right (True positive) | You guessed wrong (False positive) |
| You predict a non-event | You guessed wrong (False negative) | You got it right (True negative) |
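To make the four cells concrete, here is a minimal Python sketch that tallies them from a list of predictions and a list of what actually happened. The function name `confusion_counts` and the sample data are made up for illustration and are not from any particular library.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Tally the four cells of a 2x2 confusion matrix.

    `actual` and `predicted` are equal-length sequences of booleans:
    True means the event happened (or was predicted to happen).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    cells = Counter()
    for happened, guessed in zip(actual, predicted):
        if guessed and happened:
            cells["true_positive"] += 1    # predicted an event, it happened
        elif guessed and not happened:
            cells["false_positive"] += 1   # predicted an event, it did not happen
        elif not guessed and happened:
            cells["false_negative"] += 1   # predicted a non-event, it happened
        else:
            cells["true_negative"] += 1    # predicted a non-event, nothing happened

    # Return all four cells, including any that stayed at zero.
    order = ("true_positive", "false_positive", "false_negative", "true_negative")
    return {name: cells[name] for name in order}


# Hypothetical example data, invented purely for illustration.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

counts = confusion_counts(actual, predicted)
print(counts)
```

Each key in the returned dictionary corresponds to one cell of the table, so the counts can be dropped straight into the matrix.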