Examining logistic regression errors with a confusion matrix
Getting ready
Import confusion_matrix and view the confusion matrix for the logistic regression model we constructed:
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_pred, labels=[1, 0])
array([[27, 27],
       [12, 88]])
We passed three arguments to the confusion_matrix function:
- y_test: The test target set
- y_pred: Our logistic regression predictions
- labels: The class labels, in the order used to index the rows and columns of the matrix; listing 1 first treats it as the positive class
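To see what the labels argument changes, here is a minimal sketch on a small made-up example. The arrays y_true_toy and y_pred_toy are hypothetical and are not taken from the diabetes data; they only illustrate how the ordering of labels rearranges the matrix.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical toy labels for illustration only (not the diabetes data)
y_true_toy = np.array([1, 1, 1, 0, 0, 0, 0])
y_pred_toy = np.array([1, 0, 1, 0, 0, 1, 0])

# Default: labels are sorted, so row/column 0 is class 0
cm_default = confusion_matrix(y_true_toy, y_pred_toy)

# labels=[1, 0]: row/column 0 is class 1, the positive class
cm_pos_first = confusion_matrix(y_true_toy, y_pred_toy, labels=[1, 0])

print(cm_default)
print(cm_pos_first)

# The two matrices hold the same counts; only the ordering differs
print(np.array_equal(cm_pos_first, cm_default[::-1, ::-1]))
```

The counts themselves do not change. Passing labels only decides which class occupies the first row and column.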
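Setting labels = [1,0] places the positive class, 1, in the first row and column and the negative class, 0, in the second. Rows correspond to the true classes and columns to the predicted classes. In the medical context, class 1 denotes a positive test for diabetes, as we found while exploring the Pima Indians diabetes dataset.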
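With this ordering, the four cells of the matrix shown above can be read off by name. The following is a small sketch that retypes the printed matrix by hand rather than recomputing it; the variable names tp, fn, fp, and tn are our own choice for illustration.

```python
import numpy as np

# The matrix printed above, retyped by hand; it was computed with labels=[1, 0]
cm = np.array([[27, 27],
               [12, 88]])

# Rows are true classes, columns are predicted classes, positive class first:
# [[true positives,  false negatives],
#  [false positives, true negatives]]
tp, fn, fp, tn = cm.ravel()

print("True positives: ", tp)   # had diabetes, predicted diabetes
print("False negatives:", fn)   # had diabetes, predicted no diabetes
print("False positives:", fp)   # no diabetes, predicted diabetes
print("True negatives: ", tn)   # no diabetes, predicted no diabetes
```

Note that this unpacking order depends on labels = [1,0]. With the default sorted labels, ravel() would instead return the counts in the order tn, fp, fn, tp.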
Here is the same confusion matrix as a pandas DataFrame: