Varying the classification threshold in logistic regression
Getting ready
Logistic regression classification rests on an underlying regression that outputs a probability for each class. We will use that fact to reduce the number of times people are sent home as not having diabetes when they actually do (false negatives). Start by calling the predict_proba() method of the estimator:
y_pred_proba = lr.predict_proba(X_test)
This yields an array of probabilities. View the array:
y_pred_proba
array([[ 0.87110309, 0.12889691],
       [ 0.83996356, 0.16003644],
       [ 0.81821721, 0.18178279],
       [ 0.73973464, 0.26026536],
       [ 0.80392034, 0.19607966], ...
In the first row, a probability of about 0.87 is assigned to class 0 and a probability of about 0.13 is assigned to class 1. Note that...
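To make the idea of varying the threshold concrete, here is a minimal sketch that applies a custom cutoff to the five rows of probabilities shown above. The helper name classify_with_threshold and the lowered threshold of 0.19 are assumptions chosen for illustration, not values from the recipe; only NumPy is used, so the sketch runs without a fitted estimator.

```python
import numpy as np

# The five rows of predicted probabilities shown above,
# laid out as [P(class 0), P(class 1)].
y_pred_proba = np.array([
    [0.87110309, 0.12889691],
    [0.83996356, 0.16003644],
    [0.81821721, 0.18178279],
    [0.73973464, 0.26026536],
    [0.80392034, 0.19607966],
])


def classify_with_threshold(proba, threshold=0.5):
    """Hypothetical helper: label a row as class 1 when P(class 1) exceeds `threshold`."""
    return (proba[:, 1] > threshold).astype(int)


# With the conventional 0.5 cutoff, every row shown is labeled class 0.
default_labels = classify_with_threshold(y_pred_proba)

# Lowering the cutoff (0.19 is an illustrative value) flags more rows as class 1,
# trading extra false positives for fewer false negatives.
lowered_labels = classify_with_threshold(y_pred_proba, threshold=0.19)

print(default_labels)
print(lowered_labels)
```

With a real fitted estimator, the same helper could be applied to the full output of lr.predict_proba(X_test); a suitable threshold would then be chosen by checking the resulting false negatives against the test labels.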