One of the nice features of logistic regression is that it gives us a coefficient for each predictor, which can tell us the relative importance of the predictor variables, or features, provided the features are on comparable scales. For categorical features, a positive sign on a feature's coefficient tells us that, when present, this feature increases the probability of a positive outcome versus the baseline. For continuous features, a positive sign tells us that an increase in the value of a feature corresponds to an increase in the probability of a positive outcome. The size of the coefficient tells us the magnitude of that effect: it is the change in the log-odds of a positive outcome, so the resulting change in probability also depends on the values of the other features.
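To make the link between a coefficient's size and the change in probability concrete, here is a minimal sketch using only the standard library. The coefficient value of 0.7 and the baseline probabilities are hypothetical, chosen purely for illustration; the helper `probability_after_shift` is not part of any library. It shows that exponentiating a coefficient gives an odds ratio, and that the same coefficient produces a different change in probability depending on where the baseline sits.

```python
import math


def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def probability_after_shift(baseline_prob, coef, delta=1.0):
    """Probability after adding coef * delta to the baseline log-odds.

    Hypothetical helper for illustration only.
    """
    log_odds = math.log(baseline_prob / (1.0 - baseline_prob))
    return sigmoid(log_odds + coef * delta)


coef = 0.7  # hypothetical coefficient, not taken from the model in the text

# exp(coef) is the multiplicative change in the odds for a one-unit increase.
odds_ratio = math.exp(coef)
print(f"odds ratio: {odds_ratio:.3f}")

# The same coefficient moves the probability by different amounts
# depending on the baseline probability.
for p in (0.1, 0.5, 0.9):
    new_p = probability_after_shift(p, coef)
    print(f"baseline {p:.1f} -> {new_p:.3f} (change {new_p - p:+.3f})")
```

The odds are always multiplied by the same factor, but the probability moves most when the baseline is near 0.5 and least when it is already close to 0 or 1.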
Let's generate the feature importances from our model, and then we can examine the impact each feature has:
fv = pd.DataFrame(X_train.columns, clf.coef_.T).reset_index()
fv.columns = ['Coef', 'Feature']
fv.sort_values...
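As a self-contained illustration of the same idea, the sketch below builds a coefficient table from made-up values and ranks the features by absolute coefficient size. The feature names and the `coef` array are hypothetical stand-ins for `X_train.columns` and `clf.coef_`; the assumption is a binary model whose coefficient array has shape `(1, n_features)`, as scikit-learn's `LogisticRegression` produces.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for X_train.columns and clf.coef_ (shape: 1 x n_features).
feature_names = ['age', 'income', 'is_member']
coef = np.array([[0.8, -1.5, 0.3]])

# Flatten the single coefficient row so each feature lines up with one value.
fv = pd.DataFrame({'Coef': coef.ravel(), 'Feature': feature_names})

# Rank by absolute size so strong negative effects are not pushed to the bottom.
order = fv['Coef'].abs().sort_values(ascending=False).index
ranked = fv.loc[order].reset_index(drop=True)
print(ranked)
```

Ranking by absolute value is a design choice: a large negative coefficient signals as strong an association as a large positive one, just in the opposite direction. The comparison across features is only meaningful when the features are on similar scales.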