One last thing worth noting is how logistic regression algorithms deal with multiclass classification. Although we interact with scikit-learn classifiers in multiclass cases the same way as in binary cases, it is worth understanding how logistic regression works in multiclass classification.
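For instance, the following minimal sketch (using the Iris dataset purely as an illustrative three-class example, not one taken from this section) shows that the scikit-learn interface is identical in the multiclass case:

```python
# A minimal sketch: the same LogisticRegression calls work for multiclass data
# as for binary data; scikit-learn handles the multiple classes internally.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)                    # a 3-class dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

clf = LogisticRegression(max_iter=1000)               # same constructor as in the binary case
clf.fit(X_train, y_train)                             # fit() and predict() are unchanged
print(clf.predict(X_test[:5]))                        # predicted class labels 0, 1, or 2
print(clf.predict_proba(X_test[:5]).sum(axis=1))      # each row of probabilities sums to 1
```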
Logistic regression for more than two classes is also called multinomial logistic regression, better known as softmax regression. As we have seen in the binary case, the model is represented by one weight vector w, and the probability of the target being 1 (the positive class) is written as follows:

$$P(y = 1 \mid x) = \frac{1}{1 + \exp(-w^{T}x)}$$
In the K-class case, the model is represented by K weight vectors, w1, w2, …, wK, and the probability of the target being class k is written as follows:

$$P(y = k \mid x) = \frac{\exp(w_{k}^{T}x)}{\sum_{j=1}^{K} \exp(w_{j}^{T}x)}$$
Note that the term $\sum_{j=1}^{K} \exp(w_{j}^{T}x)$ normalizes the probabilities $P(y = k \mid x)$ (for k from 1 to K) so that they total 1.
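To make the normalization concrete, here is a small NumPy sketch (with made-up weights and a made-up sample, used only for illustration) that evaluates the softmax probabilities and checks that they sum to 1:

```python
# A small NumPy sketch of the softmax formula above; W and x are random
# illustrative values, not parameters from the text.
import numpy as np

K, n_features = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(K, n_features))           # one weight vector w_k per class
x = rng.normal(size=n_features)                # a single feature vector

scores = W @ x                                 # w_k^T x for every class k
exp_scores = np.exp(scores - scores.max())     # subtract the max for numerical stability
probs = exp_scores / exp_scores.sum()          # divide by the normalizing sum

print(probs)        # P(y = k | x) for k = 1..K
print(probs.sum())  # 1.0, thanks to the normalizing term
```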