Multinomial logit model
In practice, outcomes (dependent variables) are often not binary but can take more than two values. Multinomial logistic regression generalizes the logit model studied in the previous section to handle such outcomes. In this section, we work through a hands-on study of the Iris data using the MNLogit class from statsmodels: https://www.statsmodels.org/dev/generated/statsmodels.discrete.discrete_model.MNLogit.html.
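To make the link to the binary logit concrete, the following is a minimal sketch of how class probabilities arise in a baseline-category multinomial logit: each non-baseline class j has its own coefficient vector, the baseline class has its coefficients fixed at zero, and the probability of class j is exp(x·β_j) divided by the sum of these terms over all classes. The function name mnlogit_probs and all coefficient values are hypothetical and chosen only for illustration; they are not estimates from any dataset.

```python
import math


def mnlogit_probs(x, betas):
    """Class probabilities under a baseline-category multinomial logit.

    x     : list of feature values (include a leading 1.0 for an intercept)
    betas : one coefficient list per non-baseline class (J - 1 lists);
            the baseline class 0 has its coefficients fixed at zero.
    """
    # Linear predictor for each class; the baseline's predictor is 0.
    scores = [0.0] + [sum(b * xi for b, xi in zip(beta, x)) for beta in betas]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical three-class example: intercept plus one feature.
x = [1.0, 2.0]
probs3 = mnlogit_probs(x, [[0.5, -0.25], [-1.0, 0.75]])

# With a single non-baseline class, the model reduces to the binary logit.
beta = [0.5, 0.25]
probs2 = mnlogit_probs(x, [beta])
sigmoid = 1.0 / (1.0 + math.exp(-(beta[0] * x[0] + beta[1] * x[1])))

print(probs3, sum(probs3))
print(probs2[1], sigmoid)
```

The probabilities always sum to one, and in the two-class case the probability of the non-baseline class coincides with the logistic function of the linear predictor, which is why the multinomial model can be read as a generalization of the logit model.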
The Iris dataset (https://archive.ics.uci.edu/ml/datasets/iris) is one of the best-known datasets in statistics and machine learning education. The independent variables are sepal length, sepal width, petal length, and petal width, all measured in cm. The dependent variable is categorical with three levels: Iris Setosa (0), Iris Versicolor (1), and Iris Virginica (2). The following Python code illustrates how to fit a multinomial logit model to this data using sklearn and statsmodels:
# import packages import...