Implementation and optimizations
scikit-learn implements the LogisticRegression class, which can solve this problem using optimized algorithms. Let's consider a toy dataset made of 500 samples:
The dots belong to class 0, while the triangles belong to class 1. To test the accuracy of our classifier right away, it's useful to split the dataset into training and test sets:
>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.25)
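The split assumes that X and Y already hold the 500 samples and their labels, but the code that builds them is not shown here. As a minimal sketch, a comparable two-dimensional binary dataset could be generated with make_classification. The generator, its parameter values, and the random_state seeds are assumptions chosen for illustration, not necessarily the settings behind the original dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Assumption: the original dataset is not shown, so we generate a comparable
# two-feature, two-class problem with 500 samples. Values are illustrative.
X, Y = make_classification(
    n_samples=500,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
    random_state=1000,
)

# Hold out 25% of the samples for testing. Fixing random_state makes the
# split repeatable across runs.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=1000
)

print(X_train.shape, X_test.shape)  # (375, 2) (125, 2)
```

With test_size=0.25, 125 of the 500 samples go to the test set and the remaining 375 are used for training.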
Now we can train the model using the default parameters:
>>> from sklearn.linear_model import LogisticRegression
>>> lr = LogisticRegression()
>>> lr.fit(X_train, Y_train)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
                   penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
                   verbose=0, warm_start=False)
>>> lr...
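As a self-contained recap of the workflow, the following sketch trains a LogisticRegression model with default parameters and measures its accuracy on the held-out set. The synthetic dataset and the seeds are assumptions made for illustration. Note also that the defaults printed in the session come from an older scikit-learn release: since version 0.22 the default solver is 'lbfgs' rather than 'liblinear', so the fitted coefficients may differ slightly between versions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assumption: an illustrative 500-sample, two-feature binary dataset.
X, Y = make_classification(
    n_samples=500,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
    random_state=1000,
)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=1000
)

# Train with the library defaults (L2 penalty, C=1.0).
lr = LogisticRegression()
lr.fit(X_train, Y_train)

# score() returns the mean accuracy on the given samples.
test_accuracy = lr.score(X_test, Y_test)
print(f"Test accuracy: {test_accuracy:.3f}")

# For a binary problem with two features, coef_ has shape (1, 2)
# and intercept_ has shape (1,).
print("Coefficients:", lr.coef_)
print("Intercept:", lr.intercept_)
```

The exact accuracy depends on the generated data and the split, so no specific value is quoted here.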