Why you should NEVER run a Logistic Regression (unless you have to) from Featured Blog Posts - Data Science Central

  • 2 min read
  • 14 Oct 2020

Hello fellow Data Science-Centralists!

I wrote a post on LinkedIn about why you should NEVER run a Logistic Regression (unless you really have to).

The main thrust is:

  • There is no theoretical reason why a least squares estimator can't work on a 0/1 outcome.
  • There are only very narrow theoretical reasons to run a logistic, and unless your problem falls into one of those categories it's not worth the time.
  • The run time of a logistic can be up to 100x longer than that of an OLS model. If you are doing v-fold cross-validation, save yourself some time.
  • The XB's are exactly the same whether you use a logistic or a linear regression. The model specification (features, feature engineering, feature selection, interaction terms) is identical, and this is what you should be focused on anyway.
  • Myth: Linear regression can only run linear models.
  • There is *one* practical reason to run a logistic: if the results are all very close to 0 or to 1, and you can't hard-code your prediction to 0 or 1 when the linear model falls outside the valid probability range, then use the logistic. For example, if you are pricing an insurance policy based on risk, you can't have a hard-coded 0.000% prediction because you can't price that correctly.
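
To make the comparison in the points above concrete, here is a minimal sketch that fits both models to the same 0/1 outcome with the same specification (an intercept plus one feature). Everything in it is an assumption for illustration rather than anything from the original post: the simulated data, the coefficients 0.3 and 1.2, and the helper names `make_data`, `fit_ols`, `fit_logit` and `clip01`. It uses only the Python standard library, so it says nothing about run time in a real statistics package.

```python
import math
import random


def make_data(n=500, seed=0):
    """Simulate one feature x and a 0/1 outcome y (illustrative data only)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        p = 1.0 / (1.0 + math.exp(-(0.3 + 1.2 * x)))  # assumed true model
        xs.append(x)
        ys.append(1 if rng.random() < p else 0)
    return xs, ys


def fit_ols(xs, ys):
    """Closed-form least squares for y = a + b*x (a linear probability model)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def fit_logit(xs, ys, iters=25):
    """Newton-Raphson for a logistic regression with intercept a and slope b."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p          # gradient w.r.t. intercept
            g1 += (y - p) * x    # gradient w.r.t. slope
            h00 += w             # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


def clip01(v):
    """Hard-code predictions that leave the [0, 1] range back to 0 or 1."""
    return min(1.0, max(0.0, v))


xs, ys = make_data()
a_ols, b_ols = fit_ols(xs, ys)
a_log, b_log = fit_logit(xs, ys)

# Same XB specification in both models; only the link differs.
lpm_pred = [clip01(a_ols + b_ols * x) for x in xs]
logit_pred = [1.0 / (1.0 + math.exp(-(a_log + b_log * x))) for x in xs]

mean_gap = sum(abs(p - q) for p, q in zip(lpm_pred, logit_pred)) / len(xs)
agree = sum((p >= 0.5) == (q >= 0.5) for p, q in zip(lpm_pred, logit_pred)) / len(xs)

print("OLS coefficients:     ", a_ols, b_ols)
print("Logistic coefficients:", a_log, b_log)
print("Mean absolute gap in predicted probability:", mean_gap)
print("Share of rows classified the same at 0.5:", agree)
```

The `clip01` step is the hard-coding mentioned in the last point. When a clipped prediction of exactly 0 or 1 is unacceptable, as in the insurance pricing case, the logistic predictions are the ones to use, since they always stay strictly between 0 and 1.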

See video here and slides here.

I think it'd be nice to start a debate on this topic!