What is generative modeling?
In statistics, generative probabilistic modeling is a large field that is usually contrasted with discriminative modeling. You may already have used discriminative modeling without calling it that, so let’s start by defining it.
Discriminative modeling
If you build a statistical model such as a regression model, you are already using the discriminative modeling approach. Let’s use logistic regression as an example: log(p / (1 - p)) = a + bX, where p = p(Y = 1 | X). The parameters a and b are to be estimated from data. Once they are estimated, the model answers the question “given the parameters a and b, what is the probability of Y when X takes the value x?”, that is, p(Y | X = x, a, b). Note that the model describes Y conditional on X; it says nothing about how X itself is distributed. The same discriminative process applies to other classification models, such as decision trees, random forests, gradient boosting, and others. Formally, in a discriminative process, we do the following:
- Assume some functional form for p(Y | X)
- Estimate parameters of p(Y | X) from the training data
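
The two steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the toy data, the helper names (sigmoid, fit_logistic, predict_proba), the learning rate, and the number of epochs are all assumptions made for the example, and plain gradient descent stands in for whatever optimizer a real library would use.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Step 2: estimate a and b by gradient descent on the average log-loss.

    lr and epochs are illustrative choices, not tuned values.
    """
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_a = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(a + b * x) - y  # predicted probability minus label
            grad_a += err
            grad_b += err * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def predict_proba(x, a, b):
    """Step 1: the assumed functional form, p(Y = 1 | X = x) = sigmoid(a + b*x)."""
    return sigmoid(a + b * x)

# Hypothetical toy data: one feature X and a binary label Y.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_logistic(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}")
print(f"p(Y = 1 | X = 0.5) = {predict_proba(0.5, a, b):.3f}")
print(f"p(Y = 1 | X = 4.0) = {predict_proba(4.0, a, b):.3f}")
```

The fitted model only ever answers questions of the form p(Y | X = x). It never models how the X values themselves arise.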