Gaussian Mixture
In Chapter 3, Introduction to Semi-Supervised Learning, we discussed the Generative Gaussian Mixture model in the context of semi-supervised learning. In this section, we're going to apply the EM algorithm to derive the formulas for the parameter updates.
Let's start by considering a dataset X, drawn from a data-generating process $p_{data}$:

$$X = \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_N\}, \quad \bar{x}_i \sim p_{data}$$
We assume that the whole distribution is generated by a weighted sum of k Gaussian distributions, so that the probability of each sample can be expressed as follows:

$$p(\bar{x}) = \sum_{j=1}^{k} w_j \cdot N(\bar{x}; \bar{\mu}_j, \Sigma_j)$$
In the previous expression, the term $w_j = P(N = j)$ is the relative weight of the jth Gaussian, while $\bar{\mu}_j$ and $\Sigma_j$ are its mean and covariance matrix. For consistency with the laws of probability, we also need to impose the following:

$$\sum_{j=1}^{k} w_j = 1 \quad \text{with} \quad w_j \geq 0 \;\; \forall j$$
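To make the notation concrete, here is a minimal sketch (with purely hypothetical values for k, the weights, the means, and the covariance matrices, which are exactly the quantities the EM algorithm will have to estimate) that evaluates the mixture density of every sample with NumPy and SciPy and checks the constraint on the weights:

import numpy as np
from scipy.stats import multivariate_normal

np.random.seed(1000)

# Hypothetical 2D dataset standing in for X (in a real case, it comes from p_data)
X = np.random.normal(loc=0.0, scale=1.5, size=(200, 2))

# Hypothetical parameters for k = 2 Gaussians
weights = np.array([0.4, 0.6])                          # w_j = P(N = j)
means = [np.array([-1.0, 0.0]), np.array([1.5, 0.5])]   # mu_j
covariances = [2.0 * np.eye(2), 0.5 * np.eye(2)]        # Sigma_j

# Consistency with the laws of probability: the weights must sum to 1
assert np.isclose(weights.sum(), 1.0)

# p(x) = sum_j w_j * N(x; mu_j, Sigma_j), evaluated for every sample in X
p_x = np.zeros(X.shape[0])
for w, mu, cov in zip(weights, means, covariances):
    p_x += w * multivariate_normal.pdf(X, mean=mu, cov=cov)

print(p_x[:5])

The log-likelihood of the whole dataset is then np.sum(np.log(p_x)): a logarithm of a sum of Gaussian terms, which is exactly what makes the direct maximization discussed next so cumbersome.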
Unfortunately, if we try to maximize the likelihood directly, we need to manage the logarithm of a sum over the k Gaussians, and the procedure becomes very complex. However, we have learned that it's possible to use latent variables as helpers whenever...