Summary
In this chapter, we presented the EM algorithm and the reasons that justify its application in many statistical learning contexts. We also discussed the fundamental role of hidden (latent) variables in deriving an expression that is easier to maximize (the Q function).
We applied the EM algorithm to solve a simple parameter estimation problem and then to derive the Gaussian Mixture estimation formulas. We also showed how to employ the scikit-learn implementation instead of writing the whole procedure from scratch (as in Chapter 3, Introduction to Semi-Supervised Learning).
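As a reminder of how compact the scikit-learn route is, here is a minimal sketch of fitting a Gaussian Mixture with the library's EM-based `GaussianMixture` estimator; the synthetic dataset and all parameter values (two components, the chosen means and scales) are illustrative assumptions, not from this chapter's examples.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(1000)

# Synthetic data: two well-separated Gaussian blobs in 2D
X = np.concatenate([
    rng.normal(loc=-2.0, scale=0.5, size=(200, 2)),
    rng.normal(loc=3.0, scale=1.0, size=(200, 2)),
])

# GaussianMixture runs the EM algorithm internally:
# E-step computes responsibilities, M-step updates means,
# covariances, and mixing weights until convergence
gm = GaussianMixture(n_components=2, max_iter=100, random_state=1000)
gm.fit(X)

print(gm.means_)    # estimated component means
print(gm.weights_)  # estimated mixing coefficients (sum to 1)
```

After fitting, `predict(X)` assigns each sample to its most probable component, while `predict_proba(X)` exposes the per-sample responsibilities computed in the E-step.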
In the next chapter, we are going to introduce and analyze three different approaches to component extraction: Factor Analysis, PCA, and FastICA.