Summary
In this chapter, we discussed what generative modeling is and how it fits into the landscape of more familiar machine learning methods, using probability theory and Bayes’ theorem to describe how these models approach prediction: whereas a discriminative model learns the conditional distribution p(y | x) directly, a generative model learns p(x | y) (or the joint distribution p(x, y)) and inverts it with Bayes’ theorem. We reviewed use cases for generative learning, both for specific kinds of data and for general prediction tasks. As we saw, text and images are the two major forms of data to which these models are applied. For images, the major model families we will discuss are VAEs, GANs, and related algorithms. For text, the dominant models are transformer architectures such as Llama, GPT, and BERT. Finally, we examined some of the specialized challenges that arise in building these models.
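To make the generative-versus-discriminative contrast concrete, here is a minimal sketch (not from the chapter; the data and function names are illustrative) of a generative classifier on toy 1-D data: we fit class-conditional Gaussians p(x | y) plus a prior p(y), then recover p(y | x), the quantity a discriminative model would estimate directly, via Bayes’ theorem.

```python
import numpy as np

# Toy 1-D data: two classes drawn from Gaussians with different means
# (purely illustrative; any assumptions here are our own).
rng = np.random.default_rng(0)
x0 = rng.normal(loc=-2.0, scale=1.0, size=200)  # samples from class 0
x1 = rng.normal(loc=+2.0, scale=1.0, size=200)  # samples from class 1

def fit_gaussian(x):
    """Estimate the mean and variance of p(x | y) for one class."""
    return x.mean(), x.var()

mu0, var0 = fit_gaussian(x0)
mu1, var1 = fit_gaussian(x1)
prior0 = prior1 = 0.5  # assume equal class priors p(y)

def gaussian_pdf(x, mu, var):
    """Density of a univariate Gaussian."""
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def posterior_class1(x):
    """Bayes' theorem: p(y=1 | x) is proportional to p(x | y=1) p(y=1)."""
    l0 = gaussian_pdf(x, mu0, var0) * prior0
    l1 = gaussian_pdf(x, mu1, var1) * prior1
    return l1 / (l0 + l1)

print(posterior_class1(2.0))   # close to 1: x lies in the class-1 region
print(posterior_class1(-2.0))  # close to 0: x lies in the class-0 region
```

The generative route models how each class produces data and then inverts; a discriminative model such as logistic regression would instead fit the decision boundary p(y | x) directly, which is the opposition the chapter describes.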