Summary
Combining two concepts, Markov processes and latent variables (or states), can be overwhelming at times. The implementation of the hidden Markov model, in particular, is challenging for engineers with limited exposure to dynamic programming techniques.
In this chapter, you learned about Markov processes, the generative HMM that maximizes the joint probability, p(X, Y), and the discriminative CRF that maximizes the logarithm of the conditional probability, p(Y|X).
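To make the generative view concrete, the sketch below computes the joint probability p(X, Y) of an observation sequence X and a state sequence Y under a small HMM. The two states, three observation symbols, and all probability values are illustrative assumptions, not taken from the chapter:

```python
# Minimal sketch: joint probability p(X, Y) under a hypothetical two-state HMM.
# States, observation symbols, and all probabilities below are made-up examples.

pi = {"Rainy": 0.6, "Sunny": 0.4}                        # initial state distribution
A = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},              # state transition probabilities
     "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
B = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},  # emission probabilities
     "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def joint_probability(states, observations):
    """p(X, Y) = pi(y1) b(y1, x1) * prod over t of a(y_{t-1}, y_t) b(y_t, x_t)."""
    p = pi[states[0]] * B[states[0]][observations[0]]
    for prev, curr, obs in zip(states, states[1:], observations[1:]):
        p *= A[prev][curr] * B[curr][obs]
    return p

print(joint_probability(["Rainy", "Rainy", "Sunny"], ["clean", "shop", "walk"]))
```

The product factorizes exactly as the HMM's independence assumptions dictate: one initial-state term, then one transition term and one emission term per step.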
Markov decision processes also underpin reinforcement learning; see Chapter 15, Reinforcement Learning.
HMM is a special form of Bayesian network: it requires the observations to be conditionally independent given the hidden states. Although restrictive, this conditional independence prerequisite makes the HMM easy to understand and validate, which is not the case for CRF. As a side note, recurrent neural networks are an alternative to HMMs for predicting states given a sequence of observations.
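Predicting the most likely state sequence given observations is exactly the dynamic programming task alluded to above; the standard technique is the Viterbi algorithm. The sketch below uses a hypothetical two-state weather model (all parameters are illustrative assumptions):

```python
# Minimal Viterbi sketch: most likely hidden state sequence given observations.
# The model parameters below are illustrative assumptions, not from the chapter.

states = ["Rainy", "Sunny"]
pi = {"Rainy": 0.6, "Sunny": 0.4}
A = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
     "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
B = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
     "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def viterbi(observations):
    # delta[s]: probability of the best path ending in state s;
    # paths[s]: the state sequence achieving delta[s]
    delta = {s: pi[s] * B[s][observations[0]] for s in states}
    paths = {s: [s] for s in states}
    for obs in observations[1:]:
        new_delta, new_paths = {}, {}
        for s in states:
            # pick the predecessor maximizing the path probability into s
            prev = max(states, key=lambda p: delta[p] * A[p][s])
            new_delta[s] = delta[prev] * A[prev][s] * B[s][obs]
            new_paths[s] = paths[prev] + [s]
        delta, paths = new_delta, new_paths
    best = max(states, key=lambda s: delta[s])
    return paths[best]

print(viterbi(["walk", "shop", "clean"]))
```

Because each step keeps only the best path into each state, the search runs in time linear in the sequence length rather than exponential in it.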
The conditional random fields estimate...