Higher-order discrete Markov processes
First-order discrete Markov processes are extremely flexible, but there are situations where the assumption that the transition probabilities depend only on the current state is unrealistic. Often what happens next depends on more than just the immediately preceding state. Take a sequence of words, for example: it would be a crude model that claimed the probability of the next word in this sentence depended only on the immediately preceding word – see point 2 in the Notes and further reading section at the end of the chapter.
Can we improve upon this simple assumption? Can we make our transition probabilities depend upon longer stretches of history? Yes, we can. The simplest extension is to let the transition probabilities depend not only on the current state but also on the preceding one, so that the probability of a state depends upon the previous two states. For obvious...
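As a minimal sketch of this idea, the following Python snippet conditions the next state on the pair of the two most recent states. The vocabulary and transition probabilities here are made up purely for illustration; they are not from the text.

```python
import random

# Illustrative (made-up) second-order transition probabilities over words.
# Each key is the pair (previous state, current state); each value gives
# the distribution over the next state.
transitions = {
    ("the", "cat"): {"sat": 0.7, "ran": 0.3},
    ("cat", "sat"): {"on": 1.0},
    ("cat", "ran"): {"away": 1.0},
    ("sat", "on"):  {"the": 1.0},
    ("on", "the"):  {"mat": 0.8, "cat": 0.2},
}

def step(prev, curr, rng):
    """Sample the next state given the previous two states."""
    dist = transitions[(prev, curr)]
    states, probs = zip(*dist.items())
    return rng.choices(states, weights=probs, k=1)[0]

def generate(prev, curr, n, rng):
    """Generate up to n further states from the starting pair (prev, curr),
    stopping early if the chain reaches a pair with no defined transitions."""
    out = [prev, curr]
    for _ in range(n):
        out.append(step(out[-2], out[-1], rng))
        if (out[-2], out[-1]) not in transitions:
            break
    return out

seq = generate("the", "cat", 4, random.Random(0))
print(seq)
```

The only change from the first-order case is the state the sampler conditions on: a pair of states rather than a single one, which is exactly what it means for the probability of a state to depend on the previous two states.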