Performance considerations
Data scientists should take the computational cost into account when evaluating a sequential data model. The time complexity of decoding and evaluating the canonical forms of the hidden Markov model with N states and T observations is O(N²T). Training an HMM with the Baum-Welch algorithm is O(N²TM), where M is the number of iterations.
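The O(N²T) cost is easy to see in the decoding step: the Viterbi algorithm scans all N×N state transitions at each of the T observations. A minimal sketch in Python (the model parameters below are illustrative, not from the text):

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Viterbi decoding of the most likely state path.
    N states, T observations -> O(N^2 * T) time: the inner loop over
    j scans the N incoming transitions for each of the N states, and
    this is repeated for each of the T observations."""
    N, T = A.shape[0], len(obs)
    delta = np.zeros((T, N))            # best log-probability per state
    psi = np.zeros((T, N), dtype=int)   # back-pointers
    delta[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):               # T steps ...
        for j in range(N):              # ... each scanning N x N transitions
            scores = delta[t - 1] + np.log(A[:, j])
            psi[t, j] = np.argmax(scores)
            delta[t, j] = scores[psi[t, j]] + np.log(B[j, obs[t]])
    path = [int(np.argmax(delta[-1]))]  # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# Toy 2-state, 3-symbol model (hypothetical values for illustration)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi(pi, A, B, [0, 1, 2]))     # → [0, 0, 1]
```

The forward-backward passes used in evaluation and in each Baum-Welch iteration have the same N²T inner structure, which is why M iterations of training cost O(N²TM).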
There are several options to improve the performance of HMM:
Avoid unnecessary multiplications by zero in the emission probability matrix, either by using sparse matrices or by tracking the null entries.
Train the HMM on the most relevant subset of the training data. This technique can be particularly effective for tagging words and bags of words in natural language processing.
The training of linear chain conditional random fields is implemented with the same dynamic programming techniques as the HMM implementation (Viterbi and forward-backward passes). Its time complexity for training on T data sequences with N labels (or expected outcome...