Creating n-grams from a list
An n-gram is a sequence of n items that occur adjacently. For example, in the sequence of numbers [1, 2, 5, 3, 2], one possible 3-gram is [5, 3, 2].
n-grams are useful for building probability tables that predict the next item from the items that precede it. In this recipe, we will create all possible n-grams from a list of items. A Markov chain can be trained from the n-grams this recipe computes.
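To make the link to probability tables concrete, here is a minimal sketch of how n-gram output could be turned into next-item counts. It reuses the ngram definition from this recipe; the nextCounts helper and its use of Data.Map are assumptions made for illustration and are not part of the original recipe. It also assumes n is at least 1.

```haskell
import qualified Data.Map as Map

-- Same definition as the ngram function in this recipe.
ngram :: Int -> [a] -> [[a]]
ngram n xs
  | n <= length xs = take n xs : ngram n (drop 1 xs)
  | otherwise      = []

-- Hypothetical helper (not from the recipe): for every n-gram, treat the
-- first n-1 items as the context and the last item as the item that follows,
-- then count how often each follower appears after each context.
-- Assumes n >= 1, since init and last fail on empty n-grams.
nextCounts :: Ord a => Int -> [a] -> Map.Map [a] (Map.Map a Int)
nextCounts n xs =
  Map.fromListWith (Map.unionWith (+))
    [ (init g, Map.singleton (last g) 1) | g <- ngram n xs ]

main :: IO ()
main = print $ Map.lookup "a" (nextCounts 2 "abracadabra")
```

With 2-grams over "abracadabra", the context "a" is followed by 'b' twice and by 'c' and 'd' once each. Dividing each count by the total for its context would give the transition probabilities of a simple Markov chain.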
How to do it…
Define the n-gram function as follows to produce all possible n-grams from a list:
ngram :: Int -> [a] -> [[a]]
ngram n xs
  | n <= length xs = take n xs : ngram n (drop 1 xs)
  | otherwise      = []
Test it out on a sample list as follows:
main = print $ ngram 3 "hello world"
The printed 3-grams are as follows:
["hel","ell","llo","lo ","o w"," wo","wor","orl","rld"]
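The ngram function in this recipe calls length on the remaining list at every step, so it rescans the list repeatedly and never terminates on an infinite list. The following is one possible alternative, sketched with tails from Data.List. The name ngram' and the choice to return an empty list for non-positive n are assumptions for illustration, not part of the recipe.

```haskell
import Data.List (tails)

-- Hypothetical alternative (not from the recipe): take the first n items of
-- every suffix and stop at the first window that is shorter than n.
ngram' :: Int -> [a] -> [[a]]
ngram' n xs
  | n <= 0    = []  -- assumed behaviour for non-positive n
  | otherwise = takeWhile ((== n) . length) (map (take n) (tails xs))

main :: IO ()
main = do
  print $ ngram' 3 "hello world"
  -- Works lazily, so a prefix of the n-grams of an infinite list is available.
  print $ take 2 (ngram' 2 [1 :: Int ..])
```

Each window is checked with length, but only on a list of at most n items, so the input itself is never fully traversed just to test the stopping condition.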