Using a Markov chain to generate text
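A Markov chain is a model that predicts the next state of a system from its current state alone. We can train a Markov chain on a corpus of text and then generate new text by following its state transitions, picking each next state at random according to the frequencies observed in the corpus.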
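To make the idea concrete, here is a minimal sketch of an order-n chain written with only Data.Map from the containers package. It is not how the markov-chain library is implemented; the names Table, buildTable, and walk are hypothetical, and the list of integers passed to walk stands in for a random number generator.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (tails)

-- Hypothetical names for illustration; not part of the markov-chain library.
-- A table maps each run of n consecutive elements to every element that
-- was observed directly after it in the training data.
type Table a = Map.Map [a] [a]

-- Slide a window of n elements over the input and record what follows it.
-- Successors are stored most-recent-first; duplicates are kept so that
-- frequent successors are picked more often.
buildTable :: Ord a => Int -> [a] -> Table a
buildTable n xs =
  Map.fromListWith (++)
    [ (key, [next]) | t <- tails xs, (key, next : _) <- [splitAt n t] ]

-- Starting from a state, pick a successor using the next number from the
-- caller-supplied stream, emit it, and slide the window forward by one.
walk :: Ord a => Table a -> [a] -> [Int] -> [a]
walk table state (r : rs) =
  case Map.lookup state table of
    Just succs@(_ : _) ->
      let next = succs !! (r `mod` length succs)
      in next : walk table (drop 1 state ++ [next]) rs
    _ -> []  -- dead end: this state was never seen with a successor
walk _ _ [] = []

main :: IO ()
main = do
  let table = buildTable 2 "the theory then thawed"
  print (Map.toList table)
  putStrLn (take 20 (walk table "th" (cycle [0, 1, 2])))
```

Storing successors as a plain list keeps the sketch short; a table of counts per successor would use less memory on a large corpus.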
A graphical representation of a chain is shown in the following figure:
Getting ready
Install the markov-chain library using cabal as follows:
$ cabal install markov-chain
Download a big corpus of text, and name it big.txt. In this recipe, we will be using the text downloaded from http://norvig.com/big.txt.
How to do it…
Import the following packages:
import Data.MarkovChain
import System.Random (mkStdGen)
Train a Markov chain on a big input of text and then run it as follows:
main = do
  rawText <- readFile "big.txt"
  let g = mkStdGen 100
  putStrLn $ "Character by character: \n"
  putStrLn $ take 100 $ run 3 rawText 0 g
  putStrLn $ "\nWord by word: \n"
  putStrLn $ unwords $ take 100 $ run...
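The arguments to run can be easier to see on a tiny in-memory corpus. The sketch below assumes the markov-chain and random packages are installed; the corpus string, the seed, and the name sample are made up for illustration. The arguments to run are, in order, the size of the prediction context, the training sequence, the index in the training sequence at which the walk starts, and a random generator.

```haskell
import Data.MarkovChain (run)
import System.Random (mkStdGen)

-- Illustrative training text (an assumption, not the recipe's big.txt).
corpus :: String
corpus = "the cat sat on the mat. the rat sat on the cat. "

-- 50 characters generated with a context of 2 characters,
-- starting the walk at index 0 of the corpus, with a fixed seed.
sample :: String
sample = take 50 (run 2 corpus 0 (mkStdGen 42))

main :: IO ()
main = do
  putStrLn sample
  -- The same call works on a list of words instead of characters.
  putStrLn (unwords (take 10 (run 1 (words corpus) 0 (mkStdGen 42))))
```

Because the seed is fixed, repeated runs print the same text. A larger context tends to reproduce longer fragments of the corpus verbatim, while a smaller one produces more scrambled output.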