Using Directed Acyclic Word Graphs
A Directed Acyclic Word Graph (DAWG) supports very fast lookups in a large corpus of strings while using very little space. Imagine compressing every word in a dictionary into a DAWG so that word lookups stay efficient. It is a powerful data structure that comes in handy when dealing with a large corpus of words. Steve Hanov's blog post gives a very nice introduction to DAWGs: http://stevehanov.ca/blog/index.php?id=115.
We can use this recipe to incorporate a DAWG in our code.
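To make the idea of sharing structure between words concrete, here is a minimal sketch of a plain trie built on Data.Map from the containers package. It is not a DAWG: a trie shares only common prefixes, whereas a DAWG also merges identical suffixes, which is where the extra space savings come from. The names Trie, insertWord, member, and fromWords are hypothetical and are not part of the dawg package.

```haskell
import qualified Data.Map as M

-- A plain trie node: a flag marking the end of a stored word,
-- plus one child per possible next character.
data Trie = Trie
  { endsWord :: Bool
  , children :: M.Map Char Trie
  }

emptyTrie :: Trie
emptyTrie = Trie False M.empty

-- Insert a word one character at a time, reusing any existing prefix.
insertWord :: String -> Trie -> Trie
insertWord []     t = t { endsWord = True }
insertWord (c:cs) t = t { children = M.insert c child' (children t) }
  where
    child  = M.findWithDefault emptyTrie c (children t)
    child' = insertWord cs child

-- A word is a member only if its path exists and ends on a flagged node.
member :: String -> Trie -> Bool
member []     t = endsWord t
member (c:cs) t = maybe False (member cs) (M.lookup c (children t))

-- Build a trie from a list of words.
fromWords :: [String] -> Trie
fromWords = foldr insertWord emptyTrie
```

Loaded in GHCi, member "tap" (fromWords ["tap", "taps", "top", "tops"]) evaluates to True, while member "ta" on the same trie evaluates to False because no stored word ends there. In this trie the trailing "s" of "taps" and "tops" is stored twice; a DAWG would store that shared suffix once.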
Getting ready
Install the DAWG package using cabal:
$ cabal install dawg
How to do it...
Create a new file named Main.hs and insert the following code:
Import the following modules:
import qualified Data.DAWG.Static as D
import Network.HTTP (simpleHTTP, getRequest, getResponseBody)
import Data.Char (toLower, isAlphaNum, isSpace)
import Data.Maybe (isJust)
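The Data.Char imports suggest that the raw text is cleaned up before it is stored. As a rough sketch of what that cleanup could look like, the helper below keeps only alphanumeric and whitespace characters, lower-cases them, and splits the result into words. The name getWords is an assumption made for illustration; the recipe's own helper may differ.

```haskell
import Data.Char (toLower, isAlphaNum, isSpace)

-- Hypothetical helper (the name is an assumption): drop punctuation,
-- lower-case what remains, then split on whitespace.
getWords :: String -> [String]
getWords = words . map toLower . filter keep
  where
    keep c = isAlphaNum c || isSpace c
```

For example, getWords "Hello, World! It's 2024." yields ["hello","world","its","2024"]. Note that removing the apostrophe turns "It's" into "its", so this simple filter trades some accuracy for simplicity.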
In main, download a large corpus of text to store:
main = do
  let url...