Haskell Data Analysis cookbook

Explore intuitive data analysis techniques and powerful machine learning methods using over 130 practical recipes

Product type: Paperback
Published: Jun 2014
ISBN-13: 9781783286331
Length: 334 pages
Edition: 1st Edition
Author: Nishant Shukla

Examining a JSON file with the aeson package

JavaScript Object Notation (JSON) is a way to represent key-value pairs in plain text. The format was originally described in RFC 4627 (http://www.ietf.org/rfc/rfc4627), which has since been superseded by RFC 8259.

In this recipe, we will parse a JSON description of a person. We often encounter JSON in the APIs of web applications.

Getting ready

Install the aeson library from Hackage using Cabal.

Prepare an input.json file representing data about a mathematician, such as the one in the following code snippet:

$ cat input.json

{"name":"Gauss", "nationality":"German", "born":1777, "died":1855}

We will be parsing this JSON and representing it as a usable data type in Haskell.

How to do it...

  1. Use the OverloadedStrings language extension so that string literals such as "name" can be used as the JSON keys that aeson expects, as shown in the following line of code:
    {-# LANGUAGE OverloadedStrings #-}
  2. Import aeson as well as some helper functions as follows:
    import Data.Aeson
    import Control.Applicative
    import qualified Data.ByteString.Lazy as B
  3. Create the data type corresponding to the JSON structure, as shown in the following code:
    data Mathematician = Mathematician 
                         { name :: String
                         , nationality :: String
                         , born :: Int
                         , died :: Maybe Int
                         } 
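  4. Provide a FromJSON instance by implementing the parseJSON function, as shown in the following code snippet:
    instance FromJSON Mathematician where
      parseJSON (Object v) = Mathematician
                             <$> (v .: "name")
                             <*> (v .: "nationality")
                             <*> (v .: "born")
                             <*> (v .:? "died")
      -- Any JSON value that is not an object is rejected instead of
      -- causing a pattern match failure.
      parseJSON _ = fail "expected a JSON object"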
  5. Define and implement main as follows:
    main :: IO ()
    main = do
  6. Read the input and decode the JSON, as shown in the following code snippet:
      input <- B.readFile "input.json"
    
      let mm = decode input :: Maybe Mathematician
    
      case mm of
        Nothing -> putStrLn "error parsing JSON"
        Just m -> putStrLn (greet m)
  7. Define the greet function used by main, which builds a sentence from the parsed data, as follows:
    greet :: Mathematician -> String
    greet m = (show.name) m ++ 
              " was born in the year " ++ 
              (show.born) m
  8. We can run the code to see the following output:
    $ runhaskell Main.hs
    
    "Gauss" was born in the year 1777
    

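The steps above can be assembled into a single file along the following lines. This is a minimal sketch rather than the book's exact listing: sampleInput is a hypothetical inline copy of input.json, added so that the program runs without the file, and the catch-all parseJSON case and the derived Show and Eq instances are additions for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import qualified Data.ByteString.Lazy.Char8 as B

data Mathematician = Mathematician
  { name        :: String
  , nationality :: String
  , born        :: Int
  , died        :: Maybe Int
  } deriving (Show, Eq)

instance FromJSON Mathematician where
  parseJSON (Object v) = Mathematician
                         <$> (v .: "name")
                         <*> (v .: "nationality")
                         <*> (v .: "born")
                         <*> (v .:? "died")
  -- Reject anything that is not a JSON object.
  parseJSON _ = fail "expected a JSON object"

-- Hypothetical inline copy of input.json (assumption: same content as the file).
sampleInput :: B.ByteString
sampleInput = "{\"name\":\"Gauss\", \"nationality\":\"German\", \"born\":1777, \"died\":1855}"

-- show adds quotation marks around the name, as in the recipe's output.
greet :: Mathematician -> String
greet m = show (name m) ++ " was born in the year " ++ show (born m)

main :: IO ()
main =
  case (decode sampleInput :: Maybe Mathematician) of
    Nothing -> putStrLn "error parsing JSON"
    Just m  -> putStrLn (greet m)
```

To read from disk as in the recipe, bind the result of B.readFile "input.json" inside main and pass it to decode in place of sampleInput.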
How it works...

Aeson takes care of the complications in representing JSON. It turns structured text into native Haskell values that we can use directly. In this recipe, we use the .: and .:? functions provided by the Data.Aeson module.

Aeson represents object keys as Text and reads its input as a ByteString, rather than using String for either, so it is very helpful to tell the compiler that characters between quotation marks should be treated as the proper data type. This is done in the first line of the code, which enables the OverloadedStrings language extension.
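As a small illustration of what the extension does, the sketch below gives the same literal three different types, each picked by its type signature. The names asString, asText, and asBytes are hypothetical and exist only for this example; it assumes the text and bytestring packages that ship with GHC.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Text as T

-- The same literal syntax; the type signature decides what it becomes.
asString :: String
asString = "Gauss"

asText :: T.Text
asText = "Gauss"

asBytes :: BL.ByteString
asBytes = "Gauss"

main :: IO ()
main = do
  putStrLn asString
  putStrLn (T.unpack asText)
  BL.putStrLn asBytes
```

Without OverloadedStrings, only the first definition would compile, and the other two would need explicit conversions such as T.pack and BL.pack.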

Tip

Language extensions such as OverloadedStrings are currently supported only by the Glasgow Haskell Compiler (GHC).

We use the decode function provided by Aeson to transform a lazy ByteString into a value of our data type. It has the type FromJSON a => B.ByteString -> Maybe a, so it returns Nothing when the input cannot be parsed. Our Mathematician data type must have an instance of the FromJSON typeclass to be used with this function. Fortunately, the only function required to implement FromJSON is parseJSON. The syntax used in this recipe for implementing parseJSON may look a little strange because it uses the applicative operators <$> and <*>, a more advanced Haskell topic, to feed each parsed field to the Mathematician constructor in order.
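Because decode is polymorphic in its result, the type we ask for selects the parser. The following sketch decodes the same input into two built-in types and also uses eitherDecode, a variant in Data.Aeson that returns an error message instead of Nothing. The names years, asMap, asList, and asListOrError are hypothetical and chosen only for this illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (decode, eitherDecode)
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Map as Map

-- A small JSON object used as input (illustrative data).
years :: BL.ByteString
years = "{\"born\":1777, \"died\":1855}"

-- Asking for a Map succeeds, because the input is a JSON object.
asMap :: Maybe (Map.Map String Int)
asMap = decode years

-- Asking for a list fails, because the input is not a JSON array.
asList :: Maybe [Int]
asList = decode years

-- eitherDecode reports why decoding failed instead of returning Nothing.
asListOrError :: Either String [Int]
asListOrError = eitherDecode years

main :: IO ()
main = do
  print asMap
  print asList
  either (putStrLn . ("decode failed: " ++)) print asListOrError
```

When a parse error needs to be reported to the user, eitherDecode is usually the more convenient choice, since the Left value carries a description of the failure.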

The .: function takes two arguments, an Object and a key (Text in the aeson versions available when this recipe was written), and returns a Parser a. As per the documentation, it retrieves the value associated with the given key of an object, and the parser fails if the key is missing or the value cannot be converted to the requested type. We use it for keys that must exist in the JSON document. The .:? function also retrieves the value associated with the given key, but the key is not mandatory: it returns a Parser (Maybe a) that yields Nothing when the key is absent or its value is null. So, we use .:? for optional key-value pairs in a JSON document.
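The difference between the two operators can be seen by running small field parsers on their own. The sketch below is an illustration under stated assumptions: bornYear, diedYear, nationalityOrUnknown, and runOn are hypothetical helpers, and the "unknown" default is an arbitrary choice. It also uses .!=, which Data.Aeson provides for supplying a default to an optional field, and parseMaybe from Data.Aeson.Types to run a parser.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import Data.Aeson.Types (Parser, parseMaybe)
import qualified Data.ByteString.Lazy.Char8 as BL

-- Required key: the parser fails when "born" is missing.
bornYear :: Object -> Parser Int
bornYear v = v .: "born"

-- Optional key: a missing or null "died" yields Nothing.
diedYear :: Object -> Parser (Maybe Int)
diedYear v = v .:? "died"

-- Optional key with a fallback value (the default is an arbitrary choice).
nationalityOrUnknown :: Object -> Parser String
nationalityOrUnknown v = v .:? "nationality" .!= "unknown"

-- Hypothetical helper: decode a JSON object, then run one field parser on it.
runOn :: (Object -> Parser a) -> BL.ByteString -> Maybe a
runOn p input = decode input >>= parseMaybe p

main :: IO ()
main = do
  print (runOn bornYear "{\"born\":1777}")      -- Just 1777
  print (runOn bornYear "{}")                   -- Nothing
  print (runOn diedYear "{}")                   -- Just Nothing
  print (runOn diedYear "{\"died\":1855}")      -- Just (Just 1855)
  print (runOn nationalityOrUnknown "{}")       -- Just "unknown"
```

Note the two layers of Maybe in the diedYear results: the outer one says whether parsing succeeded, and the inner one says whether the key was present.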

There's more…

If writing the FromJSON instance by hand is too involved, we can let GHC derive it automatically using the DeriveGeneric language extension. The derived instance expects the JSON keys to match the record field names. The following is a simpler rewrite of the code:

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}
import Data.Aeson
import qualified Data.ByteString.Lazy as B
import GHC.Generics

data Mathematician = Mathematician { name :: String
                                   , nationality :: String
                                   , born :: Int
                                   , died :: Maybe Int
                                   } deriving Generic

instance FromJSON Mathematician

main :: IO ()
main = do
  input <- B.readFile "input.json"
  let mm = decode input :: Maybe Mathematician
  case mm of
    Nothing -> putStrLn "error parsing JSON"
    Just m -> putStrLn (greet m)

greet :: Mathematician -> String
greet m = show (name m) ++ " was born in the year " ++ show (born m)

Although Aeson is powerful and generalizable, it may be overkill for some simple JSON interactions. Alternatively, if we wish to use a very minimal JSON parser and printer, we can use Yocto, which can be downloaded from http://hackage.haskell.org/package/yocto.
