Haskell High Performance Programming

Write Haskell programs that are robust and fast enough to stand up to the needs of today

Product type Paperback
Published in Sep 2016
Publisher Packt
ISBN-13 9781786464217
Length 408 pages
Edition 1st Edition
Author: Samuli Thomasson

Table of Contents (16 chapters)

Preface
1. Identifying Bottlenecks
2. Choosing the Correct Data Structures
3. Profile and Benchmark to Your Heart's Content
4. The Devil's in the Detail
5. Parallelize for Performance
6. I/O and Streaming
7. Concurrency and Performance
8. Tweaking the Compiler and Runtime System (GHC)
9. GHC Internals and Code Generation
10. Foreign Function Interface
11. Programming for the GPU with Accelerate
12. Scaling to the Cloud with Cloud Haskell
13. Functional Reactive Programming
14. Library Recommendations
Index

Recursion and accumulators

Recursion is perhaps the most important pattern in functional programming. Recursive functions are more practical in Haskell than in imperative languages, due to referential transparency and laziness. Referential transparency allows the compiler to optimize the recursion away into a tight inner loop, and laziness means that we don't have to evaluate the whole recursive expression at once.

Next we will look at a few useful idioms related to recursive definitions: the worker/wrapper transformation, guarded recursion, and keeping accumulator parameters strict.

The worker/wrapper idiom

The worker/wrapper transformation is an optimization that GHC sometimes performs, but worker/wrapper is also a useful coding idiom. The idiom consists of a (locally defined, tail-recursive) worker function and a (top-level) wrapper function that calls the worker. As an example, consider the following naive primality test implementation:

-- file: worker_wrapper.hs

isPrime :: Int -> Bool
isPrime n
    | n <= 1    = False
    | n <= 3    = True
    | otherwise = worker 2
       where
         worker i | i >= n       = True
                  | mod n i == 0 = False
                  | otherwise    = worker (i+1)

Here, isPrime is the wrapper and worker is the worker function. This style has two benefits. First, GHC can be expected to compile a local, tail-recursive worker like this into a tight loop. Second, the worker/wrapper style is both concise and flexible; notice how we did the preliminary checks in the wrapper before invoking the worker, and how the argument n is conveniently in the worker's scope, so it does not need to be passed along on every iteration.
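The flexibility of the idiom can be sketched with a variant of the same test. This is only an illustration, not code from the book: the name isPrime', the extra evenness check, and the i * i bound are assumptions added here. The wrapper does the cheap one-off checks, and the worker only carries the loop counter.

```haskell
-- file: worker_wrapper_variant.hs (hypothetical file name)

-- Illustrative variant of isPrime; isPrime' is an assumed name.
isPrime' :: Int -> Bool
isPrime' n
    | n <= 1    = False
    | n <= 3    = True
    | even n    = False            -- one-off check, done in the wrapper
    | otherwise = worker 3
  where
    -- n is in scope, so the worker takes only the candidate divisor i.
    -- Note: i * i could overflow for n close to maxBound.
    worker i
        | i * i > n      = True    -- no odd divisor up to sqrt n
        | n `mod` i == 0 = False
        | otherwise      = worker (i + 2)
```

In GHCi, filter isPrime' [0..30] should list the primes up to 30. The wrapper grew by one guard, while the worker stayed a small tail-recursive loop.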

Guarded recursion

In strict languages, tail-call optimization is often a concern with recursive functions. A function f is tail-recursive if the result of its recursive call to f is returned directly as the result of f, with no further work done on it. In a lazy language such as Haskell, tail-call "optimization" is guaranteed by the evaluation schema. Actually, because in Haskell evaluation is normally done only up to weak head normal form (WHNF, the outermost data constructor), we have something more general than just tail calls, called guarded recursion. Consider this simple moving average implementation:

-- file: sma.hs
sma :: [Double] -> [Double]
sma (x0:x1:xs) = (x0 + x1) / 2 : sma (x1:xs)
sma         xs = xs

The sma function is not tail-recursive, but nonetheless it won't build up a huge stack like an equivalent in some other language might do. In sma, the recursive call is guarded by the (:) data constructor. Evaluating the first element of a call to sma does not yet make a single recursive call to sma. Asking for the second element initiates the first recursive call, asking for the third initiates the second, and so on.
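One way to observe guarded recursion is to feed sma an infinite list and ask for only a few results. The sketch below repeats sma so that it is self-contained; the helper firstAverages is a hypothetical name introduced for illustration.

```haskell
-- file: sma_lazy.hs (hypothetical file name)

sma :: [Double] -> [Double]
sma (x0:x1:xs) = (x0 + x1) / 2 : sma (x1:xs)
sma         xs = xs

-- Hypothetical helper: the first n averages of the infinite input 1, 2, 3, ...
-- This terminates because each recursive call to sma sits behind a (:)
-- constructor and is only made when the next element is demanded.
firstAverages :: Int -> [Double]
firstAverages n = take n (sma [1 ..])
```

Here firstAverages 3 evaluates to [1.5,2.5,3.5], even though the input list never ends.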

As a more involved example, let's build a Reverse Polish notation (RPN) calculator. RPN is a notation where operands precede their operator, so that (3 1 2 + *) in RPN corresponds to (3 * (1 + 2)) in infix, for example. To make our program easier to understand, we wish to separate parsing the input from performing the calculation:

-- file: rpn.hs
data Lex = Number Double Lex
         | Plus Lex
         | Times Lex
         | End

lexRPN :: String -> Lex
lexRPN = go . words
  where go ("*":rest) = Times (go rest)
        go ("+":rest) = Plus (go rest)
        go (num:rest) = Number (read num) (go rest)
        go         [] = End

The Lex datatype represents a formula in RPN and is similar to the standard list type. The lexRPN function reads a formula from string format into our own datatype. Let's add an evalRPN function, which evaluates a parsed RPN formula:

evalRPN :: Lex -> Double
evalRPN = go []
  where
    go stack (Number num rest)
       = go (num : stack) rest
    go (o1:o2:stack) (Plus rest)
       = let r = o1 + o2 in r `seq` go (r : stack) rest
    go (o1:o2:stack) (Times rest)
       = let r = o1 * o2 in r `seq` go (r : stack) rest
    go [res] End
       = res

We can test this implementation to confirm that it works:

> :load rpn.hs
> evalRPN $ lexRPN "5 1 2 + 4 * *"
60.0

The RPN expression (5 1 2 + 4 * *) is (5 * ((1 + 2) * 4)) in infix, which is indeed equal to 60.

Note how the lexRPN function makes use of guarded recursion when producing the intermediate structure. It reads the input string incrementally and yields the structure one element at a time. The evaluation function evalRPN consumes the intermediate structure from left to right and is tail-recursive, so we keep only a minimal amount of data in memory at any time.
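The incremental behaviour of lexRPN can be sketched by lexing an endless input and inspecting only a prefix of the result. Lex and lexRPN are repeated from above so the example is self-contained; the helper describe is a hypothetical name introduced here.

```haskell
-- file: rpn_lazy.hs (hypothetical file name)

data Lex = Number Double Lex
         | Plus Lex
         | Times Lex
         | End

lexRPN :: String -> Lex
lexRPN = go . words
  where go ("*":rest) = Times (go rest)
        go ("+":rest) = Plus (go rest)
        go (num:rest) = Number (read num) (go rest)
        go         [] = End

-- Hypothetical helper: render at most n leading tokens of a Lex value.
-- Once n reaches 0 the remaining structure is not inspected at all.
describe :: Int -> Lex -> [String]
describe n _ | n <= 0      = []
describe n (Number x rest) = show x : describe (n - 1) rest
describe n (Plus rest)     = "+" : describe (n - 1) rest
describe n (Times rest)    = "*" : describe (n - 1) rest
describe _ End             = []
```

Evaluating describe 3 (lexRPN (cycle "1 2 + ")) gives ["1.0","2.0","+"]. The input string is infinite, but only the part needed for the first three tokens is ever read.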

Note

Linked lists equipped with guarded recursion (and lazy I/O) actually provide a lightweight streaming facility – for more on streaming see Chapter 6, I/O and Streaming.

Accumulator parameters

In our examples so far, we have encountered a few functions that used some kind of accumulator. mySum2 had an Int that increased on every step. The go worker function in evalRPN passed on a stack (a linked list). The former had a space leak, because we didn't require the accumulator's value until the very end, at which point it had grown into a huge chain of thunks. The latter case was okay because the stack didn't grow in size indefinitely and the parameter was sufficiently strict, in the sense that we didn't unnecessarily defer its evaluation. The fix we applied in mySum2' was to force the accumulator to WHNF at every iteration, even though the result was not, strictly speaking, required in that iteration.

The final lesson is that you should take special care with your accumulator's strictness properties. If the accumulator must always be fully evaluated in order to continue to the next step, then you're automatically safe. But if there is a danger of an unnecessary chain of thunks being constructed due to a lazy accumulator, then adding a seq (or a bang pattern, see Chapter 2, Choosing the Correct Data Structures) is more than just a good idea.
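As a minimal sketch of this advice, consider a worker with two accumulators. The function mean and its treatment of the empty list are assumptions made for illustration; this is not the mySum2' function referred to above. Neither accumulator is inspected before the end of the list, so without the seq calls both would grow into chains of thunks.

```haskell
-- file: strict_accumulators.hs (hypothetical file name)

-- Hypothetical example: arithmetic mean with a running total and a count.
-- Returning 0 for the empty list is an arbitrary choice for this sketch.
mean :: [Double] -> Double
mean = go 0 0
  where
    go :: Double -> Int -> [Double] -> Double
    go total count []
        | count == 0 = 0
        | otherwise  = total / fromIntegral count
    go total count (x:xs) =
        let total' = total + x
            count' = count + 1
        -- Force both accumulators to WHNF before the tail call.
        in total' `seq` count' `seq` go total' count' xs
```

For example, mean [1, 2, 3, 4] evaluates to 2.5. With the BangPatterns extension enabled, the same effect could be had by writing the parameters as !total and !count.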
