The problem – grouping continuous integers
A while back, I wanted a simple and elegant way to solve the following problem:
Given: A sorted list of numbers
When: We group these numbers
Then: Each number in a group is higher by one than its predecessor. So, let's be good and write a few tests first:
We will make use of the excellent Hamcrest matchers (and the excellent Guava library) to help us express succinctly what we want our code to do.
You know the drill! Let's roll up our sleeves and dish out some code. The following looks pretty good:
This code has a lot of subtlety. We use the Apache commons 3 validation API for asserting the invariants. Download this book's source code to check out the unit test.
We are using the excellent Guava library to work with lists. Note the ease of this library, with which we can create and populate the list in one line. Refer to https://code.google.com/p/guava-libraries/wiki/CollectionUtilitiesExplained for more details.
We are careful to first make the input list an unmodifiable list—to protect ourselves from stepping on our toes... We also need to carefully arrange the iteration to make sure that it eventually terminates and we do not accidentally access a non-existent list index (try to say it without gulping).
Remember the sermon on being abstract? Our Java code is dealing with a lot of higher level and lower level details—all at the same time... We really don't want to do that; we wish to selectively ignore the details... We really don't want to deal with all the corner cases of iteration—the Java code is worried stiff about looking at two consecutive elements and the corner cases.
What if this was somehow handled for us so we can focus on the business at hand? Let's take a look at the following example:
This version looks a lot better—doesn't it? We use features such as nested functions and recursion and get stuff such as immutability for free... A list in Scala is immutable.In Java, we need to work a bit more to make things immutable. In Scala, it is the other way around. You need to go a little further to bring in mutability... We also use something called persistent data structures that have nothing to do with databases here. We will look at it in detail in a while...
However, a bit better version follows—meaning a tail-recursive version. Take a look at the following example of it:
The @tailrec
function does some good to our code. This is an annotation that we put on a method. We use it to make sure that the method will be compiled with tail call optimization (TCO). TCO converts the recursive form into a loop.
If the Scala compiler cannot apply TCO, it flags an error. Recursion and TCO are covered in detail in Chapter 3, Recursion and Chasing Your Own Tail.
Now, having set the stage—let's look at the reusability feature...
Reusability – the commonality/variability analysis
Instead of a fixed grouping criteria, could we make the Java version more dynamic? Instead of the expressing the condition directly in the code, we could wrap up the condition as a predicate:
A predicate is a function that returns a Boolean value. It should take two numbers and tell whether the predicate is true
or false
?
Now there could be multiple predicates. For example, we might want to find elements that differ by 2 or 3 numbers. We can define an interface as follows:
Why would we do this? We would do this for two reasons:
- We would want to reuse the algorithm correctly and pick up two consecutive elements. This algorithm would handle all corner cases as in the previous cases.
- The actual grouping criteria in our case is
nextElem – currElem == 1
, but we can reuse the previous code to regroup the numbers using another criteria.
Here is how grouping would look; I am just showing the changed code:
It is some work to define and use the predicate. Scala makes expressing this variability pretty easy, as we can use functions. Just pass on your criteria as a function and you are good to go:
In 1
, we use the function - (f: (Int, Int) => Boolean)
.
Note
The f: (Int, Int) => Boolean
syntax means that f
is a function parameter. The function takes two Int
params and returns Boolean.
Here is a quick REPL session to understand this better:
We are using the t
feature of the REPL to look at the type of x
.
This is a function—a bit of code we can pass around—that corresponds to the previous Java predicate snippet. It is a function that takes two integers and returns a Boolean.
(A function is in fact a class that mixes in a trait, function2
in our case).
At 2, we use it. Now comes the fun part.
At 3, we hold on to everything else except the function. And at 4 and 5, we just pass different functions. It would be more correct to say that we use function literals. lo and behold —we get the answers right.
This code also shows some currying and partially applied functions in action… We cover both of these features in Chapter 6, Currying Favors with Your code.
Have you heard this phrase anytime: it is just a one-liner! You see all the work done in a single line of code, it seems almost magical. The Unix shell thrives on these one-liners. For example:
Phew! This single command line generates a sequence of numbers, 1 to 100, generates a multiplication expression from these, and feeds the same to bca calculator that does the actual multiplication.
Scala has a legion of these. Here is one applied to our problem by using scalaz library.
Note
To try the following snippet, you need to install the scalaz library. It is not part of the Scala standard library. Here are a few simple instructions to install it.
Create a directory of your choice and switch to it. Next, download and install the Simple Build Tool (sbt) from http://www.scala-sbt.org/download.html.
Now the following snippet should work after installing the scalaz library:
Using functions the solution is just a one-liner:
So we are just not writing any of the supporting code—just stating our criteria for grouping.
You can read more about Scalaz at http://eed3si9n.com/learning-scalaz/.