Testing with intent
There are several ways to approach writing tests for code, and it is important to understand them before you try to avoid the bad practices. Tests written without a clear intent are often too long or assert too much.
Asserting written code
The most important purpose of a unit test is to assert that the code produces the intended result when executed. It is important that the tests are written by the author of the code, or some of the intent might be lost in the process.
The following is a code snippet:
    // System Under Test
    let div x y = x / y

    // Test
    div 10 2 |> should equal 5
This might state the obvious, but a developer could easily mix up the order of incoming arguments:
    // System Under Test
    let div y x = x / y

    // Test
    div 10 2 |> should equal 5
Running the test would expose the following error:
    NUnit.Framework.AssertionException:
      Expected: 5
      But was:  0
Tests give the developer a chance to state what is not obvious about the code but was still intended:
    // System Under Test
    let div x y = x / y

    // Test
    div 5 2 |> should equal 2
    (fun () -> div 5 0 |> ignore) |> should throw typeof<System.DivideByZeroException>

The test verifies that integer division truncates the result, discarding the remainder, and that the code throws an exception if you try to divide 5 by 0. These behaviors are implicit in the code but should be explicit in the tests.
Writing these assertions is often a faster way to verify that the code does what was intended than starting a debugger, entering the correct parameters, or opening up a web browser.
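The same idea extends to other implicit behaviors, such as how integer division treats negative operands. The following is a minimal sketch that runs without a test framework; the `expect` helper is a hypothetical stand-in for an assertion library and is not part of FsUnit or NUnit:

```fsharp
// System Under Test
let div x y = x / y

// Hypothetical helper: fail loudly when a stated expectation does not hold.
let expect description condition =
    if not condition then failwithf "Failed: %s" description

// Integer division truncates toward zero, also for negative operands.
expect "5 / 2 truncates to 2" (div 5 2 = 2)
expect "-5 / 2 truncates to -2, not -3" (div (-5) 2 = -2)

// Dividing by zero throws instead of returning a value.
let throwsDivideByZero =
    try
        div 5 0 |> ignore
        false
    with
    | :? System.DivideByZeroException -> true

expect "division by zero throws" throwsDivideByZero
```

Each expectation names the behavior it pins down, so a failure message points at the intent that was broken rather than at a bare value mismatch.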
Contracts versus tests
There is a technique called Design by Contract (DbC) that was invented by Bertrand Meyer while designing the Eiffel programming language. The basic idea of DbC is that you create contracts on software components stating what the component expects from the caller, what it guarantees, and what it maintains.
This means that, at runtime, the software verifies that input values are acceptable, guards against unwanted side effects, and checks the preconditions and postconditions added to the code.
The idea of software contracts is very attractive, but the few attempts at implementing it for the .NET framework have had limited success. The heritage of DbC is defensive programming, which simply means the following:
- Checking input arguments for valid values
- Asserting the output values of functions
The idea behind this is that it is better to crash than to continue to run with a faulty state. If the input of the function is not acceptable, it is allowed to crash. The same is true if the function is not able to produce a result, at which time it will crash rather than return a faulty or temporary result:
    let div x y =
        // precondition
        assert(y > 0)
        assert(x > y)

        let result = x / y

        // postcondition
        assert(result > 0)
        result

Assertions such as these are not a replacement for tests, and the differences are clear. Contracts are validated at runtime while debugging the code and are deactivated when the code is compiled for release. Tests are written outside the main code base and executed on demand.
With good assertions, you'll find more problems when doing manual testing, as the risk of running tests with faulty data is much smaller. You will also get code that is better at communicating its intent when all the functions have a clear definition of the preconditions and postconditions.
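A test can also pin down a precondition from outside the code base. The sketch below assumes a hypothetical `safeDiv` that replaces `assert` with an explicit `invalidArg` check, so that the precondition is still enforced in a release build; both `safeDiv` and the `expect` helper are illustrative names rather than anything defined earlier:

```fsharp
// Hypothetical variant of div: the precondition is an explicit check
// rather than `assert`, so it is not compiled away in release builds.
let safeDiv x y =
    if y <= 0 then invalidArg "y" "y must be greater than zero"
    x / y

// Hypothetical helper: fail loudly when a stated expectation does not hold.
let expect description condition =
    if not condition then failwithf "Failed: %s" description

// The test states the contract from the caller's point of view.
let rejectsZeroDivisor =
    try
        safeDiv 10 0 |> ignore
        false
    with
    | :? System.ArgumentException -> true

expect "a zero divisor is rejected" rejectsZeroDivisor
expect "valid input still divides" (safeDiv 10 2 = 5)
```

The check inside the function crashes early on bad input, while the test documents that behavior and runs on demand, which is the division of labor described above.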
Designing code to be written
Testing your code is also an exercise in making it modular so that it can be called from outside its original context. In doing so, you force the application to maintain an API so that you can test it properly. This should be seen as a strength of the methodology, as it makes the code more concise and easier to read. It also encourages good patterns such as the single responsibility principle and dependency injection.
There is a reason test-driven development uses the mantra red, green, refactor. The refactor step is essential to creating a successful test suite and application. You use tests to drive the design of your code, which makes it testable:
    open System.Net
    open System.Text.RegularExpressions

    let rec crawl result url =
        // is duplicate if url exists in result
        let isDuplicate = result |> List.exists ((=) url)
        if isDuplicate then
            result
        else
            // create url
            let uri = new System.Uri(url)
            // create web client
            let client = new WebClient()
            // download html
            let html = client.DownloadString(uri)
            // get all URLs
            let expression = new Regex(@"href=""(.*?)""")
            let captures =
                expression.Matches(html)
                |> Seq.cast<Match>
                |> Seq.map (fun m -> m.Groups.[1].Value)
                |> Seq.toList
            // join result with crawling all captured urls;
            // every captured url except the current one counts as visited
            List.collect (fun c -> crawl (result @ (captures |> List.filter ((<>) c))) c) captures
This program will get the contents of a URL, find all the links on the page, and crawl those links in order to find more URLs. This will happen until there are no more URLs to visit.
The code is hard to test because it does many things. If we extract functions, the code will be easier to test, have higher cohesion, and also be better in terms of the single responsibility principle.
The following code is an example of extracted functions:
    open System.Net
    open System.Text.RegularExpressions

    // item exists in list -> true
    let isDuplicate result url = List.exists ((=) url) result

    // return html for url
    let getHtml (url: string) = (new WebClient()).DownloadString(new System.Uri(url))

    // extract a-tag hrefs from html
    let getUrls (html: string) =
        Regex.Matches(html, @"href=""(.*?)""")
        |> Seq.cast<Match>
        |> Seq.map (fun m -> m.Groups.[1].Value)
        |> Seq.toList

    // return list except item
    let except item list = List.filter ((<>) item) list

    // merge crawl of urls with result
    let merge crawl result urls =
        List.collect (fun url -> crawl (result @ (urls |> except url)) url) urls

    // crawl url unless it has already been crawled
    let rec crawl result url =
        if isDuplicate result url then result
        else (getHtml url) |> getUrls |> merge crawl result
The functionality is the same, but the code is much easier to test. Each individual part of the solution is now open for testing without causing side effects to the other parts.
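As an illustration of what the extraction buys, the pure functions can be exercised without any network access. The sketch below repeats `isDuplicate` and `getUrls` so that it is self-contained; the `expect` helper and the sample URLs are assumptions made for the example:

```fsharp
open System.Text.RegularExpressions

// Repeated from the extracted functions so the sketch stands alone.
let isDuplicate result url = List.exists ((=) url) result

let getUrls (html: string) =
    Regex.Matches(html, @"href=""(.*?)""")
    |> Seq.cast<Match>
    |> Seq.map (fun m -> m.Groups.[1].Value)
    |> Seq.toList

// Hypothetical helper: fail loudly when a stated expectation does not hold.
let expect description condition =
    if not condition then failwithf "Failed: %s" description

// isDuplicate only needs a list and a value.
expect "visited url is a duplicate" (isDuplicate ["http://a"; "http://b"] "http://a")
expect "new url is not a duplicate" (not (isDuplicate ["http://a"] "http://c"))

// getUrls can be fed a literal HTML string instead of a downloaded page.
let html = @"<a href=""http://a"">A</a> <a href=""http://b"">B</a>"
expect "both hrefs are extracted" (getUrls html = ["http://a"; "http://b"])
expect "no links gives an empty list" (getUrls "<p>no links</p>" = [])
```

Only `getHtml` still touches the network, so it is the one piece that would need a stand-in when testing `crawl` as a whole.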