Using lazy programming for pipelining
Lazy programming is a strategy where we defer computation until its result is actually needed. Compared with its counterpart, eager programming, where we compute everything as soon as we invoke a computation, it can avoid unnecessary work and reduce memory usage.
Python provides many mechanisms for lazy programming – indeed, one of the biggest changes from Python 2 to Python 3 is that the language became lazier.
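As a small illustration of that laziness, here is a minimal sketch using the built-in map and range, both of which return lazy objects in Python 3 (in Python 2 they returned lists). The square function and the calls list are hypothetical names introduced only to make visible when work actually happens:

```python
# Hypothetical helper: records every call so we can see when work is done.
calls = []

def square(x):
    calls.append(x)
    return x * x

# In Python 3, both range() and map() are lazy: this line returns
# immediately, even though the range describes a billion values.
lazy_squares = map(square, range(10**9))
calls_before = len(calls)  # nothing has been computed yet

# Values are only computed when we ask for them, one at a time.
first = next(lazy_squares)
second = next(lazy_squares)
calls_after = len(calls)  # only two calls to square() were made
```

An eager version, such as list(map(square, range(10**9))), would try to compute and store every value before returning.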
To understand lazy programming, we are going to take our gene database again and do an exercise with it. We are going to check whether we have at least n genes with at least y reads each (for example, three genes with five reads each). This can serve as, say, a measure of the quality of our database, that is, a measure of whether we have enough genes with a certain number of samples.
We are going to consider two implementations: one lazy and one eager. We will then compare the implications of both approaches.
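To make the contrast concrete before we start, here is a rough sketch of the two styles. It is not the recipe's code: it assumes, purely for illustration, that the database can be read as an iterable of (gene, read count) pairs, and the function names and sample data are hypothetical.

```python
from itertools import islice

def has_enough_eager(records, n, y):
    # Eager: build the full list of qualifying genes, then count it.
    # Every record is read, even if the answer is known early.
    qualifying = [gene for gene, count in records if count >= y]
    return len(qualifying) >= n

def has_enough_lazy(records, n, y):
    # Lazy: a generator expression produces qualifying genes on demand,
    # and islice stops pulling from it once n genes have been found.
    qualifying = (gene for gene, count in records if count >= y)
    return len(list(islice(qualifying, n))) == n

# Hypothetical sample data: (gene, read count) pairs.
sample = [("A", 6), ("B", 2), ("C", 5), ("D", 7), ("E", 9), ("F", 1)]

eager_result = has_enough_eager(sample, 3, 5)

# Pass an iterator so we can inspect what the lazy version left unread.
records_iter = iter(sample)
lazy_result = has_enough_lazy(records_iter, 3, 5)
unread = list(records_iter)  # records the lazy version never touched
```

Both functions return the same answer, but the lazy one stops at the third qualifying gene and leaves the remaining records unread. With a small list this makes little difference; with a large file it can save most of the work.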
Getting ready
The code for this recipe can be found...