Calculating a moving median
The median of a list of numbers has an equal number of values less than and greater than it. The naive approach of calculating the median is to simply sort the list and pick the middle number. However, on a very large dataset, such a computation would be inefficient.
Another approach of finding a moving median is to use a combination of a minheap and a maxheap to sort the values while running through the data. We can insert numbers in either heap as they are seen, and whenever needed, the median can be calculated by adjusting the heaps to be of equal or near equal size. When the heaps are of equal size, it is simple to find the middle number, which is the median.
Getting ready
Create a file, input.txt
, with some numbers:
$ cat input.txt 3 4 2 5 6 4 2 6 4 1
Also, install a library for dealing with heaps using Cabal as follows:
$ cabal install heap
How to do it…
Import the heap data structure:
import Data.Heap import Data.Maybe (fromJust)
Convert the raw input as a list...