Task 2 – Calculating the maximal length of a word in a stream
This task is similar to the previous one. There, we calculated the K most frequent words in a stream for a fixed time window. How would our solution change if the computation had to cover the entire stream from its beginning, rather than a fixed window? Let's define the problem.
Defining the problem
Given an input data stream of lines of text, calculate the longest word ever seen in this stream. Start with an empty word value; once a longer word is seen, immediately output the new longest word.
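To make the expected behavior concrete, here is a minimal plain-Java sketch of these semantics. It deliberately does not use the Beam API; the class name, the method name, and the rule that words are separated by non-letter characters are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LongestWordSketch {

  // Returns every word that, at the moment it was seen, was strictly longer
  // than all words seen before it, in the order the updates would be emitted.
  static List<String> longestWordUpdates(List<String> lines) {
    List<String> updates = new ArrayList<>();
    String longest = ""; // start with an empty word value
    for (String line : lines) {
      // Assumption: anything that is not a letter separates words.
      for (String word : line.split("[^\\p{L}]+")) {
        if (word.length() > longest.length()) {
          longest = word;    // remember the new longest word
          updates.add(word); // "output" it immediately
        }
      }
    }
    return updates;
  }

  public static void main(String[] args) {
    List<String> lines = List.of("to be or not", "to be", "that is the question");
    System.out.println(longestWordUpdates(lines));
  }
}
```

Note that a word of the same length as the current longest word does not produce a new output in this sketch; only a strictly longer word does.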
Discussing the problem decomposition
Although the logic seems to be similar to the previous task, it can be simplified as follows:
Note that there are two main differences from the previous task:
- We must compute the word with the longest length; although this could be viewed as a Top transform, with K equal to one, Beam has a specific transform for that...
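The relationship between a top-K computation with K equal to one and a plain maximum can be illustrated outside Beam. The following plain-Java sketch is an illustration only: `topK` and `longest` are hypothetical helper names, not Beam transforms, and the sketch makes no claim about how Beam implements either operation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.PriorityQueue;

class TopOneSketch {

  static final Comparator<String> BY_LENGTH = Comparator.comparingInt(String::length);

  // Generic top-K by word length: keep a min-heap holding at most k words.
  static List<String> topK(List<String> words, int k) {
    PriorityQueue<String> heap = new PriorityQueue<>(BY_LENGTH);
    for (String word : words) {
      heap.add(word);
      if (heap.size() > k) {
        heap.poll(); // evict the shortest word currently held
      }
    }
    List<String> result = new ArrayList<>(heap);
    result.sort(BY_LENGTH.reversed()); // longest first
    return result;
  }

  // Dedicated maximum: a single running value, no heap required.
  static Optional<String> longest(List<String> words) {
    return words.stream().max(BY_LENGTH);
  }

  public static void main(String[] args) {
    List<String> words = List.of("stream", "of", "words", "longest");
    System.out.println(topK(words, 1));
    System.out.println(longest(words));
  }
}
```

When no two words share the maximal length, both approaches select the same word; the dedicated maximum simply avoids the bookkeeping that a general top-K needs.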