Streaming transformations
In the previous word count example, the words in each set of lines were counted once, and the counter was reset for the next set of records. But what if we want to add the count from the previous state to the words that arrive in the following batch? Can we do that, and how? The answer to the first part of the question is yes: Spark Streaming has two kinds of transformation, stateless and stateful. To preserve the previous state, we have to opt for a stateful transformation rather than the stateless transformation used in the previous example.
Stream processing can be stateless or stateful, depending on the requirement. Some stream processing problems require maintaining state over a period of time; others do not.
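To make the distinction concrete, here is a minimal sketch in plain Python. It does not use the Spark API; each inner list simply stands in for one batch of lines, and the function names stateless_counts and stateful_counts are hypothetical, chosen only for this illustration.

```python
from collections import Counter

def stateless_counts(batches):
    """Count words in each batch independently; nothing carries over."""
    results = []
    for batch in batches:
        counts = Counter()          # a fresh counter for every batch
        for line in batch:
            counts.update(line.split())
        results.append(dict(counts))
    return results

def stateful_counts(batches):
    """Keep one running counter and add every new batch to it."""
    running = Counter()             # state preserved across batches
    snapshots = []
    for batch in batches:
        for line in batch:
            running.update(line.split())
        snapshots.append(dict(running))  # copy of the state after this batch
    return snapshots

# Two small batches of lines (made-up sample input).
batches = [["spark streaming", "spark"], ["streaming spark"]]

stateless = stateless_counts(batches)
stateful = stateful_counts(batches)

print(stateless)
print(stateful)
```

In the stateless result the second batch starts again from zero, whereas in the stateful result the counts from the first batch are carried into the second.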
Consider an airline company that wants to process data consisting of the temperature readings of all active flights in real time. If the airline wants to just print or store the reading...