Understanding windowing semantics in depth
In Chapter 1, Introducing Data Processing with Apache Beam, we introduced the basic types of window functions. To recap, we defined the following:
- Fixed windows
- Sliding windows
- Global window
- Session windows
We also defined two basic types of windows: key-aligned and key-unaligned. The first three types (fixed, sliding, and global) are key-aligned, whereas session windows are key-unaligned, because each session window can start and end at different times for different keys. What we skipped in Chapter 1, however, is the fact that we can also define completely custom windowing logic.
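To make the difference between key-aligned and key-unaligned windows more concrete, the following is a minimal sketch in plain Java. It does not use the Beam SDK: the WindowAssignmentSketch class, its Interval type, and all method names are hypothetical and exist only for illustration. Timestamps are plain numbers in arbitrary time units, and the session logic is a simplified model of gap-based merging for a single key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified model of window assignment; NOT the Beam SDK. All names are hypothetical.
class WindowAssignmentSketch {

  /** Half-open time interval [start, end), in arbitrary time units. */
  static final class Interval {
    final long start;
    final long end;

    Interval(long start, long end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public String toString() {
      return "[" + start + ", " + end + ")";
    }
  }

  /** Fixed windows: the boundaries depend only on the timestamp, never on the key. */
  static Interval fixedWindow(long timestamp, long size) {
    long start = timestamp - Math.floorMod(timestamp, size);
    return new Interval(start, start + size);
  }

  /** Sliding windows: all windows of the given size, starting at multiples of period, that contain the timestamp. */
  static List<Interval> slidingWindows(long timestamp, long size, long period) {
    List<Interval> windows = new ArrayList<>();
    long lastStart = timestamp - Math.floorMod(timestamp, period);
    for (long start = lastStart; start > timestamp - size; start -= period) {
      windows.add(new Interval(start, start + size));
    }
    return windows;
  }

  /** Session windows for ONE key: each element opens [t, t + gap), and overlapping intervals are merged. */
  static List<Interval> sessionWindows(List<Long> timestamps, long gap) {
    List<Long> sorted = new ArrayList<>(timestamps);
    Collections.sort(sorted);
    List<Interval> merged = new ArrayList<>();
    for (long t : sorted) {
      int lastIndex = merged.size() - 1;
      if (lastIndex >= 0 && t < merged.get(lastIndex).end) {
        // The element falls inside the open session, so extend that session.
        Interval last = merged.get(lastIndex);
        merged.set(lastIndex, new Interval(last.start, Math.max(last.end, t + gap)));
      } else {
        // No open session covers this element, so start a new one.
        merged.add(new Interval(t, t + gap));
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    // Aligned: any key with an element at time 125 gets exactly these windows.
    System.out.println(fixedWindow(125, 60));
    System.out.println(slidingWindows(125, 60, 30));
    // Unaligned: the session boundaries follow each key's own timestamps.
    System.out.println(sessionWindows(Arrays.asList(10L, 25L, 100L), 30)); // one key
    System.out.println(sessionWindows(Arrays.asList(47L), 30));            // another key
  }
}
```

In this sketch, the fixed and sliding windows are computed from the timestamp alone, so they are identical for every key. The session windows, by contrast, are derived from the timestamps seen for a particular key, which is why two keys generally end up with different window boundaries.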
The Window.into transform accepts a generic WindowFn instance, which defines the following main methods:
- The assignWindows method, which assigns elements into a set of window labels.
- The isNonMerging method, which tells the runner whether the WindowFn instance defines merging...