Introduction to the streaming execution model
Flink is an open-source framework for distributed stream processing that:
- Provides results that are accurate, even in the case of out-of-order or late-arriving data
- Is stateful and fault tolerant, and can seamlessly recover from failures while maintaining an exactly-once application state
- Performs at large scale, running on thousands of nodes with high throughput and low latency
The following diagram is a generalized view of stream processing:
Many of Flink's features (state management, handling of out-of-order data, and flexible windowing) are essential for computing accurate results on unbounded datasets and are enabled by Flink's streaming execution model:
- Flink guarantees exactly-once semantics for stateful computations. Stateful means that applications can maintain an aggregation or summary of data that has been processed over time, and Flink's checkpointing mechanism ensures exactly-once semantics for an application's state in the event...
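
To make the idea of exactly-once state more concrete, here is a minimal sketch in plain Python. It does not use Flink or its API; the names `Checkpoint`, `run_job`, `checkpoint_every`, and `fail_at` are hypothetical and exist only for this illustration. It assumes a replayable source addressed by offset and a checkpoint that stores the source offset together with the operator state, so that after a simulated failure the job rolls back both and replays the lost events.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Checkpoint:
    """A consistent snapshot: source position plus operator state (illustrative only)."""
    offset: int
    counts: Dict[str, int]


def run_job(
    events: List[Tuple[str, int]],
    checkpoint_every: int,
    fail_at: Optional[int] = None,
) -> Dict[str, int]:
    """Sum values per key, optionally simulating one crash before offset `fail_at`."""
    counts: Dict[str, int] = {}
    checkpoint = Checkpoint(offset=0, counts={})
    offset = 0
    failed = False

    while offset < len(events):
        if fail_at is not None and not failed and offset == fail_at:
            # Simulated crash: in-memory state is lost, so restore the state
            # AND rewind the source to the offset stored in the same snapshot.
            failed = True
            counts = dict(checkpoint.counts)
            offset = checkpoint.offset
            continue

        key, value = events[offset]
        counts[key] = counts.get(key, 0) + value
        offset += 1

        if offset % checkpoint_every == 0:
            # Snapshot state and source position together.
            checkpoint = Checkpoint(offset=offset, counts=dict(counts))

    return counts
```

Because the state and the source position are rolled back together, events processed after the last checkpoint are replayed against the restored state, and each event is reflected in the final sums exactly once. Calling `run_job(events, 2, fail_at=3)` on any list of at least four events should therefore return the same result as `run_job(events, 2)`.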