Introduction to Apache Flink
Flink is an open-source framework for distributed stream processing with the following features:
- It provides results that are accurate, even in the case of out-of-order or late-arriving data
- It is stateful and fault tolerant, and can seamlessly recover from failures while maintaining an exactly-once application state
- It performs at a large scale, running on thousands of nodes with very good throughput and latency characteristics
The following is a screenshot from the official documentation that shows how Apache Flink can be used:
Another way of viewing the Apache Flink framework is shown in the following screenshot:
All Flink programs are executed lazily: when the program’s main method is executed, the data loading and transformations do not happen immediately. Rather, each operation is created and added to the program’s plan. The operations are actually executed only when execution is explicitly triggered by an execute() call on the execution environment. Whether...
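The idea of lazy execution can be illustrated with a small toy model in plain Java. This is only a sketch and does not use Flink's API: the class LazyPlanSketch and its methods map, plannedOperations, invocations, and execute are hypothetical names invented for this illustration. Each call to map only records an operation in a plan, and nothing runs until execute is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Toy model of lazy execution. NOT Flink's API; every name here is hypothetical.
public class LazyPlanSketch {
    private final List<Integer> source;
    // The "plan": transformations recorded in the order they were declared.
    private final List<Function<Integer, Integer>> plan = new ArrayList<>();
    // Counts how many times a transformation has actually been applied.
    private int invocations = 0;

    public LazyPlanSketch(List<Integer> source) {
        this.source = source;
    }

    // Records the transformation in the plan; no data is touched here.
    public LazyPlanSketch map(Function<Integer, Integer> fn) {
        plan.add(fn);
        return this;
    }

    public int plannedOperations() {
        return plan.size();
    }

    public int invocations() {
        return invocations;
    }

    // Explicit trigger: applies every recorded operation to every element, in order.
    public List<Integer> execute() {
        List<Integer> out = new ArrayList<>();
        for (Integer value : source) {
            Integer current = value;
            for (Function<Integer, Integer> fn : plan) {
                current = fn.apply(current);
                invocations++;
            }
            out.add(current);
        }
        return out;
    }

    public static void main(String[] args) {
        LazyPlanSketch job = new LazyPlanSketch(Arrays.asList(1, 2, 3))
                .map(x -> x * 2)
                .map(x -> x + 1);
        // Two operations are planned, but none has run yet.
        System.out.println("planned=" + job.plannedOperations()
                + " ran=" + job.invocations());
        // Only this call applies the planned operations to the data.
        System.out.println(job.execute());
        System.out.println("ran=" + job.invocations());
    }
}
```

In this sketch, building the chain of map calls is cheap because it only grows the plan. A real engine can use that deferred plan to decide how to run the whole program before any data is read.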