Last week the Apache Flink community announced the release of Apache Flink 1.9.0. The Flink community defines the project goal as “to develop a stream processing system to unify and power many forms of real-time and offline data processing applications as well as event-driven applications.”
In this release, they have made a significant step forward in that effort by integrating Flink’s stream and batch processing capabilities under a single, unified runtime. Notable features of this release include batch-style recovery for batch jobs and a preview of the new Blink-based query engine for Table API and SQL queries.
The team also announced the availability of the State Processor API, one of the most frequently requested features, which enables users to read and write savepoints with Flink DataSet jobs. Additionally, Flink 1.9 includes a reworked WebUI as well as previews of Flink’s new Python Table API and its integration with the Apache Hive ecosystem.
Let us take a look at the major new features and improvements:
The time to recover a batch (DataSet, Table API and SQL) job from a task failure is significantly reduced. Until Flink 1.9, task failures in batch jobs were recovered by canceling all tasks and restarting the whole job, i.e., the job was started from scratch and all progress was voided. With this release, Flink can be configured to limit the recovery to only those tasks that are in the same failover region. A failover region is the set of tasks that are connected via pipelined data exchanges; hence, the batch-shuffle connections of a job define the boundaries of its failover regions.
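To use the new strategy, the following entry needs to be present in flink-conf.yaml (per the release announcement, the configuration shipped with the 1.9 distribution includes it by default):

jobmanager.execution.failover-strategy: region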
Up to Flink 1.9, accessing the state of a job from the outside was limited to the experimental Queryable State. This release introduces a new, powerful library to read, write and modify state snapshots using the batch DataSet API. In practice, this means that job state can be analyzed, bootstrapped or migrated with regular batch programs. The new State Processor API covers all variations of snapshots: savepoints, full checkpoints and incremental checkpoints.
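As a brief sketch of how this looks in practice (the savepoint path, operator uid and state name below are hypothetical), an existing savepoint can be loaded and one of its registered states read back as a regular DataSet:

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;

public class ReadSavepointState {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Load an existing savepoint; the path and state backend are illustrative.
        ExistingSavepoint savepoint = Savepoint.load(
            env, "hdfs:///flink/savepoints/savepoint-abc123", new MemoryStateBackend());

        // Read an operator's list state (uid and state name are hypothetical)
        // into a plain DataSet that can be processed like any other batch input.
        DataSet<Long> counts = savepoint.readListState("my-operator-uid", "count-state", Types.LONG);

        counts.print();
    }
}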
Cancelling with a savepoint is a common operation for stopping/restarting, forking or updating Flink jobs. However, the existing implementation did not guarantee that the job’s output was persisted to external storage systems, even for exactly-once sinks. To improve the end-to-end semantics when stopping a job, Flink 1.9 introduces a new SUSPEND mode that stops a job with a savepoint consistent with the emitted data. You can suspend a job with Flink’s CLI client as follows:
bin/flink stop -p [:targetDirectory] :jobId
The final job state is set to FINISHED on success, allowing users to detect failures of the requested operation.
After a discussion about modernizing the internals of Flink’s WebUI, the component was rebuilt using the latest stable version of Angular, essentially a bump from Angular 1.x to 7.x. The redesigned version is the default in Apache Flink 1.9.0; however, a link is included to switch back to the old WebUI.
After the donation of Blink to Apache Flink, the community worked on integrating Blink’s query optimizer and runtime for the Table API and SQL. As a first step, the team refactored the monolithic flink-table module into smaller modules, resulting in a clear separation and well-defined interfaces between the Java and Scala API modules and the optimizer and runtime modules.
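As a hedged sketch of how the preview planner can be tried out (the pre-existing planner remains the default in 1.9), the Blink planner is selected via EnvironmentSettings when creating a TableEnvironment:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class BlinkPlannerPreview {
    public static void main(String[] args) {
        // Opt in to the Blink-based planner and runtime, a preview feature
        // in Flink 1.9; omitting useBlinkPlanner() keeps the old planner.
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
            .useBlinkPlanner()
            .inStreamingMode()
            .build();

        TableEnvironment tableEnv = TableEnvironment.create(settings);
        // tableEnv now executes Table API and SQL queries with the Blink planner.
    }
}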
The binary distribution and source artifacts for this release are now available via the Downloads page of the Flink project, along with the updated documentation. Flink 1.9 is API-compatible with previous 1.x releases for APIs annotated with the @Public annotation. Review the release notes for a detailed list of changes and new features before upgrading your Flink setup to 1.9.0.