Event streams can be processed on AWS in several ways. One of them, Kinesis, is covered separately; this section discusses Elastic MapReduce (EMR).
EMR can run several different engines and applications: Hadoop, Apache Spark, HBase, Presto, and Apache Flink. All of them are built for working with large amounts of data. Some, such as Spark and Flink, can process data in near real time, while Hadoop is geared toward batch processing.
Standard EMR has already been covered in a previous chapter.
Using Flink (or another engine) via Terraform is as easy as changing the contents of the applications array, as shown in the following block:
resource "aws_emr_cluster" "cluster" {
name = "emr-test-arn"
release_label = "emr-5.1.0" # Flink is bundled with EMR release 5.1.0 and later
applications = ["Flink"...
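
For reference, a fuller cluster definition might look like the sketch below. It is only an illustration under stated assumptions, not a definitive configuration: the three variables, the resource name flink_example, the instance types, the node count, and the specific release label are all hypothetical placeholders chosen for the example. The only part taken from the text is the applications list, which is where the engine is selected.

```hcl
# Hypothetical inputs: supply values that already exist in your own account.
variable "emr_service_role_arn" {
  type = string # IAM role assumed by the EMR service
}

variable "emr_instance_profile_arn" {
  type = string # instance profile attached to the cluster's EC2 nodes
}

variable "subnet_id" {
  type = string # subnet in which the cluster is launched
}

resource "aws_emr_cluster" "flink_example" {
  name          = "emr-flink-example"
  release_label = "emr-5.36.0" # assumption: any release that bundles Flink (5.1.0 or later)
  applications  = ["Flink"]    # swap or extend, e.g. ["Spark"] or ["Hadoop", "Flink"]

  service_role = var.emr_service_role_arn

  ec2_attributes {
    subnet_id        = var.subnet_id
    instance_profile = var.emr_instance_profile_arn
  }

  # Instance types and counts are placeholders; size them for your workload.
  master_instance_group {
    instance_type = "m5.xlarge"
  }

  core_instance_group {
    instance_type  = "m5.xlarge"
    instance_count = 2
  }
}
```

Switching engines then comes down to editing the applications list and, where needed, picking a release label that includes the chosen engine.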