EMR deployment options
Because Amazon EMR is built on top of the open source Hadoop ecosystem, it tries to stay up to date with the stable open source releases, which include new features and bug fixes.
Amazon EMR on Amazon EC2
Amazon EMR on Amazon EC2 was the first deployment option EMR offered and remains very popular across different use cases. With EC2, you get the broadest range of instance types, so you can select the ones that give your workload the best balance of performance and cost.
The following is a sample AWS CLI command that creates an Amazon EMR cluster with the emr-6.3.0 release label, five m5.xlarge instances, and a Spark application:
$ aws emr create-cluster \
    --name "First EMR on EC2 Cluster" \
    --release-label emr-6.3.0 \
    --applications Name=Spark \
    --ec2-attributes KeyName=<myKeyPairName> \
    --instance-type m5.xlarge \
    --instance-count 5 \
    --use-default-roles
Before executing the preceding command, please replace the <myKeyPairName...
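One way to avoid hand-editing the placeholder is to pass the key pair name in as a variable. The following is a minimal sketch that wraps the same create-cluster command in a shell function. The names `create_cluster`, `KEY_PAIR_NAME`, and `DRY_RUN`, and the default value `my-key-pair`, are assumptions made for illustration and are not part of the AWS CLI. By default the sketch only prints the command, so nothing is launched by accident.

```shell
#!/bin/sh
# Sketch only: KEY_PAIR_NAME, DRY_RUN, and create_cluster are illustrative names.
set -eu

KEY_PAIR_NAME="${KEY_PAIR_NAME:-my-key-pair}"  # hypothetical default; use your own EC2 key pair
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print the command, 0 = actually run it

create_cluster() {
  # $1: name of an existing EC2 key pair
  if [ "$DRY_RUN" = "1" ]; then
    run=echo   # preview mode: echo the arguments (quoting is not preserved in the printout)
  else
    run=       # real mode: the empty prefix disappears and aws is invoked directly
  fi
  $run aws emr create-cluster \
    --name "First EMR on EC2 Cluster" \
    --release-label emr-6.3.0 \
    --applications Name=Spark \
    --ec2-attributes "KeyName=$1" \
    --instance-type m5.xlarge \
    --instance-count 5 \
    --use-default-roles
}

create_cluster "$KEY_PAIR_NAME"
```

Setting `DRY_RUN=0` runs the real command, which assumes the AWS CLI is installed and configured with credentials, and that the default EMR roles already exist in the account.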