Scaling cluster resources
When you launch an Amazon EMR cluster for big data processing, the computing capacity you need usually differs from job to job. The resources your cluster needs depend on the volume of data (file size), the kind of processing logic you run, and whether the cluster's resources are shared with other jobs.
In a few cases, the data volume is well defined and you can plan capacity up front, launching a fixed-size cluster that never needs to scale. In most cases, however, you will have a variable workload, or a cluster shared by multiple workloads, that must respond to changing capacity demands, so you will need to scale cluster capacity dynamically.
Amazon EMR offers two features for scaling cluster resources: EMR-managed scaling and autoscaling with a custom scaling policy. When considering automatic scaling of your cluster...