Understanding migration approaches
Migrating from an on-premise environment to the AWS cloud provides several benefits: decoupled compute and storage that scale independently, better security through the AWS infrastructure, the flexibility to design pipelines that integrate other AWS analytics services, and the ability to focus on application development instead of spending resources on managing infrastructure.
When you plan to migrate your on-premise Hadoop cluster to EMR, you need to analyze how your cluster will work in AWS, compare this with your on-premise environment, and then plan for the migration accordingly. The following are a few of the things you need to analyze:
- Which Hadoop ecosystem services are you using, and are they all supported in AWS?
- If the Hadoop services you use are supported, which EMR release version is closest to your on-premise Hadoop version?
- Does your on-premise cluster use HDFS as a persistent...