Spark on the Cloud – Amazon Elastic MapReduce
Now that you have learnt about Spark, let's look at potentially limitless scaling. We will learn how to use cloud services to deploy Spark clusters. There are many big data and data analytics service providers, such as Google or IBM Bluemix, but we will concentrate on Amazon in this chapter. We will provide screenshots of the process because such platforms can be a little overwhelming at first. The steps are as follows:
- First, you need to create an Amazon Cloud account if you don't already have one. Go to https://aws.amazon.com and click on Create a free account:
- Provide your credentials and click on Create account.
- Next, we have to create a Key Pair. Key Pairs are the basic authentication method on Amazon.
- First, we need to go to the EC2 services dashboard:
- Then, click on Key Pairs in the side menu.
- Click on Create Key Pair and name it test-spark.
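
The Key Pair steps above use the web console. If you prefer to script them, the same Key Pair can probably be created with boto3, the AWS SDK for Python. This is only a minimal sketch and not part of the original walkthrough: it assumes boto3 is installed and that AWS credentials and a default region are already configured, and the helper names `save_key_pair` and `create_test_spark_key` are hypothetical.

```python
import os


def save_key_pair(ec2_client, key_name, directory="."):
    """Create an EC2 key pair and save its private key as <key_name>.pem.

    `ec2_client` is expected to behave like boto3.client("ec2"): its
    create_key_pair(KeyName=...) call returns a dict whose "KeyMaterial"
    entry holds the private key text.
    """
    response = ec2_client.create_key_pair(KeyName=key_name)
    path = os.path.join(directory, key_name + ".pem")

    # O_EXCL refuses to overwrite an existing key file with the same name.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(response["KeyMaterial"])

    # SSH clients usually reject private keys that other users can read.
    os.chmod(path, 0o400)
    return path


def create_test_spark_key():
    """Hypothetical entry point: run against a real AWS account.

    Assumes boto3 is installed and credentials/region are configured.
    """
    import boto3  # third-party AWS SDK, imported lazily

    return save_key_pair(boto3.client("ec2"), "test-spark")
```

Calling `create_test_spark_key()` should leave a `test-spark.pem` file in the current directory. Amazon returns the private key only once, at creation time, so keep that file somewhere safe.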
- Next, we need to give our user some special permissions, so on the...