Time for action – running WordCount on EMR
We will now show you how to run this same JAR file on EMR. Remember, as always, that this costs money!
Go to the AWS console at http://aws.amazon.com/console, sign in, and select S3.
You'll need two buckets: one to hold the JAR file and another for the job output. You can use existing buckets or create new ones.
Open the bucket where you will store the job file, click on Upload, and add the wc1.jar file created earlier.
Return to the main console home page, and then go to the EMR portion of the console by selecting Elastic MapReduce.
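As a quick reference for the two buckets used above, S3 objects are commonly addressed as s3://bucket/key URIs. The following is a minimal sketch of how the JAR and output locations might be written down. The bucket names and the wordcount-output/ prefix are hypothetical placeholders, and whether a given EMR form field expects the full s3:// form or a bare bucket/key path is an assumption you should check against the console.

```java
public class S3Locations {

    // Builds an s3:// URI from a bucket name and an object key.
    // A leading slash on the key is dropped so the result has exactly
    // one separator between bucket and key.
    static String s3Uri(String bucket, String key) {
        if (bucket == null || bucket.isEmpty()) {
            throw new IllegalArgumentException("bucket must not be empty");
        }
        String cleanKey = key.startsWith("/") ? key.substring(1) : key;
        return "s3://" + bucket + "/" + cleanKey;
    }

    public static void main(String[] args) {
        // Hypothetical bucket names: substitute the buckets you chose or created.
        String jarBucket = "my-jar-bucket";
        String outputBucket = "my-output-bucket";

        // Location of the uploaded job file.
        System.out.println(s3Uri(jarBucket, "wc1.jar"));
        // Hypothetical output prefix inside the second bucket.
        System.out.println(s3Uri(outputBucket, "wordcount-output/"));
    }
}
```

Keeping the JAR and the output in separate buckets, as the steps above suggest, means the job's results never mix with the job files themselves.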
Click on the Create a New Job Flow button and you'll see a familiar screen as shown in the following screenshot:
Previously, we used a sample application; running our own code requires different steps. First, select the Run your own application radio button.
In the Select a Job Type combo box, select Custom JAR.
Click on the Continue button and you'll see a new form, as shown in the following screenshot:
We now specify...