Time for action – running WordCount on a local Hadoop cluster
Now that we have generated the class files and collected them into a JAR file, we can run the application by performing the following steps:
Submit the new JAR file to Hadoop for execution.
$ hadoop jar wc1.jar WordCount1 test.txt output
If successful, you should see output very similar to what we obtained when we ran the Hadoop-provided sample WordCount in the previous chapter. Check the output file; it should be as follows:
$ hadoop fs -cat output/part-r-00000
This 1
yes 1
a 1
is 2
test 1
this 1
What just happened?
This is the first time we have used the hadoop jar command with our own code. It takes four arguments:
The name of the JAR file.
The name of the driver class within the JAR file.
The location, on HDFS, of the input file (a relative reference to the /user/hadoop home folder, in this case).
The desired location of the output folder (again, a relative path).
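To make the four arguments concrete, the following is a minimal sketch in plain Java of how the command line above breaks down and how a relative HDFS path maps onto a home folder. The class HadoopJarArgs and the helper resolveHdfsPath are hypothetical names used only for illustration; they are not part of the Hadoop API, and the user name hadoop is assumed from the home folder mentioned above.

```java
import java.util.Arrays;

public class HadoopJarArgs {

    // Hypothetical helper (not a Hadoop API): illustrates how a relative
    // HDFS path is interpreted against the user's home folder, /user/<name>.
    static String resolveHdfsPath(String path, String user) {
        if (path.startsWith("/")) {
            return path; // absolute paths are used as given
        }
        return "/user/" + user + "/" + path;
    }

    // Everything after the JAR file and driver class is handed to the driver.
    static String[] driverArgs(String[] commandLine) {
        return Arrays.copyOfRange(commandLine, 2, commandLine.length);
    }

    public static void main(String[] args) {
        // The arguments that follow "hadoop jar" in the command shown earlier.
        String[] commandLine = {"wc1.jar", "WordCount1", "test.txt", "output"};

        String jarFile = commandLine[0];      // 1. name of the JAR file
        String driverClass = commandLine[1];  // 2. driver class within the JAR
        String[] rest = driverArgs(commandLine);

        // Assumed user name; relative paths resolve under this user's home folder.
        String user = "hadoop";

        System.out.println("JAR file:     " + jarFile);
        System.out.println("Driver class: " + driverClass);
        System.out.println("Input path:   " + resolveHdfsPath(rest[0], user)); // 3.
        System.out.println("Output path:  " + resolveHdfsPath(rest[1], user)); // 4.
    }
}
```

In this sketch, the last two values are what the driver class would receive as its own arguments; an absolute path such as /data/test.txt would be passed through unchanged.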
Tip
The name of the driver class is only required if a main...