Time for action - the first run
Let's now perform the initial execution of this algorithm on our starting representation of the graph:
Put the previously created graph.txt file onto HDFS:
$ hadoop fs -mkdir graphin
$ hadoop fs -put graph.txt graphin/graph.txt
Compile the job and create the JAR file:
$ javac GraphPath.java
$ jar -cvf graph.jar *.class
Execute the MapReduce job:
$ hadoop jar graph.jar GraphPath graphin graphout1
Examine the output file:
$ hadoop fs -cat /home/user/hadoop/graphout1/part-r00000
1    2,3,4    0     D
2    1,4      1     C
3    1,5,6    1     C
4    1,2      1     C
5    3,6      -1    P
6    3,5      -1    P
7    6        -1    P
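To make the record layout easier to read, here is a minimal sketch of a parser for one output line. It assumes each record holds four whitespace-separated fields: the node ID, a comma-separated neighbor list, the distance from the start node (-1 when the node has not been reached), and a one-letter status (D, C, or P). The NodeRecord class and its parse method are hypothetical names used for illustration; they are not part of GraphPath.java.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: holds the four fields of one line of job output. */
final class NodeRecord {
    final int id;
    final List<Integer> neighbors;
    final int distance;  // -1 means the node has not been reached yet
    final char status;   // D = Done, C = Currently processing, P = Pending

    NodeRecord(int id, List<Integer> neighbors, int distance, char status) {
        this.id = id;
        this.neighbors = neighbors;
        this.distance = distance;
        this.status = status;
    }

    /** Parses a line such as "5<TAB>3,6<TAB>-1<TAB>P" (assumed layout). */
    static NodeRecord parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + line);
        }
        List<Integer> neighbors = new ArrayList<>();
        for (String n : fields[1].split(",")) {
            neighbors.add(Integer.parseInt(n));
        }
        return new NodeRecord(Integer.parseInt(fields[0]), neighbors,
                Integer.parseInt(fields[2]), fields[3].charAt(0));
    }

    public static void main(String[] args) {
        NodeRecord r = parse("5\t3,6\t-1\tP");
        // prints: 5 [3, 6] -1 P
        System.out.println(r.id + " " + r.neighbors + " " + r.distance + " " + r.status);
    }
}
```

Splitting on any run of whitespace rather than on a single tab keeps the sketch tolerant of records that were copied from a terminal.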
What just happened?
After putting the source file onto HDFS and creating the job JAR file, we executed the job in Hadoop. The output representation of the graph shows a few changes, as follows:
Node 1 is now marked as Done; its distance from itself is obviously 0
Nodes 2, 3, and 4 (the neighbors of node 1) are marked as Currently processing
All other nodes are Pending
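The state changes listed above can be reproduced with a small in-memory sketch of a single pass. This is not the MapReduce job itself, only a plain Java illustration of the same rule: every node marked Currently processing becomes Done, and each of its Pending neighbors becomes Currently processing with a distance one greater. The FirstPass class, its method names, and the assumed starting state (node 1 marked C with distance 0, all other nodes marked P with distance -1) are hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FirstPass {
    /** Mutable per-node state: neighbors, distance from node 1, and a D/C/P status. */
    static final class State {
        final int[] neighbors;
        int distance;
        char status;

        State(int distance, char status, int... neighbors) {
            this.distance = distance;
            this.status = status;
            this.neighbors = neighbors;
        }
    }

    /** The seven-node graph, in the assumed state before the first run. */
    static Map<Integer, State> sampleGraph() {
        Map<Integer, State> g = new LinkedHashMap<>();
        g.put(1, new State(0, 'C', 2, 3, 4));
        g.put(2, new State(-1, 'P', 1, 4));
        g.put(3, new State(-1, 'P', 1, 5, 6));
        g.put(4, new State(-1, 'P', 1, 2));
        g.put(5, new State(-1, 'P', 3, 6));
        g.put(6, new State(-1, 'P', 3, 5));
        g.put(7, new State(-1, 'P', 6));
        return g;
    }

    /** Applies one pass: expand every C node, then mark it Done. */
    static void step(Map<Integer, State> graph) {
        // Snapshot the frontier first so nodes promoted during this pass
        // are not expanded until the next pass.
        List<Integer> frontier = new ArrayList<>();
        for (Map.Entry<Integer, State> e : graph.entrySet()) {
            if (e.getValue().status == 'C') {
                frontier.add(e.getKey());
            }
        }
        for (int id : frontier) {
            State current = graph.get(id);
            for (int n : current.neighbors) {
                State neighbor = graph.get(n);
                if (neighbor.status == 'P') {
                    neighbor.status = 'C';
                    neighbor.distance = current.distance + 1;
                }
            }
            current.status = 'D';
        }
    }

    /** Renders the graph as "id=distanceStatus" pairs separated by spaces. */
    static String describe(Map<Integer, State> graph) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, State> e : graph.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.getKey()).append('=')
              .append(e.getValue().distance).append(e.getValue().status);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, State> graph = sampleGraph();
        step(graph);
        // prints: 1=0D 2=1C 3=1C 4=1C 5=-1P 6=-1P 7=-1P
        System.out.println(describe(graph));
    }
}
```

Calling step again on the same map would advance the frontier one hop further, which mirrors running the job again on the previous run's output.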
Our graph now looks like the following figure:
Given the algorithm, this...