Accessing Apache Hadoop from Karaf
In Hadoop, the core of a cluster is the distributed and replicated filesystem. We have HDFS running and can access it from our command-line window as a regular user. However, accessing it from an OSGi container proves slightly more complicated than just writing the Java components.
Hadoop requires us to provide configuration metadata for our cluster that can be looked up as file or classpath resources. In this recipe, we will simply copy the HDFS site-specific files we created earlier in the chapter to our src/main/resources folder.
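For reference, a minimal core-site.xml looks like the following. The NameNode host and port shown here are only examples; use the values from your own cluster setup earlier in the chapter:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- core-site.xml: tells the client where to find the NameNode.
     The host and port below are placeholders for your cluster's values. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```

The hdfs-site.xml and mapred-site.xml files follow the same `<configuration>`/`<property>` structure.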
We will also include the default metadata definitions in our resources by copying them from a dependency, and finally, we'll allow our bundle classloader to perform fully dynamic class loading. To sum up, we have to copy the core-site.xml, hdfs-site.xml, and mapred-site.xml files into our classpath; together, these files describe to our client how to access HDFS.
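The dynamic class loading mentioned above is declared in the bundle manifest. A sketch, assuming the Apache Felix maven-bundle-plugin builds the bundle (your project's plugin configuration may differ):

```xml
<!-- In pom.xml: DynamicImport-Package: * lets the bundle classloader
     resolve classes that Hadoop loads reflectively at runtime, instead of
     requiring every package to be listed in Import-Package. -->
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <extensions>true</extensions>
  <configuration>
    <instructions>
      <DynamicImport-Package>*</DynamicImport-Package>
    </instructions>
  </configuration>
</plugin>
```

A wildcard dynamic import is a blunt instrument, but it is a common workaround for libraries such as Hadoop that were not written with OSGi class loading in mind.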
As we get to the code, there is also a step...