In this chapter, we learned about the detailed architecture and uses of a few widely used Hadoop components, such as Apache Pig, Apache HBase, Apache Kafka, and Apache Flume. We covered examples of writing a custom UDF in Apache Pig; a custom source, sink, and interceptor in Apache Flume; a producer and consumer in Apache Kafka; and so on. We also focused on how to install and set up these components for practical use.
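As a quick recap of the Pig UDF pattern from this chapter, the following is a minimal sketch of a custom `EvalFunc`. The class name `UpperCase` and the field it processes are illustrative examples, not specifics from the chapter:

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// A minimal custom Pig UDF: upper-cases the string in the first field
// of the input tuple.
public class UpperCase extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        // Returning null for empty or null input is the usual Pig UDF
        // convention; Pig treats null as "no value" downstream.
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return ((String) input.get(0)).toUpperCase();
    }
}
```

Once compiled and packaged into a JAR, such a UDF is registered and invoked from a Pig script, for example with `REGISTER myudfs.jar;` followed by `B = FOREACH A GENERATE UpperCase(name);` (the JAR name and field are hypothetical).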
In the next chapter, we will look into some advanced topics in Big Data and cover some useful fundamental techniques and concepts, such as compression, file formats and serialization techniques, and some important pillars of data governance.