In this chapter, we studied the logical view of Hadoop in the cloud and what the logical architecture for Hadoop looks like there. We also learned about managing resources, which is a continuous process whether the infrastructure is on-premises or in the cloud. We were introduced to data pipelines and saw how cloud providers' tools can be combined to build a data pipeline for customers. Finally, we covered High Availability, which is a primary concern for the frameworks and applications available today, whether the application is deployed on-premises or in the cloud.
In the next chapter, we will study Hadoop Cluster Profiling.