Summary
This has been quite a whistle-stop tour around the considerations of running an operational Hadoop cluster. We didn't try to turn developers into administrators, but hopefully, the broader perspective will help you to help your operations staff. In particular, we covered the following topics:
How Hadoop is a natural fit for DevOps approaches as its multilayered complexity means it's not possible or desirable to have substantial knowledge gaps between development and operations staff
Cloudera Manager, and how it can be a great management and monitoring tool; it might cause integration problems though, if you have other enterprise tools, and it comes with a vendor lock-in risk
Ambari, the Apache open source alternative to Cloudera Manager, and how it is used in the Hortonworks distribution
How to think about selecting hardware for a physical Hadoop cluster, and how this naturally fits into the considerations of how the multiple workloads possible in the world of Hadoop 2 can peacefully...