In this section, we focus on setting up a Hadoop cluster. We also cover other important aspects of a cluster, such as sizing guidelines and setup instructions. A Hadoop cluster can be set up with Apache Ambari, which offers a much simpler, semi-automated, and less error-prone way to configure a cluster. However, the latest version of Ambari at the time of writing supports only older Hadoop versions, so to set up Hadoop 3.1 we must do so manually. By the time this book is out, you may be able to use a much simpler installation process. You can read about older Hadoop installations in the Ambari installation guide, available here.
Before you set up a Hadoop cluster, check its sizing so that you can plan better and avoid reinstalling because of an incorrectly estimated cluster size. Please refer to the...