Hadoop Cluster Deployment

Construct a modern Hadoop data platform effortlessly and gain insights into how to manage clusters efficiently

Product type Paperback
Published in Nov 2013
Publisher Packt
ISBN-13 9781783281718
Length 126 pages
Edition 1st Edition
Author: Danil Zburvisky

Choosing OS for the Hadoop cluster


Choosing an operating system for your future Hadoop cluster is a relatively simple task. Hadoop core and its ecosystem components are written in Java, with a few exceptions. Although Java code itself is cross-platform, Hadoop currently runs only on Linux-like systems. Many design decisions were made with Linux in mind, so the code surrounding the core Hadoop components, such as the start/stop scripts and the permissions model, depends on the Linux environment.

When it comes to Linux, Hadoop is largely indifferent to the specific distribution and runs well on Red Hat, CentOS, Debian, Ubuntu, SUSE, and Fedora. None of these distributions imposes special requirements for running Hadoop. In principle, nothing prevents Hadoop from working on another POSIX-style OS, such as Solaris or BSD, provided that all dependencies are resolved properly and all supporting shell scripts work. Still, most production installations of Hadoop run on Linux, and this is the OS we will focus on in the rest of the book. Specifically, the examples in this book use CentOS, since it, along with its twin Red Hat, is one of the popular choices for production systems.
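The platform guidance above can be turned into a simple pre-flight check to run on a candidate node. The following is only a minimal sketch: the function name `classify_platform`, the three category labels, and the grouping of platform names are assumptions made for illustration and are not part of Hadoop or of this book.

```python
import platform

# Assumption: these groupings only mirror the discussion above
# (Linux is the primary target; other POSIX-style systems may work
# if dependencies and shell scripts are sorted out).
PRIMARY_PLATFORMS = {"Linux"}
OTHER_POSIX_PLATFORMS = {"SunOS", "FreeBSD", "OpenBSD", "NetBSD"}


def classify_platform(system_name):
    """Classify an OS name as returned by platform.system()."""
    if system_name in PRIMARY_PLATFORMS:
        return "primary"
    if system_name in OTHER_POSIX_PLATFORMS:
        return "posix-untested"
    return "unsupported"


if __name__ == "__main__":
    # Report the category of the machine this script runs on.
    name = platform.system()
    print(name, "->", classify_platform(name))
```

Running the script on each prospective node gives a quick indication of whether the node falls into the mainstream Linux case or needs extra attention.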

Apache Hadoop provides source and binary archives, as well as RPM and DEB packages, for stable releases. Currently, the stable release is the 1.0 branch. Building Hadoop from source is still an option, but it is not recommended for most users, since it requires experience in assembling big Java-based projects and careful dependency resolution. Both the Cloudera and Hortonworks distributions provide an easy way to set up a repository on your servers and install all required packages from there.

Tip

There is no strict requirement to run the same operating system across all Hadoop nodes, but common sense suggests that the less the node configurations deviate from one another, the easier the cluster is to administer and manage.
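One way to keep an eye on such deviation is to compare the OS reported by each node against the most common one in the cluster. The sketch below assumes you have already collected an OS description string per host by some means (for example, from your inventory system); the function name `find_os_outliers`, the hostnames, and the version strings are hypothetical.

```python
from collections import Counter


def find_os_outliers(node_os):
    """Return (majority_os, outliers) for a mapping of hostname -> OS string.

    The majority OS is the most frequently reported one; on a tie, the
    one encountered first in the mapping wins. Outliers are the hosts
    whose OS differs from the majority.
    """
    if not node_os:
        return None, {}
    counts = Counter(node_os.values())
    majority_os, _ = counts.most_common(1)[0]
    outliers = {
        host: os_name
        for host, os_name in node_os.items()
        if os_name != majority_os
    }
    return majority_os, outliers


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = {
        "node01": "CentOS 6.4",
        "node02": "CentOS 6.4",
        "node03": "Ubuntu 12.04",
    }
    majority, outliers = find_os_outliers(inventory)
    print("Most common OS:", majority)
    for host, os_name in sorted(outliers.items()):
        print(host, "deviates:", os_name)
```

An empty outlier mapping means every node reports the same OS string, which is the easiest situation to administer.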
