Network design
In this recipe, we will look at network design for a Hadoop cluster and the points to consider when planning it.
Getting ready
Make sure that you have a running cluster with HDFS and YARN, and that the cluster has at least two nodes.
How to do it...
- Connect to the master1.cyrus.com Namenode and switch to the user hadoop.
- Execute the commands as follows to check for the link speed and other network option modes:
$ ethtool eth0
$ iftop
$ netstat -s
- Always have a separate network for Hadoop traffic by using VLANs.
- Ensure that DNS resolution works for both forward and reverse lookups.
- Run a caching-only DNS within the Hadoop network, which caches records for faster resolution.
- Consider NIC teaming or bonding for better performance.
- Use dedicated core switches and top-of-rack switches.
- Consider having static IPs per node in the cluster.
- Disable IPv6 for all nodes and just use IPv4.
- Increasing the size of the cluster will mean more connections and more data across nodes...
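
The DNS step in the list above can be checked with a short script. The following is a minimal sketch, not part of the original recipe: it assumes a Linux node where getent is available, and the function names resolve and check_host are hypothetical helpers introduced here for illustration. For each hostname given on the command line, it resolves the name to an address and then confirms that the address resolves back to the same name.

```shell
#!/usr/bin/env bash
# Sketch: confirm that forward and reverse lookups agree for each node.
# Assumption: getent is present (glibc-based Linux). The helper names
# below are illustrative and do not come from the recipe.

# resolve NAME_OR_IP: print the first "address name" pair the system
# resolver returns. getent follows the lookup order in nsswitch.conf,
# so /etc/hosts entries are consulted as well as DNS.
resolve() {
    getent hosts "$1" | head -n 1
}

# check_host FQDN: succeed only if FQDN -> address -> FQDN round-trips.
check_host() {
    local fqdn="$1" ip back
    ip=$(resolve "$fqdn" | awk '{print $1}')
    if [ -z "$ip" ]; then
        echo "FAIL $fqdn: no forward record"
        return 1
    fi
    back=$(resolve "$ip" | awk '{print $2}')
    if [ "$back" != "$fqdn" ]; then
        echo "FAIL $fqdn: $ip resolves back to '${back:-nothing}'"
        return 1
    fi
    echo "OK   $fqdn <-> $ip"
}

# Check every hostname passed as an argument; nothing runs without arguments.
for host in "$@"; do
    check_host "$host"
done
```

Saved under a name of your choice (for example, a hypothetical check_dns.sh), it could be run as bash check_dns.sh master1.cyrus.com followed by the other node names. Any FAIL line points to a node whose forward and reverse records disagree or are missing.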