Setting up a Hadoop cluster
In this case, assuming that you already have a single-node setup as explained in the previous sections, with ssh enabled, you just need to change all the slave configurations to point to the master. This can be achieved by first creating the slaves file in the $HADOOP_PREFIX/etc/hadoop folder on the master, listing the hostnames of all slave nodes. Similarly, on all slaves, you need the masters file in the $HADOOP_PREFIX/etc/hadoop folder to point to your master server's hostname.
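For example, a minimal slaves file simply lists one slave hostname per line. The hostnames here are placeholders; substitute the names of your own nodes:
slave-node-1
slave-node-2
slave-node-3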
Note
While adding new entries for the hostname, one must ensure that the firewall is disabled to allow remote nodes access to different ports. Alternatively, specific ports can be opened by modifying the Hadoop configuration files. Similarly, all the names of the nodes participating in the cluster should be resolvable through DNS (Domain Name System), or through the /etc/hosts entries of Linux.
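As an illustration, the /etc/hosts file on each node might contain entries such as the following; the IP addresses and hostnames are placeholders for your own network:
192.168.1.10 master-server
192.168.1.11 slave-node-1
192.168.1.12 slave-node-2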
Once this is ready, let us change the configuration files. Open core-site.xml, and add the following entry in it:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master-server:9000</value>
  </property>
</configuration>
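This fs.defaultFS entry must be present on every node so that the daemons know where to reach the namenode. As a sketch, assuming passwordless ssh is already set up, that $HADOOP_PREFIX resolves to the same path on all nodes, and using the placeholder hostnames from earlier, the file can be copied to each slave as follows:
$ scp $HADOOP_PREFIX/etc/hadoop/core-site.xml slave-node-1:$HADOOP_PREFIX/etc/hadoop/
$ scp $HADOOP_PREFIX/etc/hadoop/core-site.xml slave-node-2:$HADOOP_PREFIX/etc/hadoop/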
All other configuration is optional. Now, start the servers in the following order. First, you need to format your storage for the cluster; use the following command to do so:
$ $HADOOP_PREFIX/bin/hdfs namenode -format <Name of Cluster>
This formats the name node for a new cluster. Once the name node is formatted, the next step is to ensure that HDFS is up and connected to each node. Start the namenode first:
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh start namenode
Similarly, the datanode can be started on all the slaves:
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh start datanode
Keep track of the log files in the $HADOOP_PREFIX/logs folder in order to see that there are no exceptions. Once HDFS is available, the namenode can be accessed through its web interface, which listens on port 50070 of the master server by default.
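To verify quickly that the daemons have come up, you can also run the jps tool that ships with the JDK on each node:
$ jps
On the master, the resulting process list should include NameNode; on each slave, it should include DataNode.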
The next step is to start YARN and its associated applications. First, start the ResourceManager (RM):
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start resourcemanager
Each node must run an instance of the node manager. To run the node manager, use the following command:
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start nodemanager
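As a quick sanity check, the standard YARN command-line client can list the nodes that have successfully registered with the RM:
$ $HADOOP_YARN_HOME/bin/yarn node -list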
Optionally, you can also run the Job History Server on the Hadoop cluster by using the following command:
$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver
Once all instances are up, you can see the status of the cluster on the web through the RM UI, which listens on port 8088 of the master server by default. The complete setup can be tested by running the simple wordcount example.
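As a sketch of such a test, the wordcount example bundled with the Hadoop distribution can be run as shown here; the input file name is a placeholder, and the version in the examples jar path depends on your release:
$ $HADOOP_PREFIX/bin/hdfs dfs -mkdir -p /input
$ $HADOOP_PREFIX/bin/hdfs dfs -put localfile.txt /input/
$ $HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /input /output
$ $HADOOP_PREFIX/bin/hdfs dfs -cat /output/part-r-00000
The first two commands stage a local text file into HDFS, the third runs the MapReduce job across the cluster, and the last one prints the word counts from the reducer output.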
This way, your cluster is set up and ready to run with multiple nodes. For advanced setup instructions, visit the Apache Hadoop website at http://hadoop.apache.org.