Installing and configuring Elasticsearch
I have used Elasticsearch Version 2.0.0 in this book; you can choose to install another version if you wish. You just need to replace 2.0.0 with the version number you want in the commands that follow. You need an administrative account to perform the installations and configurations.
Installing Elasticsearch on Ubuntu through Debian package
Let's get started with installing Elasticsearch on Ubuntu Linux. The steps will be the same for all Ubuntu versions:
- Download the Elasticsearch Version 2.0.0 Debian package:
wget https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-2.0.0.deb
- Install Elasticsearch, as follows:
sudo dpkg -i elasticsearch-2.0.0.deb
- To run Elasticsearch as a service (to ensure Elasticsearch starts automatically when the system is booted), do the following:
sudo update-rc.d elasticsearch defaults 95 10
Installing Elasticsearch on CentOS through the RPM package
Follow these steps to install Elasticsearch on CentOS machines. If you are using any other Red Hat Linux distribution, you can use the same commands:
- Download the Elasticsearch Version 2.0.0 RPM package:
wget https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-2.0.0.rpm
- Install Elasticsearch, using this command:
sudo rpm -i elasticsearch-2.0.0.rpm
- To run Elasticsearch as a service (to ensure Elasticsearch starts automatically when the system is booted), use the following:
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch.service
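The two procedures above differ only in the package format and in the commands that register the service, so they can be summarized in one dry-run helper. The sketch below only prints the commands from this section for a given distribution ID and version; it does not run them. The function name es_install_plan and the mapping from ID values (as found in the ID field of /etc/os-release) to package formats are assumptions made for illustration, and the download URL pattern is only confirmed here for Version 2.0.0.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the install commands from this
# section for a distribution ID and an optional version (default 2.0.0).
es_install_plan() {
    distro=$1
    version=${2:-2.0.0}
    # URL pattern taken from the 2.0.0 downloads above; other versions are assumed
    base="https://download.elastic.co/elasticsearch/elasticsearch"
    case $distro in
        ubuntu|debian)
            echo "wget $base/elasticsearch-$version.deb"
            echo "sudo dpkg -i elasticsearch-$version.deb"
            echo "sudo update-rc.d elasticsearch defaults 95 10"
            ;;
        centos|rhel)
            echo "wget $base/elasticsearch-$version.rpm"
            echo "sudo rpm -i elasticsearch-$version.rpm"
            echo "sudo systemctl daemon-reload"
            echo "sudo systemctl enable elasticsearch.service"
            ;;
        *)
            echo "unsupported distribution: $distro" >&2
            return 1
            ;;
    esac
}
```

Printing the plan first makes it easy to review the commands before pasting them into a terminal. On most current distributions, the ID can be read with . /etc/os-release and then passed in as es_install_plan "$ID".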
Understanding the Elasticsearch installation directory layout
The following table shows the directory layout of Elasticsearch that is created after installation. These directories have some minor differences in paths, depending upon the Linux distribution you are using.
Description | Path on Debian/Ubuntu | Path on RHEL/CentOS |
---|---|---|
Elasticsearch home directory | | |
Elasticsearch and Lucene jar files | | |
Contains plugins | | |
The locations of the binary scripts that are used to start an ES node and download plugins | | |
Contains the Elasticsearch configuration files: ( | | |
Contains the data files of the index/shard allocated on that node | | |
The startup script for Elasticsearch (contains environment variables including HEAP SIZE and file descriptors) | | Or |
Contains the log files of Elasticsearch. | | |
During installation, a user and a group named elasticsearch are created by default. Elasticsearch does not start automatically after installation; this prevents a new node from accidentally connecting to an already running node that has the same cluster name.
Tip
It is recommended to change the cluster name before starting Elasticsearch for the first time.
Configuring basic parameters
- Open the elasticsearch.yml file, which contains most of the Elasticsearch configuration options:
sudo vim /etc/elasticsearch/elasticsearch.yml
- Now, edit the following parameters:
  - cluster.name: The name of your cluster
  - node.name: The name of the node
  - path.data: The path where the data for Elasticsearch will be stored
Note
Similar to path.data, we can change path.logs and path.plugins as well. Make sure all these parameter values are inside double quotes.
- After saving the elasticsearch.yml file, start Elasticsearch:
sudo service elasticsearch start
Elasticsearch will start on two ports, as follows:
- 9200: This is used to create HTTP connections
- 9300: This is used for TCP connections from Java clients and for communication between the nodes inside a cluster
Tip
Do not forget to uncomment the lines you have edited. Please note that if you are using a new data path instead of the default one, then you first need to change the owner and the group of that data path to the user, elasticsearch.
The command to change the directory ownership to elasticsearch is as follows:
sudo chown -R elasticsearch:elasticsearch data_directory_path
- Run the following command to check whether Elasticsearch has been started properly:
sudo service elasticsearch status
If the output of the preceding command shows that elasticsearch is not running, there is most likely a configuration issue. Open the log file to see what is causing the error.
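The edits to elasticsearch.yml described in the steps above can also be scripted. The following is a minimal sketch, not something that ships with Elasticsearch: the helper name set_es_option is hypothetical, and it assumes that each setting sits on its own top-level line, optionally commented out with a leading #. It uncomments the line if needed and writes the value inside double quotes; if the key is not present, it appends a new line.

```shell
#!/bin/sh
# Hypothetical helper: set KEY to a double-quoted VALUE in a YAML file.
# Usage: set_es_option FILE KEY VALUE
set_es_option() {
    file=$1
    key=$2
    value=$3
    if grep -Eq "^#? *${key}:" "$file"; then
        # Rewrite every matching line (commented or not) with the new value.
        # "|" is the sed delimiter so that paths containing "/" work.
        sed -E "s|^#? *${key}:.*|${key}: \"${value}\"|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        # Key not found: append it at the end of the file.
        printf '%s: "%s"\n' "$key" "$value" >> "$file"
    fi
}
```

Run with root privileges, set_es_option /etc/elasticsearch/elasticsearch.yml cluster.name es_book would rewrite the cluster.name line, with es_book standing in for your own cluster name. Values that contain | or & would need escaping before they reach sed, and a new data path still needs its ownership changed as described in the preceding tip.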
The list of possible issues that might prevent Elasticsearch from starting is:
- A Java issue, as discussed previously
- Indentation issues in the elasticsearch.yml file
- Less than 1 GB of RAM is free to be used by Elasticsearch
- The ownership of the data directory path is not changed to elasticsearch
- Something is already running on port 9200 or 9300
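Two of the items in this list, free memory and occupied ports, can be checked from the shell before digging through the log. The following sketch is one possible way to do it; the function names are hypothetical, and it assumes a Linux system where the ss utility is available and /proc/meminfo reports a MemAvailable field (older kernels do not). Both helpers read their input from stdin, so they can be tried on saved output as well.

```shell
#!/bin/sh
# Hypothetical helper: succeed if PORT appears as a local listening port
# in `ss -ltn`-style output read from stdin (the header line is skipped).
port_in_listing() {
    awk -v port="$1" '
        NR > 1 {
            # Column 4 is "Local Address:Port"; the port follows the last colon
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit found ? 0 : 1 }'
}

# Hypothetical helper: print MemAvailable in whole megabytes from
# /proc/meminfo-style input read from stdin.
mem_available_mb() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }'
}

# Example use on a live system (left commented out in this sketch):
#   ss -ltn | port_in_listing 9200 && echo "port 9200 is already in use"
#   [ "$(mem_available_mb < /proc/meminfo)" -ge 1024 ] || echo "less than 1 GB available"
```

A port that is reported as in use by Elasticsearch itself is expected once the node is running, so these checks are mainly useful when the service refuses to start.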
Adding another node to the cluster
Adding another node to a cluster is very simple. You just need to follow all the installation steps on another system to install a new instance of Elasticsearch. However, keep the following in mind:
- In the elasticsearch.yml file, cluster.name is set to the same value on both nodes
- Both systems are reachable from each other over the network
- No firewall rule is blocking the Elasticsearch ports
- The Elasticsearch and Java versions are the same on both nodes
You can optionally set the network.host parameter to the IP address of the system that you want Elasticsearch to bind to and that the other nodes will use to communicate with it.
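One way to confirm that the second node has actually joined is to look at the number_of_nodes field returned by the cluster health API. The helper below is a small sketch with a hypothetical name; it extracts that field with sed rather than a JSON parser, which is good enough for a quick check but is not a general JSON solution.

```shell
#!/bin/sh
# Hypothetical helper: print the number_of_nodes value from
# cluster-health JSON read on stdin (compact or pretty-printed).
nodes_in_health() {
    sed -n 's/.*"number_of_nodes" *: *\([0-9][0-9]*\).*/\1/p'
}

# Example use against a running node (left commented out in this sketch):
#   curl -s 'localhost:9200/_cluster/health' | nodes_in_health
```

If the command still prints 1 after the second node has started, the checklist above (cluster name, network reachability, firewall rules, and versions) is the place to start looking.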
Installing Elasticsearch plugins
Plugins provide extra functionalities in a customized manner. They can be used to query, monitor, and manage tasks. Thanks to the wide Elasticsearch community, there are several easy-to-use plugins available. In this book, I will be discussing some of them.
The Elasticsearch plugins come in two flavors:
- Site plugins: These are the plugins that have a site (web app) in them and do not contain any Java-related content. After installation, they are moved to the site directory and can be accessed using es_ip:port/_plugin/plugin_name.
- Java plugins: These mainly contain .jar files and are used to extend the functionalities of Elasticsearch. An example is the Carrot2 plugin, which is used for text-clustering purposes.
Elasticsearch ships with a plugin script that is located in the /usr/share/elasticsearch/bin directory, and any plugin can be installed using this script in the following format:
bin/plugin install plugin_url
Tip
Once the plugin is installed, you need to restart that node to make it active. In the following image, you can see the different plugins installed inside the Elasticsearch node. Plugins need to be installed separately on each node of the cluster.
The following is the layout of the plugin directory of Elasticsearch:
Checking for installed plugins
You can check the log of your node, which shows a line like the following at startup:
[2015-09-06 14:16:02,606][INFO ][plugins ] [Matt Murdock] loaded [clustering-carrot2, marvel], sites [marvel, carrot2, head]
Alternatively, you can use the following command:
curl -XGET 'localhost:9200/_nodes/plugins?pretty'
Another option is to use the following URL in your browser:
http://localhost:9200/_nodes/plugins
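If you prefer the log-based check, the plugin names can be pulled out of the startup line with a short filter. This is only a sketch: the function name plugins_from_log is hypothetical, and it assumes the log line has exactly the loaded [...], sites [...] shape shown above, which may differ in other Elasticsearch versions.

```shell
#!/bin/sh
# Hypothetical helper: read node log lines on stdin and print one plugin
# name per line. KIND is "java" (the "loaded [...]" list) or "site"
# (the "sites [...]" list).
plugins_from_log() {
    case $1 in
        java) group=1 ;;
        site) group=2 ;;
        *)
            echo "usage: plugins_from_log java|site" >&2
            return 2
            ;;
    esac
    # Capture both bracketed lists, keep the requested one, then split on commas
    sed -n "s/.*loaded \[\([^]]*\)], sites \[\([^]]*\)].*/\\${group}/p" |
        tr -d ' ' | tr ',' '\n'
}
```

Piping your node's log file into plugins_from_log site lists the site plugins found on each matching startup line, so a log that spans several restarts produces repeated names.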
Installing the Head plugin for Elasticsearch
The Head plugin is a web front end for the Elasticsearch cluster that is very easy to use. This plugin offers various features, such as graphical representations of shards and the cluster state, easy query creation, and downloading query-based data in the CSV format.
The following is the command to install the Head plugin:
sudo /usr/share/elasticsearch/bin/plugin install mobz/elasticsearch-head
Restart the Elasticsearch node with the following command to load the plugin:
sudo service elasticsearch restart
Once Elasticsearch is restarted, open the browser and type the following URL to access it through the Head plugin:
http://localhost:9200/_plugin/head
Note
More information about the Head plugin can be found here: https://github.com/mobz/elasticsearch-head
Installing Sense for Elasticsearch
Sense is an awesome tool to query Elasticsearch. You can add it as an extension to the latest version of the Chrome, Safari, or Firefox browser.
Now that Elasticsearch is installed and running on your system, and you have also installed the plugins, you are ready to create your first index and perform some basic operations.