Using the native protocol
Elasticsearch provides a native protocol that is used mainly for low-level communication between nodes, but it is also very useful for fast importing of huge data blocks. This protocol is available only for JVM languages and is commonly used in Java, Groovy, and Scala.
Getting ready
You need a working Elasticsearch cluster. The standard port for the native protocol is 9300.
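Before writing any client code, it can help to confirm that the transport port is reachable at all. The following is a minimal sketch that uses only the Java standard library; the class name TransportPortCheck, the host 127.0.0.1, and the one-second timeout are assumptions made for illustration and are not part of any Elasticsearch API. A successful TCP connection only shows that something is listening on the port, not that it is an Elasticsearch node.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Elasticsearch: checks plain TCP reachability.
class TransportPortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unknown host
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: a local node listening on the default native protocol port
        boolean up = isReachable("127.0.0.1", 9300, 1000);
        System.out.println("Transport port reachable: " + up);
    }
}
```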
How to do it...
The steps required to use the native protocol in a Java environment are as follows (we'll discuss them in detail in Chapter 14, Java Integration):
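Before starting, we must be sure that Maven loads the Elasticsearch JARs by adding the following lines to pom.xml (in Elasticsearch 5.x, the transport client classes are packaged in the separate transport artifact):
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>5.0.0</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>5.0.0</version>
</dependency>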
With the Elasticsearch JARs on the classpath, creating a Java client is quite easy:
import java.net.InetAddress;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
// PreBuiltTransportClient ships in the org.elasticsearch.client:transport artifact
import org.elasticsearch.transport.client.PreBuiltTransportClient;
...
// we define new settings; sniffing allows the client to autodetect other nodes
Settings settings = Settings.builder()
    .put("client.transport.sniff", true).build();
// a client is created with the settings and one initial node address
// (InetAddress.getByName can throw UnknownHostException)
Client client = new PreBuiltTransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress(
        InetAddress.getByName("127.0.0.1"), 9300));
How it works...
To initialize a native client, some settings are required to configure it properly. The important ones are:
cluster.name: This is the name of the cluster
client.transport.sniff: This allows the client to sniff the rest of the cluster topology and add the discovered nodes to its list of machines to use
With these settings, it's possible to initialize a new client by giving an IP address and port (default 9300).
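When several initial nodes are listed, it is convenient to accept either a bare host or a host:port pair and fall back to 9300 when the port is omitted. The sketch below illustrates that convention with the Java standard library only; the class SeedNodes and its parse method are hypothetical names, not Elasticsearch API, and IPv6 literals are not handled. The resulting addresses are left unresolved, so no DNS lookup happens while parsing.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Elasticsearch: parses "host" or "host:port" entries.
class SeedNodes {

    // Default port of the native protocol, as stated in this recipe
    static final int DEFAULT_TRANSPORT_PORT = 9300;

    static List<InetSocketAddress> parse(String... nodes) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String node : nodes) {
            int colon = node.lastIndexOf(':');
            String host = colon < 0 ? node : node.substring(0, colon);
            int port = colon < 0
                    ? DEFAULT_TRANSPORT_PORT
                    : Integer.parseInt(node.substring(colon + 1));
            // createUnresolved avoids a DNS lookup at parse time
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    public static void main(String[] args) {
        // Both addresses are illustrative assumptions
        for (InetSocketAddress address : parse("127.0.0.1", "10.0.0.2:9301")) {
            System.out.println(address.getHostString() + " -> " + address.getPort());
        }
    }
}
```

Each parsed host and port can then be passed to the client as one initial address.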
There's more...
This is the internal protocol used in Elasticsearch: it's the fastest protocol available to talk with Elasticsearch.
The native protocol is an optimized binary one and works only for JVM languages. To use this protocol, you need to include elasticsearch.jar in your JVM project. Because it depends on the Elasticsearch implementation, it must be the same version as the Elasticsearch cluster.
Note
Every time you update Elasticsearch, you need to update the elasticsearch.jar on which your project depends, and if there are internal API changes, you need to update your code.
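One way to catch a version mismatch early is to fail fast at application startup. The following sketch assumes that the application already knows both version strings (for example, from its build configuration and from the cluster administrator); the class VersionGuard and its method are hypothetical and do not query Elasticsearch themselves.

```java
// Hypothetical helper, not part of Elasticsearch: enforces matching version strings.
class VersionGuard {

    // Throws IllegalStateException when the client JAR and cluster versions differ.
    static void requireSameVersion(String clientVersion, String clusterVersion) {
        if (!clientVersion.trim().equals(clusterVersion.trim())) {
            throw new IllegalStateException(
                    "Client JAR version " + clientVersion
                    + " does not match cluster version " + clusterVersion);
        }
    }

    public static void main(String[] args) {
        // Both values are illustrative assumptions
        requireSameVersion("5.0.0", "5.0.0");
        System.out.println("Client and cluster versions match");
    }
}
```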
To use this protocol, you also need to study the internals of Elasticsearch, so it's not as easy to use as the HTTP protocol.
The native protocol is very useful for massive data imports. However, as Elasticsearch is mainly designed as a REST HTTP server, the native protocol lacks support for everything that is not standard in the Elasticsearch core, such as plugin entry points. Using this protocol, you cannot easily call entry points provided by external plugins.
Note
The native protocol seems easier to integrate in a Java/JVM project, but because it follows the fast release cycles of Elasticsearch, its API can change often, even in minor release upgrades, and break your code.
See also
The native protocol is the most used in the Java world and it will be discussed in depth in Chapter 14, Java Integration, Chapter 15, Scala Integration, and Chapter 17, Plugin Development.
Further details on the Elasticsearch Java API are available on the Elasticsearch site at https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/index.html.