Batch indexing to speed up your indexing process
In Chapter 1, Getting Started with Elasticsearch Cluster, we saw how to index a single document into Elasticsearch. Doing so required opening an HTTP connection, sending the document, and closing the connection. We did not handle most of that ourselves because we used the curl command, but that is what happened in the background. Sending documents one by one is inefficient, so it is time to look at a more convenient and efficient way to index a large number of documents.
Preparing data for bulk indexing
Elasticsearch allows us to merge many requests into one package, which can be sent as a single request. What's more, we are not limited to a single type of request in the so-called bulk; we can mix different types of operations together, which include:
- Adding or replacing the existing documents in the index (index)
- Removing documents from the index...
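To make the idea of mixing operations in one package more concrete, here is a minimal sketch of how a bulk body could be assembled. It assumes the newline-delimited JSON layout used by the Elasticsearch bulk API: one metadata line naming the operation, followed by the document itself for operations that carry one. The helper build_bulk_body, the library index, and the document fields are hypothetical names chosen for illustration, and older Elasticsearch releases may also expect a document type in the metadata.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into a newline-delimited body.

    Hypothetical helper for illustration only; it builds the request text
    and does not send anything to a cluster.
    """
    lines = []
    for action, metadata, source in operations:
        # First line of every operation: which action to run and on what.
        lines.append(json.dumps({action: metadata}))
        # Operations such as index carry the document on the next line;
        # delete has no document, so nothing more is written for it.
        if source is not None:
            lines.append(json.dumps(source))
    # The body must end with a newline character.
    return "\n".join(lines) + "\n"


# Assumed sample data: one document to add or replace, one to remove.
operations = [
    ("index", {"_index": "library", "_id": "1"}, {"title": "Example book"}),
    ("delete", {"_index": "library", "_id": "2"}, None),
]

body = build_bulk_body(operations)
print(body, end="")
```

The resulting text could be saved to a file and sent in a single HTTP request, so both operations travel over one connection instead of two.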