Using the ingest attachment plugin
In Elasticsearch versions prior to 5.x, it was easy to make a cluster unresponsive by using the attachment mapper. Extracting metadata from a document is a very CPU-intensive operation, and if you are ingesting a lot of documents, the whole cluster ends up under heavy load.
To prevent this scenario, Elasticsearch introduced the ingest node. An ingest node can be put under very heavy load without causing problems for the rest of the Elasticsearch cluster.
The attachment processor allows us to use the document extraction capabilities of Tika in an ingest node.
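As a rough illustration of what the attachment processor works with, the following Python sketch builds the two JSON bodies involved: an ingest pipeline definition containing a single attachment processor, and a document whose file content is Base64-encoded. The pipeline ID `attachment`, the field name `data`, and the helper function names are assumptions chosen for this example, not values taken from the recipe. The sketch only prints the bodies; it does not contact a cluster.

```python
import base64
import json

# Hypothetical names chosen for this sketch; they are not from the recipe.
PIPELINE_ID = "attachment"
SOURCE_FIELD = "data"


def build_pipeline(field=SOURCE_FIELD):
    """Return an ingest pipeline body with a single attachment processor."""
    return {
        "description": "Extract attachment information",
        "processors": [{"attachment": {"field": field}}],
    }


def build_document(raw_bytes, field=SOURCE_FIELD):
    """Return a document body carrying the file content as Base64 text."""
    return {field: base64.b64encode(raw_bytes).decode("ascii")}


if __name__ == "__main__":
    # Body that would be sent with PUT _ingest/pipeline/<PIPELINE_ID>
    print(json.dumps(build_pipeline(), indent=2))
    # Body that would be indexed with the ?pipeline=<PIPELINE_ID> parameter
    print(json.dumps(build_document(b"Hello from Tika")))
```

On a cluster where the plugin is installed, the first body would typically be registered through the `_ingest/pipeline` endpoint and the second indexed with the `pipeline` query parameter, so that the extraction runs on an ingest node rather than on the data nodes.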
Getting ready
You need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
How to do it...
To be able to use the ingest attachment processor, perform the following steps:
You need to install it as a plugin via:
bin/elasticsearch-plugin...