The common terms query
When a user searches some text with a query, not all the terms in the query have the same importance. The most common terms are generally removed before query execution to reduce the noise they generate. These terms are called stopwords, and they are generally articles, conjunctions, and other common words of the language (for example, the, a, so, and, or, and so on).
The list of stopwords depends on the language and is independent of your documents. Lucene provides ways to dynamically compute the stopwords list based on your indexed documents at query time via the common terms query.
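The idea of deriving stopwords from the indexed documents can be illustrated with a small sketch. It is not how Lucene implements the query internally; it only shows, under simplifying assumptions (whitespace tokenization, a tiny in-memory document list), how query terms can be split into rare and common groups by document frequency. The helper names, the sample documents, and the field name parsedtext are hypothetical; the common query type and its cutoff_frequency parameter are the ones used by the common terms query.

```python
import json
from collections import Counter


def split_by_frequency(docs, query, cutoff_frequency):
    """Split query terms into (rare, common) lists by document frequency.

    Hypothetical helper: a simplified model of the idea, not Lucene's code.
    """
    # Count in how many documents each term appears (document frequency).
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc.lower().split()))

    rare, common = [], []
    for term in query.lower().split():
        # Fraction of documents that contain this term.
        ratio = doc_freq[term] / len(docs)
        # Terms above the cutoff are treated as dynamic stopwords.
        (common if ratio > cutoff_frequency else rare).append(term)
    return rare, common


def common_terms_body(field, text, cutoff_frequency):
    """Build a request body for a common terms query on the given field."""
    return {
        "query": {
            "common": {
                field: {"query": text, "cutoff_frequency": cutoff_frequency}
            }
        }
    }


# Sample documents invented for illustration only.
docs = [
    "the quick brown fox",
    "the lazy dog",
    "a nice guy and the dog",
    "the end",
]

rare, common = split_by_frequency(docs, "the nice guy", cutoff_frequency=0.5)
print("rare:", rare)
print("common:", common)

# "parsedtext" is an assumed field name; replace it with a field in your index.
print(json.dumps(common_terms_body("parsedtext", "the nice guy", 0.001), indent=2))
```

In this sketch, the term "the" appears in every sample document and so falls into the common group, while "nice" and "guy" remain the terms that drive matching. The printed JSON is the kind of body that can be sent to the search endpoint of an index.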
Getting ready
You need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
To correctly execute the following commands, you need an index populated with the chapter_05/populate_query.sh script available...