Indexing data with Apache Lucene
In this recipe, we will demonstrate how to index a large amount of data with Apache Lucene. Indexing is the first step toward searching data quickly. Lucene uses an inverted full-text index: it takes all the documents, splits them into words or tokens, and records for each token the documents that contain it. When a term is searched, Lucene therefore already knows exactly which documents to look in.
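To make the idea of an inverted index concrete, the following is a minimal sketch in plain Java. It does not use Lucene and is not how Lucene is implemented internally. The class name `InvertedIndexSketch` and its `add` and `search` methods are hypothetical names chosen for illustration, and the tokenizer simply lowercases the text and splits it on non-word characters.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: a toy inverted index, not Lucene's actual data structure.
public class InvertedIndexSketch {

    // Maps each token to the sorted set of IDs of the documents that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Splits the text into lowercase tokens and records the document ID under each token.
    public void add(int docId, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Looks the term up directly in the map; no document text is scanned at query time.
    public Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), new TreeSet<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(0, "Lucene builds an inverted index");
        index.add(1, "An index makes search fast");
        index.add(2, "Tokens are extracted from documents");
        System.out.println(index.search("index"));  // prints [0, 1]
        System.out.println(index.search("tokens")); // prints [2]
    }
}
```

The work of splitting documents into tokens happens once, at indexing time, so a search is reduced to a single map lookup. A real Lucene index adds much more on top of this, such as analyzers, term positions, and on-disk storage.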
Getting ready
The following are the steps to be implemented:
- To download Apache Lucene, go to http://lucene.apache.org/core/downloads.html and click on the Download button. At the time of writing, the latest version of Lucene was 6.4.1. Clicking the Download button takes you to the mirror websites that host the distribution.
- Choose any appropriate mirror for downloading. Once you click a mirror website, it will take you to the distribution directory. Download the lucene-6.4.1.zip file onto your system.
- Once you download it, unzip the...