Executing a scroll/scan search
Pagination with a standard query works well when the matched documents rarely change. With live data, documents that are added, removed, or re-ranked between page requests shift the result window, so pages can repeat or skip hits. To work around this problem, ElasticSearch provides an extra query parameter called scroll.
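To make the problem concrete, here is a minimal sketch in plain Java. It does not use the ElasticSearch client: the "index" is an ordinary in-memory list, and the class and method names (LivePaginationSketch, page, paginateLive, paginateSnapshot) are hypothetical names chosen for illustration. It compares from/size paging over data that changes between requests with paging over a copy frozen at the first request, which is roughly what a scroll provides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LivePaginationSketch {

    // Plain from/size pagination: reads whatever the list contains at call time.
    static List<String> page(List<String> results, int from, int size) {
        int start = Math.min(from, results.size());
        int end = Math.min(from + size, results.size());
        return new ArrayList<>(results.subList(start, end));
    }

    // Pages through the live list while a new document arrives between two requests.
    static List<String> paginateLive(List<String> index) {
        List<String> seen = new ArrayList<>(page(index, 0, 2));
        index.add(0, "doc-new");            // a write lands between page 1 and page 2
        seen.addAll(page(index, 2, 2));
        return seen;
    }

    // Scroll-like behaviour (simulated): freeze the matching ids once,
    // then page through the frozen copy, ignoring later writes.
    static List<String> paginateSnapshot(List<String> index) {
        List<String> snapshot = new ArrayList<>(index);
        List<String> seen = new ArrayList<>(page(snapshot, 0, 2));
        index.add(0, "doc-new");            // same write, but the snapshot is unaffected
        seen.addAll(page(snapshot, 2, 2));
        return seen;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("doc-a", "doc-b", "doc-c", "doc-d");
        // prints [doc-a, doc-b, doc-b, doc-c]: doc-b is repeated, doc-d is never seen
        System.out.println(paginateLive(new ArrayList<>(docs)));
        // prints [doc-a, doc-b, doc-c, doc-d]
        System.out.println(paginateSnapshot(new ArrayList<>(docs)));
    }
}
```

In the live case, the insertion shifts every document one position down, so the second page starts on a hit that was already returned. The snapshot variant trades freshness for consistency: it never sees doc-new, but it returns each original hit exactly once.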
Getting ready
You will need a working ElasticSearch cluster and a working copy of Maven.
The code of this recipe is in chapter_10/nativeclient in the code bundle, present on Packt's website and on GitHub (https://github.com/aparo/elasticsearch-cookbook-second-edition). The referred class is ScrollScanQueryExample.
How to do it...
The search is done in the same way as in the previous recipe. The main difference is the setScroll timeout, which tells ElasticSearch to keep the IDs of the query's results in memory for a defined period.
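The idea of keeping result IDs in memory for a limited time can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not ElasticSearch's implementation: ScrollContextSketch, open, and next are hypothetical names, the clock is passed in explicitly to keep the example deterministic, and the sketch assumes that each fetch renews the timeout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ScrollContextSketch {

    // Hypothetical record of one open scroll: frozen ids, read position, expiry time.
    private static final class Context {
        final List<String> ids;
        int position = 0;
        long expiresAt;

        Context(List<String> ids, long expiresAt) {
            this.ids = new ArrayList<>(ids);
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Context> contexts = new HashMap<>();
    private int nextId = 0;

    // Opens a scroll: stores the matching ids and keeps them for keepAliveMillis.
    public String open(List<String> matchingIds, long keepAliveMillis, long now) {
        String scrollId = "scroll-" + (nextId++);
        contexts.put(scrollId, new Context(matchingIds, now + keepAliveMillis));
        return scrollId;
    }

    // Returns the next batch and renews the timeout (an assumption of this sketch).
    // An unknown or expired scroll is discarded and reported as an error.
    public List<String> next(String scrollId, int size, long keepAliveMillis, long now) {
        Context ctx = contexts.get(scrollId);
        if (ctx == null || now > ctx.expiresAt) {
            contexts.remove(scrollId);
            throw new IllegalStateException("scroll expired or unknown: " + scrollId);
        }
        int end = Math.min(ctx.position + size, ctx.ids.size());
        List<String> batch = new ArrayList<>(ctx.ids.subList(ctx.position, end));
        ctx.position = end;
        ctx.expiresAt = now + keepAliveMillis;
        return batch;
    }

    public static void main(String[] args) {
        ScrollContextSketch server = new ScrollContextSketch();
        String id = server.open(Arrays.asList("1", "2", "3"), 1000, 0);
        System.out.println(server.next(id, 2, 1000, 500));   // prints [1, 2]
        System.out.println(server.next(id, 2, 1000, 1200));  // prints [3]
        System.out.println(server.next(id, 2, 1000, 1500));  // prints []
    }
}
```

An empty batch signals that the scroll is exhausted, which is the natural stop condition for a fetch loop. If the client waits longer than the timeout between two fetches, the stored IDs are dropped and the scroll has to be started again.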
We can change the code of the previous recipe to use scroll in the following way:
import org.elasticsearch.action.search.SearchResponse; import org.elasticsearch.action.search...