What this book covers
Chapter 1, Revisiting Elasticsearch and the Changes, guides you through how Apache Lucene works and will introduce you to Elasticsearch 5.x, describing the basic concepts and showing you the important changes in Elasticsearch from version 1.x to 5.x.
Chapter 2, The Improved Query DSL, describes the new default scoring algorithm, BM25, and how it would be better than the previous TF-IDF algorithm. In addition to that, it explains various Elasticsearch features, such as query rewriting, query templates, changes in query modules, and various queries to choose from in a given scenario.
Chapter 3, Beyond Full Text Search, describes queries about rescoring, multimatching control, and function score queries. In addition to that, this chapter covers the scripting module of Elasticsearch.
Chapter 4, Data Modeling and Analytics, discusses different approaches of data modeling in Elasticsearch and also covers how to handle relationships among documents using parent-child and nested data types, along with focusing on practical considerations. It further discusses the aggregation module of Elasticsearch for the purpose of data analytics.
Chapter 5, Improving the User Search Experience, focuses on topics for improving the user search experience using suggesters, which allows you to correct user-query spelling mistakes and build efficient autocomplete mechanisms. In addition to that, it covers how to improve query relevance and how to use synonyms to search.
Chapter 6, The Index Distribution Architecture, covers techniques for choosing the right amount of shards and replicas, how routing works, how shard allocation works, and how to alter its behavior. In addition to that, we discuss what query execution preference is and how it allows us to choose where the queries are going to be executed.
Chapter 7, Low-Level Index Control, describes how to alter Apache Lucene scoring and how to choose an alternative scoring algorithm. It also covers NRT searching and indexing and transaction log usage and allows you to understand segment merging and tune it for your use case along with the details about removed merge policies inside Elasticsearch 5.x. At the end of the chapter, you will also find information about IO throttling and Elasticsearch caching.
Chapter 8, Elasticsearch Administration, focuses on concepts related to administering Elasticsearch. It describes what the discovery, gateway, and recovery modules are, how to configure them, and why you should bother. We also describe what the cat API is and how to back up and restore your data to different cloud services (such as Amazon AWS and Microsoft Azure).
Chapter 9, Data Transformation and Federated Search, covers the latest feature of Elasticsearch 5, that is ingest node, which allows us to preprocess data into the Elasticsearch cluster itself before indexing. It further tells us about how federated search works with different clusters using tribe nodes.
Chapter 10, Improving Performance, discusses Elasticsearch performance improvements under different loads and what the right way of scaling production clusters is, along with covering the insights into garbage collections and hot threads issues and how to deal with them. It further covers query profiling and query benchmarking. In the end, it explains the general Elasticsearch cluster tuning advice under high query rate scenarios versus high indexing throughput scenarios.
Chapter 11, Developing Elasticsearch Plugins, covers Elasticsearch plugins' development by showing and describing in depth how to write your own REST action and language analysis plugin.
Chapter 12, Introducing Elastic Stack 5.0, introduces you to the components of Elastic Stack 5.0, covering Elasticsearch, Logstash, Kibana, and Beats.