You're reading from Elasticsearch 8.x Cookbook Over 180 recipes to perform fast, scalable, and reliable searches for your enterprise

Product type Paperback

Published in May 2022

Publisher Packt

ISBN-13 9781801079815

Length 750 pages

Edition 5th Edition

Languages

Java

Tools

Elasticsearch

Concepts

Enterprise Search

Author (1):

Alberto Paro

View More author details

Table of Contents (20) Chapters

Preface

1. Chapter 1: Getting Started

2. Chapter 2: Managing Mappings FREE CHAPTER

3. Chapter 3: Basic Operations

4. Chapter 4: Exploring Search Capabilities

5. Chapter 5: Text and Numeric Queries

6. Chapter 6: Relationships and Geo Queries

7. Chapter 7: Aggregations

8. Chapter 8: Scripting in Elasticsearch

9. Chapter 9: Managing Clusters

10. Chapter 10: Backups and Restoring Data

11. Chapter 11: User Interfaces

12. Chapter 12: Using the Ingest Module

13. Chapter 13: Java Integration

14. Chapter 14: Scala Integration

15. Chapter 15: Python Integration

16. Chapter 16: Plugin Development

17. Chapter 17: Big Data Integration

18. Chapter 18: X-Pack

19. Other Books You May Enjoy

Managing nested objects

There is a special type of embedded object called a nested object. This resolves a problem related to Lucene's indexing architecture, in which all the fields of embedded objects are viewed as a single object (technically speaking, they are flattened). During the search, in Lucene, it is not possible to distinguish between values and different embedded objects in the same multi-valued array.

If we consider the previous order example, it's not possible to distinguish an item's name and its quantity with the same query since Lucene puts them in the same Lucene document object. We need to index them in different documents and then join them. This entire trip is managed by nested objects and nested queries.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

A nested object is defined as a standard object with the nested type.

Regarding the example in the Mapping an object recipe, we can change the type from object to nested, as follows:

PUT test/_mapping
{ "properties" : {
      "id" : {"type" : "keyword"},
      "date" : {"type" : "date"},
      "customer_id" : {"type" : "keyword"},
      "sent" : {"type" : "boolean"},
      "item" : {"type" : "nested",
        "properties" : {
            "name" : {"type" : "keyword"},
            "quantity" : {"type" : "long"},
            "price" : {"type" : "double"},
            "vat" : {"type" : "double"}
} } } }

How it works…

When a document is indexed, if an embedded object has been marked as nested, it's extracted by the original document before being indexed in a new external document and saved in a special index position near the parent document.

In the preceding example, we reused the mapping from the Mapping an object recipe, but we changed the type of the item from object to nested. No other action must be taken to convert an embedded object into a nested one.

The nested objects are special Lucene documents that are saved in the same block of data as its parent – this approach allows for fast joining with the parent document.

Nested objects are not searchable with standard queries, only with nested ones. They are not shown in standard query results.

The lives of nested objects are related to their parents: deleting/updating a parent automatically deletes/updates all the nested children. Changing the parent means Elasticsearch will do the following:

Mark old documents as deleted.
Mark all nested documents as deleted.
Index the new document version.
Index all nested documents.

There's more...

Sometimes, you must propagate information about the nested objects to their parent or root objects. This is mainly to build simpler queries about the parents (such as terms queries without using nested ones). To achieve this, two special properties of nested objects must be used:

include_in_parent: This makes it possible to automatically add the nested fields to the immediate parent.
include_in_root: This adds the nested object fields to the root object.

These settings add data redundancy, but they reduce the complexity of some queries, thus improving performance.