Elasticsearch 8.x Cookbook

Elasticsearch 8.x Cookbook: Over 180 recipes to perform fast, scalable, and reliable searches for your enterprise, Fifth Edition

Chapter 2: Managing Mappings

Mapping is a primary concept in Elasticsearch that defines how the search engine should process a document and its fields to be effectively used in search and aggregations.

Search engines perform the following two main operations:

  • Indexing: This action is used to receive a document, process it, and store it in an index.
  • Searching: This action is used to retrieve the data from the index based on a query.

These two operations are strictly connected; an error in the indexing step leads to unwanted or missing search results.

By default, Elasticsearch manages mappings at the index level. When indexing, if a mapping is not provided, a default one is created by inferring the structure from the JSON fields that the document is composed of. This new mapping is then automatically propagated to all the cluster nodes: it becomes part of the cluster's state.

The default type mapping has sensible default values, but when you want to change their behavior or customize several other aspects of indexing (object to special fields, storing, ignoring, completion, and so on), you need to provide a new mapping definition.

In this chapter, we'll look at all the possible mapping field types that document mappings are composed of.

In this chapter, we will cover the following recipes:

  • Using explicit mapping creation
  • Mapping base types
  • Mapping arrays
  • Mapping an object
  • Mapping a document
  • Using dynamic templates in document mapping
  • Managing nested objects
  • Managing a child document with a join field
  • Adding a field with multiple mappings
  • Mapping a GeoPoint field
  • Mapping a GeoShape field
  • Mapping an IP field
  • Mapping an Alias field
  • Mapping a Percolator field
  • Mapping the Rank Feature and Feature Vector fields
  • Mapping the Search as you type field
  • Using the Range Field type
  • Using the Flattened field type
  • Using the Point and Shape field types
  • Using the Dense Vector field type
  • Using the Histogram field type
  • Adding metadata to a mapping
  • Specifying different analyzers
  • Using index components and templates

Technical requirements

To follow and test the commands shown in this chapter, you must have a working Elasticsearch cluster installed on your system, as described in Chapter 1, Getting Started.

To simplify how you manage and execute these commands, I suggest that you install Kibana so that you have a more advanced environment to execute Elasticsearch queries.

Using explicit mapping creation

If we consider the index as a database in the SQL world, mapping is similar to the CREATE TABLE definition.

Elasticsearch can understand the structure of the document that you are indexing (reflection) and create the mapping definition automatically. This book calls this explicit mapping creation; the Elasticsearch documentation refers to the same feature as dynamic mapping.

Getting ready

To execute the code in this recipe, you will need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute these commands, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar platforms. I suggest using the Kibana console to provide code completion and better character escaping for Elasticsearch.

To understand the examples and code in this recipe, basic knowledge of JSON is required.

How to do it…

You can have Elasticsearch create a mapping automatically by adding a new document to an index. To do this, perform the following steps:

  1. Create an index, as shown in the following code:
    PUT test

The output will be as follows:

{ "acknowledged" : true, "shards_acknowledged" : true,
 "index" : "test" }
  2. Put a document in the index, as shown in the following code:
    PUT test/_doc/1
    {"name":"Paul", "age":35}

The output will be as follows:

{
  "_index" : "test", "_id" : "1", "_version" : 1,
  "result" : "created",
  "_shards" : {"total" : 2, "successful" : 1, "failed" : 0 },
  "_seq_no" : 0,  "_primary_term" : 1
}
  3. Get the mapping with the following code:
    GET test/_mapping
  4. The mapping that's auto-created by Elasticsearch should look as follows:
    {
      "test" : {
        "mappings" : {
          "properties" : {
            "age" : { "type" : "long" },
            "name" : {
              "type" : "text",
              "fields" : {
                "keyword" : {"type" : "keyword", "ignore_above" : 256 }
    } } } } } }
  5. To delete the index, you can use the following command:
    DELETE test

The output will be as follows:

{ "acknowledged" : true }

How it works…

The first command (Step 1) creates an index where we can store documents and, if required, configure the mappings later.

The second command (Step 2) inserts a document in the index (we'll learn how to create the index in the Creating an index recipe of Chapter 3, Basic Operations, and record indexing in the Indexing a document recipe of Chapter 3, Basic Operations).

For every field of the document, Elasticsearch checks the current mapping and processes the field as follows:

  • If the field is already present in the mapping and the value of the field is valid (it matches the correct type), Elasticsearch does not need to change the current mappings.
  • If the field is already present in the mapping but the value of the field is of a different type, it tries to convert the value to the mapped type (for example, the string "35" for a long field). If the types are not compatible, it throws an exception, and the indexing process fails.
  • If the field is not present, it tries to auto-detect the type of field. It updates the mappings with a new field mapping. (In the case of a null value, it skips the mapping update until it encounters a concrete type.)
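
The auto-detection in the last rule can be sketched in a few lines of Python. This is only a simplified model of the inference, not Elasticsearch code: the infer_field and infer_mapping names are hypothetical, and features such as date detection and numeric detection are deliberately left out.

```python
def infer_field(value):
    """Return a mapping fragment for one JSON value, or None to skip it."""
    if value is None:
        return None  # null: wait until a concrete value is seen
    if isinstance(value, bool):  # test bool before int: bool is an int subclass
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        # Strings become a text field with a keyword subfield.
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    if isinstance(value, dict):
        return infer_mapping(value)  # embedded object: recurse
    if isinstance(value, list):
        # An array takes the type of its first non-null element.
        for item in value:
            fragment = infer_field(item)
            if fragment is not None:
                return fragment
        return None
    raise TypeError(f"unsupported JSON value: {value!r}")


def infer_mapping(doc):
    """Build a 'properties' mapping for a whole JSON document."""
    properties = {}
    for name, value in doc.items():
        fragment = infer_field(value)
        if fragment is not None:
            properties[name] = fragment
    return {"properties": properties}


mapping = infer_mapping({"name": "Paul", "age": 35})
```

For the document used in this recipe, mapping contains the same properties as the response of GET test/_mapping: a long for age and a text field with a keyword subfield for name.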

There's more…

In Elasticsearch, every document has an identifier, called an ID, that is unique within a single index and is stored in the special _id field of the document.

The _id field can be provided at index time or can be assigned automatically by Elasticsearch if it is missing.

When a mapping is created or changed, Elasticsearch automatically propagates the change to all the nodes in the cluster so that all the shards process documents in the same way.

In Elasticsearch 7.x, there was a default type (_doc): it was removed in Elasticsearch 8.x and above.

See also

Please refer to the following recipes in Chapter 3, Basic Operations:

  • The Creating an index recipe, which is about putting new mappings in an index while it's being created
  • The Putting a mapping in an index recipe, which is about extending a mapping in an index

Mapping base types

Using explicit mapping makes it possible to quickly start ingesting data using a schemaless approach without being concerned about field types. However, to achieve better results and performance in indexing, you need to define a mapping manually.

Fine-tuning mapping brings some advantages, such as the following:

  • Reducing the index size on the disk (disabling functionalities for custom fields)
  • Indexing only interesting fields (general speed up)
  • Precooking data for fast search or real-time analytics (such as aggregations)
  • Correctly defining whether a field must be analyzed in multiple tokens or considered as a single token
  • Defining mapping types such as geo point, suggester, vectors, and so on

Elasticsearch allows you to use base fields with a wide range of configurations.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

To execute this recipe's examples, you will need to create an index named test, where you can put mappings, as explained in the Using explicit mapping creation recipe.

How to do it...

Let's use a semi real-world example of a shop order for our eBay-like shop:

  1. First, we must define an order:

Figure 2.1 – Example of an order

  2. Our order record must be converted into an Elasticsearch mapping definition, as follows:
    PUT test/_mapping
    {  "properties" : {
          "id" : {"type" : "keyword"},
          "date" : {"type" : "date"},
          "customer_id" : {"type" : "keyword"},
          "sent" : {"type" : "boolean"},
          "name" : {"type" : "keyword"},
          "quantity" : {"type" : "integer"},
          "price" : {"type" : "double"},
          "vat" : {"type" : "double", "index": false}
    } }

Now, the mapping is ready to be put in the index. We will learn how to do this in the Putting a mapping in an index recipe of Chapter 3, Basic Operations.
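
If you prefer a plain HTTP client to the Kibana console, the same request can be prepared with Python's standard library. The following is a minimal sketch: the http://localhost:9200 address and the build_put_mapping_request helper are assumptions made for illustration, and a secured 8.x cluster will typically also need HTTPS and credentials. The request is only built here, not sent.

```python
import json
from urllib.request import Request

# Assumption: adjust the address (and add authentication) for your cluster.
ES_URL = "http://localhost:9200"

# The order mapping from this recipe, as a Python dictionary.
order_mapping = {
    "properties": {
        "id": {"type": "keyword"},
        "date": {"type": "date"},
        "customer_id": {"type": "keyword"},
        "sent": {"type": "boolean"},
        "name": {"type": "keyword"},
        "quantity": {"type": "integer"},
        "price": {"type": "double"},
        "vat": {"type": "double", "index": False},
    }
}


def build_put_mapping_request(index, mapping, base_url=ES_URL):
    """Prepare (but do not send) the equivalent of PUT <index>/_mapping."""
    return Request(
        url=f"{base_url}/{index}/_mapping",
        data=json.dumps(mapping).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_put_mapping_request("test", order_mapping)
# To send it against a reachable cluster: urllib.request.urlopen(request)
```

Keeping the mapping as a dictionary makes it easy to generate or validate it in code before it reaches the cluster.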

How it works...

Field types must be mapped to one of the Elasticsearch base types, and options on how the field must be indexed need to be added.

The following table is a reference for the mapping types:

Figure 2.2 – Base type mapping

Depending on the data type, it's possible to give explicit directives to Elasticsearch when you're processing the field for better management. The most used options are as follows:

  • store (default false): This marks the field to be stored in a separate index fragment for fast retrieval. Storing a field consumes disk space but reduces computation if you need to extract it from a document (for example, in scripting and aggregations). The possible values for this option are true and false. Stored field values are always returned as an array for consistency.

The stored fields are faster than others in aggregations.

  • index: This defines whether or not the field should be indexed. The possible values for this parameter are true and false (the default is true). Fields that are not indexed are not searchable.
  • null_value: This defines a default value if the field is null.
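  • boost: This is used to change the importance of a field (the default is 1.0). Note that index-time boosting in mappings was deprecated in Elasticsearch 5.0 and removed in 8.0, so on 8.x clusters, boosts must be applied at query time instead.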

boost works on a term level only, so it's mainly used in term, terms, and match queries.

  • search_analyzer: This defines an analyzer to be used during the search. If it's not defined, the analyzer of the parent object is used (the default is null).
  • analyzer: This sets the default analyzer to be used (the default is null).
  • norms: This controls the Lucene norms, which are used to score queries better. If the field is only used for filtering, it's a best practice to disable them to reduce resource usage (the default is true for text fields and false for keyword fields).
  • copy_to: This allows you to copy the content of a field to another one to achieve functionality similar to the _all field of older Elasticsearch versions.
  • ignore_above: This allows you to skip indexing a string if it's longer than the given number of characters. This is useful for processing fields for exact filtering, aggregations, and sorting. It also prevents a single term token from becoming too big and prevents errors due to the Lucene term's byte-length limit of 32,766. The maximum suggested value is 8191 (https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-above.html).
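
The suggested maximum for ignore_above follows from simple arithmetic: a UTF-8 character can take up to 4 bytes, so 32,766 / 4 gives 8191 characters. The following sketch shows the calculation and a simplified model of the filtering. The indexable_terms helper is hypothetical, and Python's len() is used here as an approximation of the character count that Elasticsearch applies.

```python
LUCENE_MAX_TERM_BYTES = 32766   # Lucene's byte-length limit for a single term
MAX_UTF8_BYTES_PER_CHAR = 4     # worst case for one UTF-8 encoded character

# Largest character count that can never exceed the Lucene limit.
safe_ignore_above = LUCENE_MAX_TERM_BYTES // MAX_UTF8_BYTES_PER_CHAR


def indexable_terms(values, ignore_above):
    """Keep only the strings that are not longer than ignore_above."""
    return [value for value in values if len(value) <= ignore_above]


kept = indexable_terms(["abc", "abcdef"], ignore_above=5)
```

With ignore_above set to 5, only "abc" would be indexed in this model; the longer value is skipped rather than rejected.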

There's more...

From Elasticsearch version 6.x onward, as shown in the Using explicit mapping creation recipe, the type that is inferred for a string is a multifield mapping:

  • The default processing is text. This mapping allows textual queries (such as term, match, and span queries). In the example provided in the Using explicit mapping creation recipe, this was name.
  • The keyword subfield is used for keyword mapping. This field can be used for exact term matching and aggregation and sorting. In the example provided in the Using explicit mapping creation recipe, the referred field was name.keyword.

Another important parameter, available only for text mapping, is term_vector (the vector of terms that compose a string). Please refer to the Lucene documentation for further details at https://lucene.apache.org/core/8_7_0/core/org/apache/lucene/index/Terms.html.

term_vector can accept the following values:

  • no: This is the default value; that is, the term vector is skipped.
  • yes: This stores the term vector.
  • with_offsets: This stores the term vector with the token offsets (the start and end positions in a block of characters).
  • with_positions: This stores the position of the token in the term vector.
  • with_positions_offsets: This stores all the term vector data.
  • with_positions_payloads: This stores the position and payloads of the token in the term vector.
  • with_positions_offsets_payloads: This stores all the term vector data with payloads.

Term vectors allow fast highlighting but consume disk space due to storing additional text information. It's a best practice to only activate it in fields that require highlighting, such as title or document content.

See also

You can refer to the following sources for further details on the concepts of this chapter:

  • The online documentation on Elasticsearch provides a full description of all the properties for the different mapping fields at https://www.elastic.co/guide/en/elasticsearch/reference/master/mapping-params.html.
  • The Specifying different analyzers recipe, later in this chapter, shows alternative analyzers to the standard one.
  • For newcomers who want to explore the concepts of tokenization, I would suggest reading the official Elasticsearch documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-tokenizers.html.

Mapping arrays

Array or multi-value fields are very common in data models (such as multiple phone numbers, addresses, names, aliases, and so on), but they're not natively supported in traditional SQL solutions.

In SQL, multi-value fields require you to create accessory tables that must be joined to gather all the values, leading to poor performance when the cardinality of the records is huge.

Elasticsearch, which works natively in JSON, provides support for multi-value fields transparently.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

To use an Array type in our mapping, perform the following steps:

  1. Every field is automatically managed as an array. For example, to store tags for a document, the mapping would be as follows:
    {  "properties" : {
          "name" : {"type" : "keyword"},
          "tag" : {"type" : "keyword", "store" : true},
          ...
    } }
  2. This mapping is valid for indexing both documents. The following is the code for document1:
    {"name": "document1", "tag": "awesome"}
  3. The following is the code for document2:
    {"name": "document2", "tag": ["cool", "awesome", "amazing"] }

How it works…

Elasticsearch transparently manages the array: there is no difference if you declare a single value or a multi-value due to its Lucene core nature.

Multi-values for fields are managed in Lucene, so you can add them to a document with the same field name. For people with a SQL background, this behavior may be quite strange, but this is a key point in the NoSQL world as it removes the need for join queries and for separate tables to manage multi-values. An array of embedded objects has the same behavior as simple fields.
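
The idea that a single value and an array are handled in the same way can be illustrated with a small Python model. This is only a sketch of the concept, not how Lucene stores data; the as_values and matches_term helpers are hypothetical names.

```python
def as_values(field_value):
    """Normalize a field to a list of values, whether it was sent
    as a single value, as an array, or not at all."""
    if field_value is None:
        return []
    if isinstance(field_value, list):
        return [value for value in field_value if value is not None]
    return [field_value]


def matches_term(doc, field, term):
    """A term matches if it equals any of the values of the field."""
    return term in as_values(doc.get(field))


document1 = {"name": "document1", "tag": "awesome"}
document2 = {"name": "document2", "tag": ["cool", "awesome", "amazing"]}

both_match = [matches_term(d, "tag", "awesome") for d in (document1, document2)]
```

In this model, a search for the awesome tag finds both documents, even though one was indexed with a single value and the other with an array.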

Mapping an object

The object type is one of the most common field aggregation structures in document databases.

An object is a base structure (analogous to a record in SQL): in JSON types, they are defined as key/value pairs inside the {} symbols.

Elasticsearch extends the traditional use of objects (which are flat in DBMS), thus allowing for recursive embedded objects.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. Again, I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

We can rewrite the mapping code from the previous recipe using an array of items:

PUT test/_mapping
{ "properties" : {
      "id" : {"type" : "keyword"},
      "date" : {"type" : "date"},
      "customer_id" : {"type" : "keyword", "store" : true},
      "sent" : {"type" : "boolean"},
      "item" : {
        "type" : "object",
        "properties" : {
          "name" : {"type" : "text"},
          "quantity" : {"type" : "integer"},
          "price" : {"type" : "double"},
          "vat" : {"type" : "double"}
} } } }

How it works…

Elasticsearch speaks native JSON, so every complex JSON structure can be mapped in it.

When Elasticsearch is parsing an object type, it tries to extract the fields and process them according to the defined mapping. If a field has no mapping, it learns the structure of the object using reflection.

The most important attributes of an object are as follows:

  • properties: This is a collection of fields or objects (we can consider them as columns in the SQL world).
  • enabled: This establishes whether or not the object should be processed. If it's set to false, the data contained in the object is not indexed and it cannot be searched (the default is true).
  • dynamic: This allows Elasticsearch to add new field names to the object using reflection on the values of the inserted data. If it's set to false, when you try to index an object containing a new field, that field is silently ignored: it's kept in the source of the document, but it's neither indexed nor added to the mapping. If it's set to strict, when a new field is present in the object, an error will be raised and the document will not be indexed. The dynamic parameter allows you to be safe about making changes to the document's structure (the default is true).

The most used attribute is properties, which allows you to map the fields of the object in Elasticsearch fields.

Disabling the indexing part of the document reduces the index size; however, the data cannot be searched. In other words, you end up with a smaller file on disk, but there is a cost in terms of functionality.
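
The three values of the dynamic attribute described in this recipe can be modeled with a short Python sketch. It only tracks field names and is not Elasticsearch code; accept_fields and StrictDynamicMappingError are hypothetical names chosen for illustration.

```python
class StrictDynamicMappingError(Exception):
    """Raised in this model when dynamic is 'strict' and a new field appears."""


def accept_fields(mapped_fields, doc, dynamic=True):
    """Return (fields_to_index, updated_mapped_fields) for a document."""
    if dynamic not in (True, False, "strict"):
        raise ValueError(f"unsupported dynamic value: {dynamic!r}")
    known = set(mapped_fields)
    unknown = [name for name in doc if name not in known]
    if dynamic == "strict" and unknown:
        # The whole document is rejected.
        raise StrictDynamicMappingError(f"unmapped fields: {unknown}")
    if dynamic is True:
        known.update(unknown)  # new fields are added to the mapping
    # With dynamic False, unknown fields are simply not indexed.
    to_index = {name: value for name, value in doc.items() if name in known}
    return to_index, known


mapped = {"name", "age"}
doc = {"name": "Paul", "city": "Rome"}
indexed_true, mapping_true = accept_fields(mapped, doc, dynamic=True)
indexed_false, mapping_false = accept_fields(mapped, doc, dynamic=False)
```

With dynamic set to true, city is indexed and added to the mapping; with false, the document is accepted, but city is not searchable; with strict, the same document raises an error.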

See also

Some special objects are described in the following recipes:

  • The Mapping a document recipe
  • The Managing a child document with a join field recipe
  • The Managing nested objects recipe

Mapping a document

The document mapping is also referred to as the root object. This has special parameters that control its behavior, and they are mainly used internally to do special processing, such as routing or time-to-live of documents.

In this recipe, we'll look at these special fields and learn how to use them.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

We can extend the preceding order example by adding some of the special fields, like so:

PUT test/_mapping
{ "_source": { "enabled": true },
    "_routing": { "required": true },
    "properties": {} }

How it works…

Every special field has parameters and value options, such as the following:

  • _id: This is the unique identifier of the document. In current Elasticsearch releases, it's always indexed so that ID-based lookups are fast, and it's not configurable in the mapping.
  • _index: This is a virtual field that exposes the name of the index that contains the document, which is useful when you query several indices at once. In current releases, it's always available and is not configurable in the mapping.
  • _source: This controls how the document's source is stored. Storing the source adds storage overhead, so it can be turned off with "enabled": false; however, features such as update and reindex need it, so it's usually better to leave it on (enabled=true is the default).
  • _routing: This defines the shard that will store the document. It supports additional parameters, such as required (true/false). This is used to force the presence of the routing value, raising an exception if it's not provided.

Controlling how to index and process a document is very important and allows you to resolve issues related to complex data types.

Every special field has parameters to set particular configurations, and some of their behaviors could change in different releases of Elasticsearch.

See also

Please refer to the Using dynamic templates in document mapping recipe in this chapter and the Putting a mapping in an index recipe of Chapter 3, Basic Operations, to learn more.

Using dynamic templates in document mapping

In the Using explicit mapping creation recipe, we saw how Elasticsearch can guess the field type using reflection. In this recipe, we'll see how we can help it improve its guessing capabilities via dynamic templates.

The dynamic template feature is very useful. For example, it may be useful in situations where you need to create several indices with similar types because it allows you to move the need to define mappings from coded initial routines to automatic index-document creation. Typical usage is to define types for Logstash log indices.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

We can extend the previous mapping by adding document-related settings, as follows:

PUT test/_mapping
{
    "dynamic_date_formats":["yyyy-MM-dd", "dd-MM-yyyy"],
    "date_detection": true,
    "numeric_detection": true,
    "dynamic_templates":[
      {"template1":{
        "match":"*",
        "match_mapping_type": "long",
        "mapping": {"type":"{dynamic_type}", "store": true}
      }}
    ],
    "properties" : {...}
}

How it works…

The root object (document) controls the behavior of its fields and all its children object fields. In document mapping, we can define the following:

  • date_detection: This allows you to extract a date from a string (true is the default).
  • dynamic_date_formats: This is a list of valid date formats. This is used if date_detection is active.
  • numeric_detection: This enables you to convert strings into numbers, if possible (false is the default).
  • dynamic_templates: This is a list of templates that are used to change the explicit mapping inference. If one of these templates is matched, the rules that have been defined in it are used to build the final mapping.

A dynamic template is composed of two parts: the matcher and the mapping.

To match a field to activate the template, you can use several types of matchers, such as the following:

  • match: This allows you to define a match on the field name. The expression is a standard GLOB pattern (http://en.wikipedia.org/wiki/Glob_(programming)).
  • unmatch: This allows you to define the expression to be used to exclude matches (optional).
  • match_mapping_type: This controls the types of the matched fields, as detected from the JSON value; for example, string, long, and so on (optional).
  • path_match: This allows you to match the dynamic template against the full dot notation of the field; for example, obj1.*.value (optional).
  • path_unmatch: This will do the opposite of path_match, excluding the matched fields (optional).
  • match_pattern: This allows you to switch the matchers to regex (regular expression); otherwise, the glob pattern match is used (optional).

The dynamic template mapping part is a standard one but can use special placeholders, such as the following:

  • {name}: This will be replaced with the actual dynamic field name.
  • {dynamic_type}: This will be replaced with the type of the matched field.

The order of the dynamic templates is very important; only the first one that is matched is executed. It is good practice to put the templates with stricter rules first, and then the others.
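
The first-match-wins rule and the placeholders can be sketched in Python as follows. This is a simplified model under stated assumptions: only match, unmatch, and match_mapping_type are handled, Python's fnmatchcase stands in for the glob matching of Elasticsearch, and the pick_template and render_mapping names are hypothetical.

```python
from fnmatch import fnmatchcase


def pick_template(templates, field_name, detected_type):
    """Return (name, rule) of the first template that matches, or None."""
    for entry in templates:
        name, rule = next(iter(entry.items()))
        if "match" in rule and not fnmatchcase(field_name, rule["match"]):
            continue
        if "unmatch" in rule and fnmatchcase(field_name, rule["unmatch"]):
            continue
        wanted_type = rule.get("match_mapping_type", "*")
        if wanted_type not in ("*", detected_type):
            continue
        return name, rule  # later templates are never evaluated
    return None


def render_mapping(rule, field_name, detected_type):
    """Fill the {name} and {dynamic_type} placeholders of a template mapping."""
    def substitute(value):
        if isinstance(value, str):
            return (value.replace("{name}", field_name)
                         .replace("{dynamic_type}", detected_type))
        if isinstance(value, dict):
            return {key: substitute(item) for key, item in value.items()}
        if isinstance(value, list):
            return [substitute(item) for item in value]
        return value
    return substitute(rule["mapping"])


templates = [
    # The stricter template comes first.
    {"template1": {"match": "*", "match_mapping_type": "long",
                   "mapping": {"type": "{dynamic_type}", "store": True}}},
    {"store_generic": {"match": "*", "mapping": {"store": True}}},
]

name, rule = pick_template(templates, "age", "long")
age_mapping = render_mapping(rule, "age", "long")
```

Here, a new long field such as age is handled by template1, while a new string field falls through to store_generic. Swapping the two templates would make store_generic catch every field, which is why the order matters.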

There's more...

Dynamic templates are very handy when you need to set a mapping configuration to all the fields. This can be done by adding a dynamic template, similar to this one:

"dynamic_templates" : [
  { "store_generic" : {
      "match" : "*", "mapping" : { "store" : true }
} } ]  

In this example, all the new fields, which will be added with explicit mapping, will be stored.

See also

  • You can find the default Elasticsearch behavior for creating a mapping in the Using explicit mapping creation recipe and the base way of defining a mapping in the Mapping a document recipe.
  • The glob pattern is available at http://en.wikipedia.org/wiki/Glob_pattern.

Managing nested objects

There is a special type of embedded object called a nested object. This resolves a problem related to Lucene's indexing architecture, in which all the fields of embedded objects are viewed as a single object (technically speaking, they are flattened). During the search, in Lucene, it is not possible to distinguish between values and different embedded objects in the same multi-valued array.

If we consider the previous order example, it's not possible to match an item's name together with its own quantity in the same query since Lucene puts all the items in the same Lucene document object. We need to index them in different documents and then join them. This whole process is managed by nested objects and nested queries.
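
The problem can be reproduced with plain Python data structures. The sketch below models an order with two items; the flatten, match_flattened, and match_nested helpers are hypothetical and only imitate the two behaviors described above, they are not Elasticsearch code.

```python
order = {
    "id": "1",
    "item": [
        {"name": "tshirt", "quantity": 10},
        {"name": "socks", "quantity": 5},
    ],
}


def flatten(objects):
    """Merge all objects into one multi-valued field per key,
    as happens with a standard embedded object."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat


def match_flattened(objects, **criteria):
    """Each criterion is checked against the merged values."""
    flat = flatten(objects)
    return all(value in flat.get(key, []) for key, value in criteria.items())


def match_nested(objects, **criteria):
    """All criteria must hold inside the same object."""
    return any(
        all(obj.get(key) == value for key, value in criteria.items())
        for obj in objects
    )


false_positive = match_flattened(order["item"], name="tshirt", quantity=5)
correct_answer = match_nested(order["item"], name="tshirt", quantity=5)
```

No item is a tshirt with a quantity of 5, yet the flattened model reports a match because tshirt and 5 both appear somewhere in the merged values. The nested model evaluates each item on its own and correctly finds nothing.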

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

A nested object is defined as a standard object with the nested type.

Regarding the example in the Mapping an object recipe, we can change the type from object to nested, as follows:

PUT test/_mapping
{ "properties" : {
      "id" : {"type" : "keyword"},
      "date" : {"type" : "date"},
      "customer_id" : {"type" : "keyword"},
      "sent" : {"type" : "boolean"},
      "item" : {"type" : "nested",
        "properties" : {
            "name" : {"type" : "keyword"},
            "quantity" : {"type" : "long"},
            "price" : {"type" : "double"},
            "vat" : {"type" : "double"}
} } } }

How it works…

When a document is indexed, if an embedded object has been marked as nested, it's extracted from the original document, indexed as a separate hidden document, and saved in a special index position near the parent document.

In the preceding example, we reused the mapping from the Mapping an object recipe, but we changed the type of the item from object to nested. No other action must be taken to convert an embedded object into a nested one.

Nested objects are special Lucene documents that are saved in the same block of data as their parent – this approach allows for fast joining with the parent document.

Nested objects are not searchable with standard queries, only with nested ones. They are not shown in standard query results.

The lives of nested objects are related to their parents: deleting/updating a parent automatically deletes/updates all the nested children. Changing the parent means Elasticsearch will do the following:

  • Mark old documents as deleted.
  • Mark all nested documents as deleted.
  • Index the new document version.
  • Index all nested documents.

There's more...

Sometimes, you must propagate information about the nested objects to their parent or root objects. This is mainly to build simpler queries about the parents (such as terms queries without using nested ones). To achieve this, two special properties of nested objects must be used:

  • include_in_parent: This makes it possible to automatically add the nested fields to the immediate parent.
  • include_in_root: This adds the nested object fields to the root object.

These settings add data redundancy, but they reduce the complexity of some queries, thus improving performance.

See also

  • Nested objects require a special query to search for them – this will be discussed in the Using nested queries recipe of Chapter 6, Relationships and Geo Queries.
  • The Managing a child document with a join field recipe shows another way to manage child/parent relationships between documents.

Managing a child document with a join field

In the previous recipe, we saw how it's possible to manage relationships between objects with the nested object type. The disadvantage of nested objects is their dependence on their parents. If you need to change the value of a nested object, you need to reindex the parent (this causes a potential performance overhead if the nested objects change too quickly). To solve this problem, Elasticsearch allows you to define child documents.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

In the following example, we have two related objects: an Order and an Item.

Their UML representation is as follows:

Figure 2.3 – UML example of an Order/Item relationship

The final mapping should merge the field definitions of both Order and Item, as well as use a special field (join_field, in this example) that holds the parent/child relationship.

To use join_field, follow these steps:

  1. First, we must define the mapping, as follows:
    PUT test1/_mapping
    { "properties": {
        "join_field": {
          "type": "join", "relations": { "order": "item" }
        },
        "id": { "type": "keyword" },
        "date": { "type": "date" },
        "customer_id": { "type": "keyword" },
        "sent": { "type": "boolean" },
        "name": { "type": "text" },
        "quantity": { "type": "integer" },
        "price": { "type": "double" },
        "vat": { "type": "double" }
    } }

The preceding mapping is very similar to the one in the previous recipe.

  2. If we want to store the joined records, we will need to save the parent first and then the children, like so:
    PUT test1/_doc/1?refresh
    { "id": "1", "date": "2018-11-16T20:07:45Z", "customer_id": "100", "sent": true, "join_field": "order" }
    PUT test1/_doc/c1?routing=1&refresh
    { "name": "tshirt", "quantity": 10, "price": 4.3, "vat": 8.5,
      "join_field": { "name": "item", "parent": "1" } }

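The child item requires special handling: we must pass the routing value of the parent (its ID, 1, in the preceding example) so that the child is stored on the same shard as its parent. Furthermore, in join_field, we must specify the relation name of the child (item) and the ID of its parent.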

How it works…

When several related document types share the same index, the mapping must be the union of the fields of all of them.

The relationship between objects must be defined in join_field.

A mapping can contain only a single join field; if you need several relationships, define them all in its relations object.
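
As a hedged illustration of several relationships in one join field, the sketch below declares two child types for order and a grandchild type under item. The index name test-multi-join and the payment and review relation names are hypothetical and do not appear in the recipe:

```console
# Hypothetical index: one join field that describes several relations
PUT test-multi-join
{
  "mappings": {
    "properties": {
      "join_field": {
        "type": "join",
        "relations": {
          "order": ["item", "payment"],
          "item": "review"
        }
      }
    }
  }
}
```

An array value lists multiple child types of the same parent, while a relation name that is reused as a key (item here) adds another level. Deep hierarchies are generally costly to query, so it is usually worth keeping the number of levels small.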

The child document must be indexed in the same shard as the parent, so an extra routing parameter must be passed when indexing it (we'll learn how to do this in the Indexing a document recipe in Chapter 3, Basic Operations).

Changing the values of a child document doesn't require reindexing the parent document. Consequently, child documents are fast to index, reindex (update), and delete.
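
The following sketch shows what such a change could look like. It assumes that the child document c1 was indexed with routing=1 into the index that carries the join mapping (test1 here); the new quantity value is arbitrary:

```console
# Partial update of the child; the parent document (ID 1) is not reindexed
POST test1/_update/c1?routing=1&refresh
{
  "doc": { "quantity": 12 }
}

# Reading the child back needs the same routing value
GET test1/_doc/c1?routing=1
```

The same routing value must be passed to every operation on the child, including deletes; otherwise, the request may be sent to a shard that doesn't hold the document.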

There's more...

In Elasticsearch, we have different ways to manage relationships between objects, as follows:

  • Embedding with type=object: This is implicitly managed by Elasticsearch, which considers the embedded object as part of the main document. It's fast, but you need to reindex the main document to change the value of the embedded object.
  • Nesting with type=nested: This allows you to accurately search and filter the parent by using nested queries on children. Everything works as it does for the embedded object, except for the query (you must use a nested query to search for them).
  • External child documents: Here, the children are separate documents, with a join_field property that binds them to the parent. They must be indexed in the same shard as the parent. The join with the parent is a bit slower than the nested one. This is because nested objects are stored in the same data block as the parent in the Lucene index and are loaded with it, whereas child documents require additional read operations.

Choosing how to model the relationship between objects depends on your application scenario.
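
To give a rough idea of how the third option is queried, the sketch below assumes the join mapping and the documents from this recipe are stored in the test1 index. The first request returns the orders that have at least one matching item, and the second returns the items whose order has been sent:

```console
# Parents (orders) that have at least one child item named "tshirt"
GET test1/_search
{
  "query": {
    "has_child": {
      "type": "item",
      "query": { "match": { "name": "tshirt" } }
    }
  }
}

# Children (items) whose parent order has been sent
GET test1/_search
{
  "query": {
    "has_parent": {
      "parent_type": "order",
      "query": { "term": { "sent": true } }
    }
  }
}
```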

Tip

There is also another approach, although it performs poorly on large datasets: decoupling the join relationship. You run the join in two steps in your application: first, collect the IDs of the children/other documents, and then search for those IDs in a field of their parent.
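
A minimal sketch of this two-step approach follows. The items-example and orders-example indices and the item_ids keyword field on the parent are assumptions made for illustration, and the IDs in the second request stand in for the _id values returned by the first one:

```console
# Step 1: collect the IDs of the matching child/related documents
GET items-example/_search
{
  "_source": false,
  "size": 100,
  "query": { "match": { "name": "tshirt" } }
}

# Step 2: find the parents whose (hypothetical) item_ids field contains those IDs
GET orders-example/_search
{
  "query": { "terms": { "item_ids": ["c1", "c7"] } }
}
```

The cost grows with the number of IDs carried from the first step to the second, which is why this pattern tends to degrade on large result sets.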

See also

Please refer to the Using the has_child query, Using the top_children query, and Using the has_parent query recipes of Chapter 6, Relationships and Geo Queries, for more details on child/parent queries.


Key benefits

  • Explore the capabilities of Elasticsearch 8.x with easy-to-follow recipes
  • Extend the Elasticsearch functionalities and learn how to deploy on Elastic Cloud
  • Deploy and manage simple Elasticsearch nodes as well as complex cluster topologies

Description

Elasticsearch is a Lucene-based distributed search engine at the heart of the Elastic Stack that allows you to index and search unstructured content with petabytes of data. With this updated fifth edition, you'll cover comprehensive recipes relating to what's new in Elasticsearch 8.x and see how to create and run complex queries and analytics. The recipes will guide you through performing index mapping, aggregation, working with queries, and scripting using Elasticsearch. You'll focus on numerous solutions and quick techniques for performing both common and uncommon tasks such as deploying Elasticsearch nodes, using the ingest module, working with X-Pack, and creating different visualizations. As you advance, you'll learn how to manage various clusters, restore data, and install Kibana to monitor a cluster and extend it using a variety of plugins. Furthermore, you'll understand how to integrate your Java, Scala, Python, and big data applications such as Apache Spark and Pig with Elasticsearch and create efficient data applications powered by enhanced functionalities and custom plugins. By the end of this Elasticsearch cookbook, you'll have gained in-depth knowledge of implementing the Elasticsearch architecture and be able to manage, search, and store data efficiently and effectively using Elasticsearch.

Who is this book for?

If you’re a software engineer, big data infrastructure engineer, or Elasticsearch developer, you'll find this Elasticsearch book useful. The book will also help data professionals working in e-commerce and FMCG industries who use Elastic for metrics evaluation and search analytics to gain deeper insights and make better business decisions. Prior experience with Elasticsearch will help you get the most out of this book.

What you will learn

  • Become well-versed with the capabilities of X-Pack
  • Optimize search results by executing analytics aggregations
  • Get to grips with using text and numeric queries as well as relationship and geo queries
  • Install Kibana to monitor clusters and extend it for plugins
  • Build complex queries by managing indices and documents
  • Monitor the performance of your cluster and nodes
  • Design advanced mapping to take full control of index steps
  • Integrate Elasticsearch in Java, Scala, Python, and big data applications

Product Details

Publication date: May 27, 2022
Length: 750 pages
Edition: 5th
Language: English
ISBN-13: 9781801079815
Vendor: Elastic

Table of Contents

19 Chapters
Chapter 1: Getting Started
Chapter 2: Managing Mappings
Chapter 3: Basic Operations
Chapter 4: Exploring Search Capabilities
Chapter 5: Text and Numeric Queries
Chapter 6: Relationships and Geo Queries
Chapter 7: Aggregations
Chapter 8: Scripting in Elasticsearch
Chapter 9: Managing Clusters
Chapter 10: Backups and Restoring Data
Chapter 11: User Interfaces
Chapter 12: Using the Ingest Module
Chapter 13: Java Integration
Chapter 14: Scala Integration
Chapter 15: Python Integration
Chapter 16: Plugin Development
Chapter 17: Big Data Integration
Chapter 18: X-Pack
Other Books You May Enjoy

Customer reviews

Average rating: 4 out of 5 (6 ratings)
5 star: 50%
4 star: 33.3%
3 star: 0%
2 star: 0%
1 star: 16.7%

Ronnie Watson, Jun 02, 2022
Rating: 5 out of 5
Reading this book gave me some insights on how I can configure Elasticsearch so that I am able to manage a clusters data or provision the stack for business use.
Amazon verified review

Rishav Rohit, Jul 06, 2022
Rating: 5 out of 5
I have been using Elasticsearch in various projects for sometime. By following this cookbook I was able to explore many exciting features of Elasticsearch like suggesting a correct query, parent-child queries, ingest pipelines, integration with Apache Spark, etc. I can already think of few use cases in my project which can be benefited with what I learnt from this book. I would strongly recommend this book to my colleagues working with Elasticsearch too.
Amazon verified review

Andrew Anderson, Jun 01, 2022
Rating: 5 out of 5
Lucky to have a month to dive into this practical guide before official release, and I couldn't be more surprised at the detail and variety of walkthroughs that are covered. Many best practices and operational examples are provided, covering from routine to advanced tasks involved with administration and optimization of a running production cluster, or clusters. The Kibana UX is standardizing, and though costly, there are pertinent screenshots and code examples of all recipes throughout! I will forever keep this book as a reference, learning tool, and refresher given whatever the state of a cluster I am working on. Thanks for the copy Packt!
Amazon verified review

Amazon Customer, Jun 13, 2022
Rating: 4 out of 5
I'm impressed with how the cookbook was put together, it explains the reader step by step how to setup, configure and optimize a running environment. I know each situation is different and someone can argue that a there's no 'recipe' for every situation. But this book gives you a very good understanding (through examples) of how things work and how you may apply them in your product. Long story short, I have been working with ES for the last 7 years and I this book gave me some insight that I highly appreciate.
Amazon verified review

Arunachalam Lakshmanan, Jul 01, 2022
Rating: 4 out of 5
As in many cookbooks, it gives a precise guide on what the feature is, how it can be done and how it works. "There's more.." section of each recipe is interesting. But there are some advanced topics like percolator could be handled better, as the original documentation of elasticsearch is not as good as other topics. It's a great reference guide for several elasticsearch features of 8.0. It's a new edition of the previous book. I was expecting a "what's new with 8.0" section
Amazon verified review

FAQs

What is the digital copy I get with my Print order?

When you buy any Print edition of our Books, you can redeem (for free) the eBook edition of the Print Book you’ve purchased. This gives you instant access to your book when you make an order via PDF, EPUB or our online Reader experience.

What is the delivery time and cost of print books?

Shipping Details

USA:

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-Bissau
  9. Iran
  10. Lebanon
  11. Libyan Arab Jamahiriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela
What is custom duty/charge?

Customs duties are charges levied on goods when they cross international borders. They are a tax imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order?

Orders shipped to the countries listed under EU27 will not bear customs charges; these are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea

For shipments to countries outside the EU27, customs duty or localized taxes may apply. They are charged by the recipient country, must be paid by the customer, and are not included in the shipping charges on the order.

How do I know my custom duty charges?

The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.

For example:

  • If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
  • Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.
How can I cancel my order?

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction id. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you then when you receive it, you can contact us at customercare@packt.com using the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except in the cases described in our Return Policy (that is, when Packt Publishing agrees to replace your printed book because it arrived damaged or with a material defect). Otherwise, Packt Publishing will not accept returns.

What is your returns and refunds policy?

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact Customer Relations Team on customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
  2. Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
  3. You will have a choice of replacement or refund of the problem items.(damaged, defective or incorrect)
  4. Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged, with book material defect, contact our Customer Relation Team on customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner which is on a print-on-demand basis.

What tax is charged?

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use?

You can pay with the following card types:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal