Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Elastic Stack 8.x Cookbook

You're reading from   Elastic Stack 8.x Cookbook Over 80 recipes to perform ingestion, search, visualization, and monitoring for actionable insights

Arrow left icon
Product type Paperback
Published in Jun 2024
Publisher Packt
ISBN-13 9781837634293
Length 688 pages
Edition 1st Edition
Arrow right icon
Authors (2):
Arrow left icon
Yazid Akadiri Yazid Akadiri
Author Profile Icon Yazid Akadiri
Yazid Akadiri
Huage Chen Huage Chen
Author Profile Icon Huage Chen
Huage Chen
Arrow right icon
View More author details
Toc

Table of Contents (16) Chapters Close

Preface 1. Chapter 1: Getting Started – Installing the Elastic Stack 2. Chapter 2: Ingesting General Content Data FREE CHAPTER 3. Chapter 3: Building Search Applications 4. Chapter 4: Timestamped Data Ingestion 5. Chapter 5: Transform Data 6. Chapter 6: Visualize and Explore Data 7. Chapter 7: Alerting and Anomaly Detection 8. Chapter 8: Advanced Data Analysis and Processing 9. Chapter 9: Vector Search and Generative AI Integration 10. Chapter 10: Elastic Observability Solution 11. Chapter 11: Managing Access Control 12. Chapter 12: Elastic Stack Operation 13. Chapter 13: Elastic Stack Monitoring 14. Index 15. Other Books You May Enjoy

Defining index mapping

In Elasticsearch, mapping refers to the process of defining the schema or structure of an index. It defines how documents and their fields are stored and indexed within Elasticsearch. Mapping allows you to specify the data type of each field, such as text, a keyword, a numeric character, and a date, and configure various properties for each field, including indexing options and analyzers. By defining a mapping, you provide Elasticsearch with crucial information about the data you intend to index, enabling it to efficiently store, search, and analyze the documents.

Mapping plays a critical role in delivering precise search results, efficient data storage, and effective handling of different data types within Elasticsearch.

When no mapping is predefined, Elasticsearch attempts to dynamically infer data types and create the mapping; this is what has occurred with our movie dataset thus far.

In this recipe, we will apply an explicit mapping to the movies index.

Getting ready

Make sure that you have completed the Updating data in Elasticsearch recipe.

All the command snippets for the Dev Tools in this recipe are available at https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter2/snippets.md#defining-index-mapping.

How to do it…

You can define mappings during index creation or update them in an existing index.

An important note on mappings

When updating the mapping of an existing index that already contains documents, the mapping of those existing documents will not change. The new mapping will only apply to documents indexed afterward.

In this recipe, you are going to create a new index with explicit mapping, and then re-index the data from the movie index, assuming that you have already created that index beforehand:

  1. Head to Kibana | Dev Tools.
  2. Next, let’s check the mapping of the previously created index with the following command:
    GET /movies/_mapping

    You will get the results shown in the following figure. Note that, for readability, some fields were collapsed.

Figure 2.13 – The default mapping on the movies index

Figure 2.13 – The default mapping on the movies index

Let’s review what’s going on in the figure:

a. Examining the current mapping of the genre field reveals a multi-field mapping technique. This approach allows a single field to be indexed in several ways to serve different purposes. For example, the genre field is indexed both as a text field for full-text search and as a keyword field for sorting and aggregation. This dual approach to mapping the genre field is actually beneficial and well-suited for its intended use cases.

b. Examining the release_year field reveals that indexing it as a text field is not optimal, since it represents numerical data, which could be beneficial for range queries, as well as other numeric-specific operations. Retaining the keyword mapping for this field is advantageous for sorting and aggregation purposes. To address this, applying an explicit mapping to treat release_year appropriately as a numerical field is the next step.

c. There are two other fields that will require mapping adjustments – plot and cast. Given their nature, these fields should be indexed solely as text, considering it is unlikely there will be a need to sort or aggregate on these fields. However, this indexing strategy still allows for effective searching against them.

  1. Now, let’s create a new index with the correct explicit mapping for the cast, plot, and release_year fields:
    PUT movies-with-explicit-mapping
    {
      "mappings": {
        "properties": {
          "release_year": {
            "type": "short",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "cast": {
            "type": "text"
          },
          "plot": {
            "type": "text"
          }
        }
      }
    }
  2. Next, reindex the original data in the new index so that the new mapping is applied:
    POST /_reindex
    {
      "source": {
        "index": "movies"
      },
      "dest": {
        "index": "movies-with-explicit-mapping"
      }
    }
  3. Check whether the new mapping has been applied to the new index:
    GET movies-with-explicit-mapping/_mapping

    Figure 2.14 shows the explicit mapping applied to the index:

Figure 2.14 – Explicit mapping

Figure 2.14 – Explicit mapping

How it works...

Explicit mapping in Elasticsearch allows you to define the schema or mapping for your index explicitly. Instead of relying on dynamic mapping, which automatically detects and creates the mapping based on the first indexed document, explicit mapping gives you full control over the data types, field properties, and analysis settings for each field in your index, as shown in Figure 2.15:

Figure 2.15 – The field mapping options

Figure 2.15 – The field mapping options

There’s more…

Mapping is a key aspect of data modeling in Elasticsearch. Avoid relying on dynamic mapping and try, when possible, to explicitly define your mappings to have better control over the field types, properties, and analysis settings. This helps maintain consistency and avoids unexpected field mappings.

You should consider using multi-field mapping to index the same field in different ways, depending on the use cases. For instance, for a full-text search of a string field, text mapping is necessary. If the same string field is mostly used for aggregations, filtering, or sorting, then mapping it to a keyword field is more efficient. Also, consider using mapping limit settings to prevent a mapping explosion (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-settings-limit.html). A situation where every new ingested document introduces new fields, as with dynamic mapping, can result in defining too many fields in an index. This can cause a mapping explosion. When each new field is continually added to the index mapping, it can grow excessively and lead to memory shortages and recovery challenges.

When it comes to mapping limit settings, there are several best practices to keep in mind. First, limit the number of field mappings to prevent documents from causing a mapping explosion. Second, limit the maximum depth of a field. Third, restrict the number of different nested mappings an index can have. Fourth, set a maximum for the count of nested JSON objects allowed in a single document, across all nested types. Finally, limit the maximum length of a field name. Keep in mind that setting higher limits can affect performance and cause memory problems.

For many years now, Elastic has been developing a specification called Elastic Common Schema (ECS) that provides a consistent and customizable way to structure data in Elasticsearch. Adopting this mapping has a lot of benefits (data correlation, reuse, and future-proofing, to name a few), and as a best practice, always refer to the ECS convention when you consider naming your fields. We will see more examples using ECS in the next chapters.

See also

You have been reading a chapter from
Elastic Stack 8.x Cookbook
Published in: Jun 2024
Publisher: Packt
ISBN-13: 9781837634293
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime