Ingesting General Content Data
This chapter, along with Chapter 4, focuses on data ingestion. Broadly, data falls into two groups: general content (data from APIs, HTML pages, catalogs, relational database management systems (RDBMSs), PDFs, spreadsheets, and so on) and time series data (data indexed in chronological order, such as logs, metrics, traces, and security events). In this chapter, we will ingest general content to illustrate the basic concepts of data ingestion, including fundamental data operations (index, update, and delete), analyzers, static and dynamic index mappings, and index templates.
Figure 2.1 illustrates how the various components connect. In this chapter, we will explore recipes dedicated to the Client APP, Analyzer, Mapping, and Index template components (a color version of the image is available in the free PDF version of this book):
Figure 2.1 – Elasticsearch index management components
In this chapter, we are going to cover the following main topics:
- Adding data from the Elasticsearch client
- Updating data in Elasticsearch
- Deleting data in Elasticsearch
- Using an analyzer
- Defining index mapping
- Using dynamic templates in document mapping
- Creating an index template
- Indexing multiple documents using the Bulk API
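Before we dive into the recipes, here is a minimal sketch of how the three fundamental operations (index, update, and delete) can be expressed as a single Bulk API request body. The Bulk API expects newline-delimited JSON: an action line, followed by a source line for index and update actions, and no source line for delete actions. The helper name `build_bulk_body`, the `movies` index, and the document fields are hypothetical and chosen only for illustration; the sketch uses the Python standard library and does not contact a cluster.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, metadata, payload) triples into Bulk API NDJSON.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    lines = []
    for action, meta, payload in actions:
        # Action line, e.g. {"index": {"_index": "movies", "_id": "1"}}
        lines.append(json.dumps({action: meta}))
        if action == "delete":
            # Delete actions are not followed by a source line.
            continue
        if action == "update":
            # A partial update wraps the changed fields in a "doc" object.
            payload = {"doc": payload}
        lines.append(json.dumps(payload))
    # The request body must end with a newline character.
    return "\n".join(lines) + "\n"


# Assumed sample data: index a document, update one field, then delete it.
actions = [
    ("index", {"_index": "movies", "_id": "1"}, {"title": "Alien", "year": 1979}),
    ("update", {"_index": "movies", "_id": "1"}, {"year": 1980}),
    ("delete", {"_index": "movies", "_id": "1"}, None),
]

body = build_bulk_body(actions)
print(body, end="")
```

A body like this would be sent to the `_bulk` endpoint with the `Content-Type: application/x-ndjson` header. In the recipes that follow, we will rely on the Elasticsearch client rather than building request bodies by hand.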