Advanced Elasticsearch 7.0: A practical guide to designing, indexing, and querying advanced distributed search engines

Product type: Paperback

Published in: Aug 2019

Publisher: Packt

ISBN-13: 9781789957754

Length: 560 pages

Edition: 1st Edition

Languages: Java

Tools: Elasticsearch

Concepts: Enterprise Search

Author (1): Wai Tak Wong

Table of Contents (25) Chapters

Preface

1. Section 1: Fundamentals and Core APIs

2. Overview of Elasticsearch 7

3. Index APIs

4. Document APIs

5. Mapping APIs

6. Anatomy of an Analyzer

7. Search APIs

8. Section 2: Data Modeling, Aggregations Framework, Pipeline, and Data Analytics

9. Modeling Your Data in the Real World

10. Aggregation Frameworks

11. Preprocessing Documents in Ingest Pipelines

12. Using Elasticsearch for Exploratory Data Analysis

13. Section 3: Programming with the Elasticsearch Client

14. Elasticsearch from Java Programming

15. Elasticsearch from Python Programming

16. Section 4: Elastic Stack

17. Using Kibana, Logstash, and Beats

18. Working with Elasticsearch SQL

19. Working with Elasticsearch Analysis Plugins

20. Section 5: Advanced Features

21. Machine Learning with Elasticsearch

22. Spark and Elasticsearch for Real-Time Analytics

23. Building Analytics RESTful Services

24. Other Books You May Enjoy

An analyzer's components

The purpose of an analyzer is to generate terms from a document so that inverted indexes can be built (for example, a list of unique words with the IDs of the documents they appear in, or a list of word frequencies). An analyzer must have exactly one tokenizer and, optionally, any number of character filters and token filters. Character filters run first on the raw text, the tokenizer then splits the text into tokens, and token filters finally modify the token stream. Whether built-in or custom, an analyzer is simply a combination of these three building blocks, as illustrated in the following diagram.
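The composition of the three building blocks can also be sketched in code. The following is a minimal, self-contained Python illustration, not Elasticsearch's actual implementation: the `Analyzer` class, the `strip_html`, `word_tokenizer`, and `lowercase` helpers, and `build_inverted_index` are all hypothetical names introduced here, and the regular expressions are deliberately much cruder than the real character filters and tokenizers.

```python
import re
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical type aliases for the three building blocks.
CharFilter = Callable[[str], str]
Tokenizer = Callable[[str], List[str]]
TokenFilter = Callable[[List[str]], List[str]]


class Analyzer:
    """Toy analyzer: zero or more char filters, exactly one tokenizer,
    zero or more token filters, applied in that order."""

    def __init__(
        self,
        tokenizer: Tokenizer,
        char_filters: Sequence[CharFilter] = (),
        token_filters: Sequence[TokenFilter] = (),
    ) -> None:
        self.tokenizer = tokenizer
        self.char_filters = list(char_filters)
        self.token_filters = list(token_filters)

    def analyze(self, text: str) -> List[str]:
        # 1. Character filters transform the raw text.
        for char_filter in self.char_filters:
            text = char_filter(text)
        # 2. The single tokenizer splits the text into tokens.
        tokens = self.tokenizer(text)
        # 3. Token filters modify the token stream.
        for token_filter in self.token_filters:
            tokens = token_filter(tokens)
        return tokens


def strip_html(text: str) -> str:
    # Crude stand-in for an HTML-stripping character filter.
    return re.sub(r"<[^>]+>", " ", text)


def word_tokenizer(text: str) -> List[str]:
    # Crude stand-in for a grammar-based tokenizer.
    return re.findall(r"\w+", text)


def lowercase(tokens: List[str]) -> List[str]:
    return [token.lower() for token in tokens]


def build_inverted_index(docs: Dict[int, str], analyzer: Analyzer) -> Dict[str, Set[int]]:
    # Map each term to the set of document IDs it appears in.
    index: Dict[str, Set[int]] = {}
    for doc_id, text in docs.items():
        for term in analyzer.analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


analyzer = Analyzer(word_tokenizer, [strip_html], [lowercase])
index = build_inverted_index({1: "Quick fox", 2: "quick dog"}, analyzer)
```

With this sketch, `analyzer.analyze("<p>Hello World</p>")` yields the terms `hello` and `world`, and `index` maps `quick` to both document IDs. Passing the tokenizer as a required argument while the filter lists default to empty mirrors the "exactly one tokenizer, optional filters" rule.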

Recall from the Analyzer section of Chapter 1, Overview of Elasticsearch 7, that a standard analyzer is composed of a standard tokenizer and a lowercase token filter. The standard tokenizer provides grammar-based tokenization, while the lowercase token filter normalizes tokens to lowercase. Let's suppose that the input string is an HTML text string...
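One way to experiment with these components is the Analyze API, which accepts an ad hoc combination of character filters, a tokenizer, and token filters. The sketch below only builds and prints the request body and sends nothing to a cluster. The `build_analyze_request` helper and the sample text are hypothetical, and pairing the `html_strip` character filter with the standard tokenizer and lowercase filter is an assumption made here for HTML input; the standard analyzer itself does not strip HTML.

```python
import json
from typing import Any, Dict


def build_analyze_request(text: str) -> Dict[str, Any]:
    """Build a body for POST /_analyze (hypothetical helper).

    Uses the standard tokenizer and lowercase token filter named in the
    text, plus the html_strip character filter as an added assumption.
    """
    return {
        "char_filter": ["html_strip"],  # assumption: not part of the standard analyzer
        "tokenizer": "standard",
        "filter": ["lowercase"],
        "text": text,
    }


# Hypothetical sample input; the original example string is not shown.
body = build_analyze_request("<p>Elasticsearch 7</p>")
print("POST /_analyze")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the Kibana Dev Tools console or sent with any HTTP client to inspect the tokens that the combination produces.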

Authors (1)

Wai Tak Wong is a faculty member in the Department of Computer Science at Kean University, NJ, USA. He has more than 15 years of professional experience in cloud software design and development. He obtained his PhD in computer science at NJIT, NJ, USA. Wai Tak has served as an associate professor in the Information Management Department of Chung Hua University, Taiwan. A co-founder of Shanghai Shellshellfish Information Technology, Wai Tak acted as the Chief Scientist of its R&D team, and he has published more than a dozen algorithms in prestigious journals and conferences. Wai Tak began his search and analytics technology career with Elasticsearch in the real estate market and later applied this experience to data management and FinTech data services.