Advanced Elasticsearch 7.0

A practical guide to designing, indexing, and querying advanced distributed search engines

Product type: Paperback
Published: Aug 2019
Publisher: Packt
ISBN-13: 9781789957754
Length: 560 pages
Edition: 1st Edition
Author: Wai Tak Wong

Building a Java Spark ML module for k-means anomaly detection

According to the Spark MLlib guide (see https://spark.apache.org/docs/latest/ml-guide.html), as of Spark 2.0 the RDD-based APIs in the spark.mllib package are in maintenance mode, and the DataFrame-based API in the spark.ml package is the primary machine learning API. In this project, we therefore import several classes from spark.ml to build the anomaly detection model. The following code block shows the relevant imports from the AnomalyDetection class in the com.example.esanalytics.spark.mllib package:

import org.apache.spark.ml.clustering.KMeansModel;
import org.apache.spark.ml.feature.VectorAssembler;
import org.apache.spark.ml.clustering.KMeans;
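
This excerpt does not show how the fitted model is used to flag anomalies, so the following is only a minimal sketch of one common approach, not the book's AnomalyDetection code. It assumes the cluster centers have already been obtained (in Spark ML they are available from KMeansModel.clusterCenters()) and treats any point whose distance to its nearest center exceeds a threshold as an anomaly. The class name KMeansAnomalySketch, the method names, the sample data, and the threshold value are all hypothetical, and the sketch uses plain Java arrays rather than Spark DataFrames so that it runs without a cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the book's project.
final class KMeansAnomalySketch {

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Distance from a point to the closest of the given cluster centers.
    static double distanceToNearestCenter(double[] point, double[][] centers) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] center : centers) {
            best = Math.min(best, distance(point, center));
        }
        return best;
    }

    // Indices of the points whose nearest-center distance exceeds the threshold.
    // The threshold is an assumption; it has to be chosen for the data at hand.
    static List<Integer> findAnomalies(double[][] points, double[][] centers, double threshold) {
        List<Integer> anomalies = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            if (distanceToNearestCenter(points[i], centers) > threshold) {
                anomalies.add(i);
            }
        }
        return anomalies;
    }

    public static void main(String[] args) {
        // Made-up centers, standing in for the output of a fitted k-means model.
        double[][] centers = {{0.0, 0.0}, {10.0, 10.0}};
        // Two points near a center and one point far from both.
        double[][] points = {{0.5, 0.5}, {9.0, 10.0}, {5.0, 5.0}};
        System.out.println(findAnomalies(points, centers, 2.0)); // prints [2]
    }
}
```

In a Spark pipeline, the same idea would usually be expressed over the feature vector column produced by VectorAssembler, with the threshold derived from the distribution of distances rather than fixed by hand.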

The following diagram shows the steps for building the model with ES-Hadoop, Spark SQL, and Spark MLlib:

The step-by-step instructions are as follows:

  1. There are two major...