Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Machine Learning with the Elastic Stack
Machine Learning with the Elastic Stack

Machine Learning with the Elastic Stack: Gain valuable insights from your data with Elastic Stack's machine learning features , Second Edition

Arrow left icon
Profile Icon Rich Collier Profile Icon Camilla Montonen Profile Icon Bahaaldine Azarmi
Arrow right icon
€36.99
Full star icon Full star icon Full star icon Full star icon Full star icon 5 (9 Ratings)
Paperback May 2021 450 pages 2nd Edition
eBook
€20.98 €29.99
Paperback
€36.99
Subscription
Free Trial
Renews at €18.99p/m
Arrow left icon
Profile Icon Rich Collier Profile Icon Camilla Montonen Profile Icon Bahaaldine Azarmi
Arrow right icon
€36.99
Full star icon Full star icon Full star icon Full star icon Full star icon 5 (9 Ratings)
Paperback May 2021 450 pages 2nd Edition
eBook
€20.98 €29.99
Paperback
€36.99
Subscription
Free Trial
Renews at €18.99p/m
eBook
€20.98 €29.99
Paperback
€36.99
Subscription
Free Trial
Renews at €18.99p/m

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
Table of content icon View table of contents Preview book icon Preview Book

Machine Learning with the Elastic Stack

Chapter 1: Machine Learning for IT

A decade ago, the idea of using machine learning (ML)-based technology in IT operations or IT security seemed a little like science fiction. Today, however, it is one of the most common buzzwords used by software vendors. Clearly, there has been a major shift in both the perception of the need for the technology and the capabilities that the state-of-the-art implementations of the technology can bring to bear. This evolution is important to fully appreciate how Elastic ML came to be and what problems it was designed to solve.

This chapter is dedicated to reviewing the history and concepts behind how Elastic ML works. It also discusses the different kinds of analysis that can be done and the kinds of use cases that can be solved. Specifically, we will cover the following topics:

  • Overcoming the historical challenges in IT
  • Dealing with the plethora of data
  • The advent of automated anomaly detection
  • Unsupervised versus supervised ML
  • Using unsupervised ML for anomaly detection
  • Applying supervised ML to data frame analytics

Overcoming the historical challenges in IT

IT application support specialists and application architects have a demanding job with high expectations. Not only are they tasked with moving new and innovative projects into place for the business, but they also have to keep currently deployed applications up and running as smoothly as possible. Today's applications are significantly more complicated than ever before—they are highly componentized, distributed, and possibly virtualized/containerized. They could be developed using Agile, or by an outsourced team. Plus, they are most likely constantly changing. Some DevOps teams claim they can typically make more than 100 changes per day to a live production system. Trying to understand a modern application's health and behavior is like a mechanic trying to inspect an automobile while it is moving.

IT security operations analysts have similar struggles in keeping up with day-to-day operations, but they obviously have a different focus of keeping the enterprise secure and mitigating emerging threats. Hackers, malware, and rogue insiders have become so ubiquitous and sophisticated that the prevailing wisdom is that it is no longer a question of whether an organization will be compromised—it's more of a question of when they will find out about it. Clearly, knowing about a compromise as early as possible (before too much damage is done) is preferable to learning about it for the first time from law enforcement or the evening news.

So, how can they be helped? Is the crux of the problem that application experts and security analysts lack access to data to help them do their job effectively? Actually, in most cases, it is the exact opposite. Many IT organizations are drowning in data.

Dealing with the plethora of data

IT departments have invested in monitoring tools for decades, and it is not uncommon to have a dozen or more tools actively collecting and archiving data that can be measured in terabytes, or even petabytes, per day. The data can range from rudimentary infrastructure- and network-level data to deep diagnostic data and/or system and application log files.

Business-level key performance indicators (KPIs) could also be tracked, sometimes including data about the end user's experience. The sheer depth and breadth of data available, in some ways, is the most comprehensive than it has ever been. To detect emerging problems or threats hidden in that data, there have traditionally been several main approaches to distilling the data into informational insights:

  • Filter/search: Some tools allow the user to define searches to help trim down the data into a more manageable set. While extremely useful, this capability is most often used in an ad hoc fashion once a problem is suspected. Even then, the success of using this approach usually hinges on the ability for the user to know what they are looking for and their level of experience—both with prior knowledge of living through similar past situations and expertise in the search technology itself.
  • Visualizations: Dashboards, charts, and widgets are also extremely useful to help us understand what data has been doing and where it is trending. However, visualizations are passive and require being watched for meaningful deviations to be detected. Once the number of metrics being collected and plotted surpasses the number of eyeballs available to watch them (or even the screen real estate to display them), visual-only analysis becomes less and less useful.
  • Thresholds/rules: To get around the requirement of having data be physically watched in order for it to be proactive, many tools allow the user to define rules or conditions that get triggered upon known conditions or known dependencies between items. However, it is unlikely that you can realistically define all appropriate operating ranges or model all of the actual dependencies in today's complex and distributed applications. Plus, the amount and velocity of changes in the application or environment could quickly render any static rule set useless. Analysts find themselves chasing down many false positive alerts, setting up a boy who cried wolf paradigm that leads to resentment of the tools generating the alerts and skepticism of the value that alerting could provide.

Ultimately, there needed to be a different approach—one that wasn't necessarily a complete repudiation of past techniques, but could bring a level of automation and empirical augmentation of the evaluation of data in a meaningful way. Let's face it, humans are imperfect—we have hidden biases and limitations of capacity for remembering information and we are easily distracted and fatigued. Algorithms, if used correctly, can easily make up for these shortcomings.

The advent of automated anomaly detection

ML, while a very broad topic that encompasses everything from self-driving cars to game-winning computer programs, was a natural place to look for a solution. If you realize that most of the requirements of effective application monitoring or security threat hunting are merely variations on the theme of find me something that is different from normal, then the discipline of anomaly detection emerges as the natural place to begin using ML techniques to solve these problems for IT professionals.

The science of anomaly detection is certainly nothing new, however. Many very smart people have researched and employed a variety of algorithms and techniques for many years. However, the practical application of anomaly detection for IT data poses some interesting constraints that make the otherwise academically worthy algorithms inappropriate for the job. These include the following:

  • Timeliness: Notification of an outage, breach, or other significant anomalous situation should be known as quickly as possible to mitigate it. The cost of downtime or the risk of a continued security compromise is minimized if remedied or contained quickly. Algorithms that cannot keep up with the real-time nature of today's IT data have limited value.
  • Scalability: As mentioned earlier, the volume, velocity, and variation of IT data continue to explode in modern IT environments. Algorithms that inspect this vast data must be able to scale linearly with the data to be usable in a practical sense.
  • Efficiency: IT budgets are often highly scrutinized for wasteful spending, and many organizations are constantly being asked to do more with less. Tacking on an additional fleet of super-computers to run algorithms is not practical. Rather, modest commodity hardware with typical specifications must be able to be employed as part of the solution.
  • Generalizability: While highly specialized data science is often the best way to solve a specific information problem, the diversity of data in IT environments drives a need for something that can be broadly applicable across most use cases. Reusability of the same techniques is much more cost-effective in the long run.
  • Adaptability: Ever-changing IT environments will quickly render a brittle algorithm useless in no time. Training and retraining the ML model would only introduce yet another time-wasting venture that cannot be afforded.
  • Accuracy: We already know that alert fatigue from legacy threshold and rule-based systems is a real problem. Swapping one false alarm generator for another will not impress anyone.
  • Ease of use: Even if all of the previously mentioned constraints could be satisfied, any solution that requires an army of data scientists to implement it would be too costly and would be disqualified immediately.

So, now we are getting to the real meat of the challenge—creating a fast, scalable, accurate, low-cost anomaly detection solution that everyone will use and love because it works flawlessly. No problem!

As daunting as that sounds, Prelert founder and CTO Steve Dodson took on that challenge back in 2010. While Dodson certainly brought his academic chops to the table, the technology that would eventually become Elastic ML had its genesis in the throes of trying to solve real IT application problems—the first being a pesky intermittent outage in a trading platform at a major London finance company. Dodson, and a handful of engineers who joined the venture, helped the bank's team use the anomaly detection technology to automatically surface only the needles in the haystacks that allowed the analysts to focus on the small set of relevant metrics and log messages that were going awry. The identification of the root cause (a failing service whose recovery caused a cascade of subsequent network problems that wreaked havoc) ultimately brought stability to the application and prevented the need for the bank to spend lots of money on the previous solution, which was an unplanned, costly network upgrade.

As time passed, however, it became clear that even that initial success was only the beginning. A few years and a few thousand real-world use cases later, the marriage of Prelert and Elastic was a natural one—a combination of a platform making big data easily accessible and technology that helped overcome the limitations of human analysis.

Fast forward to 2021, a full 5 years after the joining of forces, and Elastic ML has come a long way in the maturation and expansion of capabilities of the ML platform. This second edition of the book encapsulates the updates made to Elastic ML over the years, including the introduction of integrations into several of the Elastic solutions around observability and security. This second edition also includes the introduction of "data frame analytics," which is discussed extensively in the third part of the book. In order to get a grounded, innate understanding of how Elastic ML works, we first need to get to grips with some terminology and concepts to understand things further.

Unsupervised versus supervised ML

While there are many subtypes of ML, two very prominent ones (and the two that are relevant to Elastic ML) are unsupervised and supervised.

In unsupervised ML, there is no outside guidance or direction from humans. In other words, the algorithms must learn (and model) the patterns of the data purely on their own. In general, the biggest challenge here is to have the algorithms accurately surface detected deviations of the input data's normal patterns to provide meaningful insight for the user. If the algorithm is not able to do this, then it is not useful and is unsuitable for use. Therefore, the algorithms must be quite robust and able to account for all of the intricacies of the way that the input data is likely to behave.

In supervised ML, input data (often multivariate data) is used to help model the desired outcome. The key difference from unsupervised ML is that the human decides, a priori, what variables to use as the input and also provides "ground-truth" examples of the expected target variable. Algorithms then assess how the input variables interact and influence the known output target. To accurately get the desired output (prediction, for example), the algorithm must have "the right kind of data" not only that indeed expresses the situation, but also so that there is enough diversity of the input data in order to effectively learn the relationship between the input data and the output target.

As such, both cases require good input data, good algorithmic approaches, and a good mechanism to allow the ML to both learn the behavior of the data and apply that learning to assess subsequent observations of that data. Let's dig a little deeper into the specifics of how Elastic ML leverages unsupervised and supervised learning.

Using unsupervised ML for anomaly detection

To get a more intuitive understanding of how Elastic ML's anomaly detection works using unsupervised ML, we will discuss the following:

  • A rigorous definition of unusual with respect to the technology
  • An intuitive example of learning in an unsupervised manner
  • A description of how the technology models, de-trends, and scores the data

Defining unusual

Anomaly detection is something almost all of us have a basic intuition about. Humans are quite good at pattern recognition, so it should be of no surprise that if I asked 100 people on the street what's unusual in the following graph, a vast majority (including non-technical people) would identify the spike in the green line:

Figure 1.1 – A line graph showing an anomaly

Figure 1.1 – A line graph showing an anomaly

Similarly, let's say we ask what's unusual in the following photo:

Figure 1.2 – A photograph showing a seal among penguins

Figure 1.2 – A photograph showing a seal among penguins

We will, again, likely get a majority that rightly claims that the seal is the unusual thing. But people may struggle to articulate in salient terms the actual heuristics that are used in coming to those conclusions.

There are two different heuristics that we could use to define the different kinds of anomalies shown in these images:

  • Something is unusual if its behavior has significantly deviated from an established pattern or range based upon its past history.
  • Something is unusual if some characteristic of that entity is significantly different from the same characteristic of the other members of a set or population.

These key definitions will be relevant to Elastic ML's anomaly detection, as they form the two main fundamental modes of operation of the anomaly detection algorithms (temporal versus population analysis, as will be explored in Chapter 3, Anomaly Detection). As we will see, the user will have control over what mode of operation is employed for a particular use case.

Learning what's normal

As we've stated, Elastic ML's anomaly detection uses unsupervised learning in that the learning occurs without anything being taught. There is no human assistance to shape the decisions of the learning; it simply does so on its own, via inspection of the data it is presented with. This is slightly analogous to the learning of a language via the process of immersion, as opposed to sitting down with books of vocabulary and rules of grammar.

To go from a completely naive state where nothing is known about a situation to one where predictions could be made with good certainty, a model of the situation needs to be constructed. How this model is created is extremely important, as the efficacy of all subsequent actions taken based upon this model will be highly dependent on the model's accuracy. The model will need to be flexible and continuously updated based upon new information, because that is all that it has to go on in this unsupervised paradigm.

Probability models

Probability distributions can serve this purpose quite well. There are many fundamental types of distributions (and Elastic ML uses a variety of distribution types, such as Poisson, Gaussian, log-normal, or even mixtures of models), but the Poisson distribution is a good one to discuss first, because it is appropriate in situations where there are discrete occurrences (the "counts") of things with respect to time:

Figure 1.3 – A graph demonstrating Poisson distributions (source: https://en.wikipedia.org/wiki/Poisson_distribution#/media/File:Poisson_pmf.svg)

Figure 1.3 – A graph demonstrating Poisson distributions (source: https://en.wikipedia.org/wiki/Poisson_distribution#/media/File:Poisson_pmf.svg)

There are three different variants of the distribution shown here, each with a different mean (λ) and the highest expected value of k. We can make an analogy that says that these distributions model the expected amount of postal mail that a person gets delivered to their home on a daily basis, represented by k on the x axis:

  • For λ = 1, there is about a 37% chance that zero pieces or one piece of mail are delivered daily. Perhaps this is appropriate for a college student that doesn't receive much postal mail.
  • For λ = 4, there is about a 20% chance that three or four pieces are received. This might be a good model for a young professional.
  • For λ = 10, there is about a 13% chance that 10 pieces are received per day—perhaps representing a larger family or a household that has somehow found themselves on many mailing lists!

The discrete points on each curve also give the likelihood (probability) of other values of k. As such, the model can be informative and answer questions such as "Is getting 15 pieces of mail likely?" As we can see, it is not likely for a student (λ = 1) or a young professional (λ = 4), but it is somewhat likely for a large family (λ = 10). Obviously, there was a simple declaration made here that the models shown were appropriate for the certain people described—but it should seem obvious that there needs to be a mechanism to learn that model for each individual situation, not just assert it. The process for learning it is intuitive.

Learning the models

Sticking with the postal mail analogy, it would be instinctive to realize that a method of determining what model is the best fit for a particular household could be ascertained simply by hanging out by the mailbox every day and recording what the postal carrier drops into the mailbox. It should also seem obvious that the more observations made, the higher your confidence should be that your model is accurate. In other words, only spending 3 days by the mailbox would provide less complete information and confidence than spending 30 days, or 300 for that matter.

Algorithmically, a similar process could be designed to self-select the appropriate model based upon observations. Careful scrutiny of the algorithm's choices of the model type itself (that is, Poisson, Gaussian, log-normal, and so on) and the specific coefficients of that model type (as in the preceding example of λ) would also need to be part of this self-selection process. To do this, constant evaluation of the appropriateness of the model is done. Bayesian techniques are also employed to assess the model's likely parameter values, given the dataset as a whole, but allowing for tempering of those decisions based upon how much information has been seen prior to a particular point in time. The ML algorithms accomplish this automatically.

Note

For those that want a deeper dive into some of the representative mathematics going on behind the scenes, please refer to the academic paper at http://www.ijmlc.org/papers/398-LC018.pdf.

Most importantly, the modeling that is done is continuous, so that new information is considered along with the old, with an exponential weighting given to information that is fresher. Such a model, after 60 observations, could resemble the following:

Figure 1.4 – Sample model after 60 observations

Figure 1.4 – Sample model after 60 observations

It will then seem very different after 400 observations, as the data presents itself with a slew of new observations with values between 5 and 10:

Figure 1.5 – Sample model after 400 observations

Figure 1.5 – Sample model after 400 observations

Also, notice that there is the potential for the model to have multiple modes or areas/clusters of higher probability. The complexity and trueness of the fit of the learned model (shown as the blue curve) with the theoretically ideal model (in black) matters greatly. The more accurate the model, the better representation of the state of normal for that dataset and thus, ultimately, the more accurate the prediction of how future values comport with this model.

The continuous nature of the modeling also drives the requirement that this model is capable of serialization to long-term storage, so that if model creation/analysis is paused, it can be reinstated and resumed at a later time. As we will see, the operationalization of this process of model creation, storage, and utilization is a complex orchestration, which is fortunately handled automatically by Elastic ML.

De-trending

Another important aspect of faithfully modeling real-world data is to account for prominent overtone trends and patterns that naturally occur. Does the data ebb and flow hourly and/or daily with more activity during business hours or business days? If so, this needs to be accounted for. Elastic ML automatically hunts for prominent trends in the data (linear growth, cyclical harmonics, and so on) and factors them out. Let's observe the following graph:

Figure 1.6 – Periodicity detection in action

Figure 1.6 – Periodicity detection in action

Here, the periodic daily cycle is learned, then factored out. The model's prediction boundaries (represented in the light-blue envelope around the dark-blue signal) dramatically adjust after automatically detecting three successive iterations of that cycle.

Therefore, as more data is observed over time, the models gain accuracy both from the perspective of the probability distribution function getting more mature, as well as via the auto-recognizing and de-trending of other routine patterns (such as business days, weekends, and so on) that might not emerge for days or weeks. In the following example, several trends are discovered over time, including daily, weekly, and an overall linear slope:

Figure 1.7 – Multiple trends being detected

Figure 1.7 – Multiple trends being detected

These model changes are recorded as system annotations. Annotations, as a general concept, will be covered in later chapters.

Scoring of unusualness

Once a model has been constructed, the likelihood of any future observed value can be found within the probability distribution. Earlier, we had asked the question "Is getting 15 pieces of mail likely?" This question can now be empirically answered, depending on the model, with a number between 0 (no possibility) and 1 (absolute certainty). Elastic ML will use the model to calculate this fractional value out to approximately 300 significant figures (which can be helpful when dealing with very low probabilities). Let's observe the following graph:

Figure 1.8 – Anomaly scoring

Figure 1.8 – Anomaly scoring

Here, the probability of the observation of the actual value of 921 is now calculated to be 1.444e-9 (or, more commonly, a mere 0.0000001444% chance). This very small value is perhaps not that intuitive to most people. As such, ML will take this probability calculation, and via the process of quantile normalization, re-cast that observation on a severity scale between 0 and 100, where 100 is the highest level of unusualness possible for that particular dataset. In the preceding case, the probability calculation of 1.444e-9 is normalized to a score of 94. This normalized score will come in handy later as a means by which to assess the severity of the anomaly for the purposes of alerting and/or triage.

The element of time

In Elastic ML, all of the anomaly detection that we will discuss throughout the rest of the book will have an intrinsic element of time associated with the data and analysis. In other words, for anomaly detection, Elastic ML expects the data to be time series data and that data will be analyzed in increments of time. This is a key point and also helps discriminate between anomaly detection and data frame analytics in addition to the unsupervised/supervised paradigm.

You will see that there's a slight nuance with respect to population analysis (covered in Chapter 3, Anomaly Detection) and outlier detection (covered in Chapter 10, Outlier Detection While they effectively both find entities that are distinctly different from their peers, population analysis in anomaly detection does so with respect to time, whereas outlier detection analysis isn't constrained by time. More will become obvious as these topics are covered in depth in later chapters.

Applying supervised ML to data frame analytics

With the exception of outlier detection (covered in Chapter 10, Outlier Detection which actually is an unsupervised approach, the rest of data frame analytics uses a supervised approach. Specifically, there are two main types of problems that Elastic ML data frame analytics allows you to address:

  • Regression: Used to predict a continuous numerical value (a price, a duration, a temperature, and so on)
  • Classification: Used to predict whether something is of a certain class label (fraudulent transaction versus non-fraudulent, and more)

In both cases, models are built using training data to map input variables (which can be numerical or categorical) to output predictions by training decision trees. The particular implementation used by Elastic ML is a custom variant of XGBoost, an open source gradient-boosted decision tree framework that has recently gained some notoriety among data scientists for its ability to allow them to win Kaggle competitions.

The process of supervised learning

The overall process of supervised ML is very different from the unsupervised approach. In the supervised approach, you distinctly separate the training stage from the predicting stage. A very simplified version of the process looks like the following:

Figure 1.9 – Supervised ML process

Figure 1.9 – Supervised ML process

Here, we can see that in the training phase, features are extracted out of the raw training data to create a feature matrix (also called a data frame) to feed to the ML algorithm and create the model. The model can be validated against portions of the data to see how well it did, and subsequent refinement steps could be made to adjust which features are extracted, or to refine the parameters of the ML algorithm used to improve the accuracy of the model's predictions.

Once the user decides that the model is efficacious, that model is "moved" to the prediction workflow, where it is used on new data. One at a time, a single new feature vector is inferenced against the model to form a prediction.

To get an intuitive sense of how this works, imagine a scenario in which you want to sell your house, but don't know what price to list it for. You research prior sales in your area and notice the price differentials for homes based on different factors (number of bedrooms, number of bathrooms, square footage, proximity to schools/shopping, age of home, and so on). Those factors are the "features" that are considered altogether (not individually) for every prior sale.

This corpus of historical sales is your training data. It is helpful because you know for certain how much each property sold for (and that's the thing you'd ultimately like to predict for your house). If you study this enough, you might get an intuition about how the prices of houses are driven strongly by some features (for instance, the number of bedrooms) and that other features (perhaps the age of the home) may not affect the pricing much. This is a concept called "feature importance" that will be visited again in a later chapter.

Armed with enough training data, you might have a good idea what the value of your home should be priced at, given that it is a three-bedroom, two-bath, 1,700-square-foot, 30-year-old home. In other words, you've constructed a model in your mind based on your research of comparable homes that have sold in the last year or so. If the past sales are the "training data," your home's specifications (bedrooms, bathrooms, and so on) are the feature vectors that will define the expected price, given your "model" that you've learned.

Your simple mental model is obviously not as rigorous as one that could be constructed with regression analysis using ML using dozens of relevant input features, but this simple analogy hopefully cements the idea of the process that is followed in learning from prior, known situations, and then applying that knowledge to a present, novel situation.

Summary

To summarize what we discussed in this chapter, we covered the genesis story of ML in IT—born out of the necessity to automate analysis of the massive, ever-expanding growth of collected data within enterprise environments. We also got a more intuitive understanding of the different types of ML in Elastic ML, which includes both unsupervised anomaly detection and supervised data frame analysis.

As we journey through the rest of the chapters, we will often be mapping the use cases of the problems we're trying to solve to the different modes of operation of Elastic ML.

Remember that if the data is a time series, meaning that it comes into existence routinely over time (metric/performance data, log files, transactions, and so on), it is quite possible that Elastic ML's anomaly detection is all you'll ever need. As you'll see, it is incredibly flexible and easy to use and accomplishes many use cases on a broad variety of data. It's kind of a Swiss Army knife! A large amount of this book (Chapters 3 through 8) will be devoted to how to leverage anomaly detection (and the ancillary capability of forecasting) to get the most out of your time series data that is in the Elastic Stack.

If you are more interested in finding unusual entities within a population/cohort (User/Entity Behavior), you might have a tricky decision between using population analysis in anomaly detection versus outlier detection in data frame analytics. The primary factor may be whether or not you need to do this in near real time—in which case you might likely choose population analysis. If near real time is not necessary and/or if you require the consideration of multiple features simultaneously, you would choose outlier detection. See Chapter 10, for more detailed information about the comparison and benefits of each approach.

That leaves many other use cases that require a multivariate approach to modeling. This would not only align with the previous example of real estate pricing but also encompass the use cases of language detection, customer churn analysis, malware detection, and so on. These will fall squarely in the realm of the supervised ML of data frame analytics and be covered in Chapters 11 through 13.

In the next chapter, we will get down and dirty with understanding how to enable Elastic ML and how it works in an operational sense. Buckle up and enjoy the ride!

Left arrow icon Right arrow icon

Key benefits

  • Integrate machine learning with distributed search and analytics
  • Preprocess and analyze large volumes of search data effortlessly
  • Operationalize machine learning in a scalable, production-worthy way

Description

Elastic Stack, previously known as the ELK stack, is a log analysis solution that helps users ingest, process, and analyze search data effectively. With the addition of machine learning, a key commercial feature, the Elastic Stack makes this process even more efficient. This updated second edition of Machine Learning with the Elastic Stack provides a comprehensive overview of Elastic Stack's machine learning features for both time series data analysis as well as for classification, regression, and outlier detection. The book starts by explaining machine learning concepts in an intuitive way. You'll then perform time series analysis on different types of data, such as log files, network flows, application metrics, and financial data. As you progress through the chapters, you'll deploy machine learning within Elastic Stack for logging, security, and metrics. Finally, you'll discover how data frame analysis opens up a whole new set of use cases that machine learning can help you with. By the end of this Elastic Stack book, you'll have hands-on machine learning and Elastic Stack experience, along with the knowledge you need to incorporate machine learning in your distributed search and data analysis platform.

Who is this book for?

If you’re a data professional looking to gain insights into Elasticsearch data without having to rely on a machine learning specialist or custom development, then this Elastic Stack machine learning book is for you. You'll also find this book useful if you want to integrate machine learning with your observability, security, and analytics applications. Working knowledge of the Elastic Stack is needed to get the most out of this book.

What you will learn

  • Find out how to enable the ML commercial feature in the Elastic Stack
  • Understand how Elastic machine learning is used to detect different types of anomalies and make predictions
  • Apply effective anomaly detection to IT operations, security analytics, and other use cases
  • Utilize the results of Elastic ML in custom views, dashboards, and proactive alerting
  • Train and deploy supervised machine learning models for real-time inference
  • Discover various tips and tricks to get the most out of Elastic machine learning
Estimated delivery fee Deliver to Slovenia

Premium delivery 7 - 10 business days

€25.95
(Includes tracking information)

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : May 31, 2021
Length: 450 pages
Edition : 2nd
Language : English
ISBN-13 : 9781801070034
Vendor :
Elastic
Category :
Languages :

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
Estimated delivery fee Deliver to Slovenia

Premium delivery 7 - 10 business days

€25.95
(Includes tracking information)

Product Details

Publication date : May 31, 2021
Length: 450 pages
Edition : 2nd
Language : English
ISBN-13 : 9781801070034
Vendor :
Elastic
Category :
Languages :

Packt Subscriptions

See our plans and pricing
Modal Close icon
€18.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
€189.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
Feature tick icon Exclusive print discounts
€264.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total 120.97
Machine Learning with the Elastic Stack
€36.99
Threat Hunting with Elastic Stack
€41.99
Getting Started with Elastic Stack 8.0
€41.99
Total 120.97 Stars icon

Table of Contents

17 Chapters
Section 1 – Getting Started with Machine Learning with Elastic Stack Chevron down icon Chevron up icon
Chapter 1: Machine Learning for IT Chevron down icon Chevron up icon
Chapter 2: Enabling and Operationalization Chevron down icon Chevron up icon
Section 2 – Time Series Analysis – Anomaly Detection and Forecasting Chevron down icon Chevron up icon
Chapter 3: Anomaly Detection Chevron down icon Chevron up icon
Chapter 4: Forecasting Chevron down icon Chevron up icon
Chapter 5: Interpreting Results Chevron down icon Chevron up icon
Chapter 6: Alerting on ML Analysis Chevron down icon Chevron up icon
Chapter 7: AIOps and Root Cause Analysis Chevron down icon Chevron up icon
Chapter 8: Anomaly Detection in Other Elastic Stack Apps Chevron down icon Chevron up icon
Section 3 – Data Frame Analysis Chevron down icon Chevron up icon
Chapter 9: Introducing Data Frame Analytics Chevron down icon Chevron up icon
Chapter 10: Outlier Detection Chevron down icon Chevron up icon
Chapter 11: Classification Analysis Chevron down icon Chevron up icon
Chapter 12: Regression Chevron down icon Chevron up icon
Chapter 13: Inference Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Full star icon 5
(9 Ratings)
5 star 100%
4 star 0%
3 star 0%
2 star 0%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by




N/A Feb 06, 2024
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Achat facile, livraison très rapide, paiement sécurisé via paypal, parfait
Feefo Verified review Feefo
Peter Titov Jul 20, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
While I am by no means a machine learning expert; this book has provided exceptionally thoughtful insights and a methodical approach to applying Machine Learning within the Elastic Stack to set myself up for success (and you can too) with the practical applications of ML across a variety of data sources.I would highly recommend this book!
Amazon Verified review Amazon
A. Norris Aug 24, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This product is an asset in the library. It is valuable as a lookup resource for both new and professional users of Elastic machine learning. The second edition adds more information about the latest revisions in the product and this is useful if you want to employ the most effective solutions.
Amazon Verified review Amazon
cindy gleason Jun 11, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
I am enjoying and learning a lot from this book. It does a great job giving valuable insights on how to use Elastic search's machine learning capabilities.
Amazon Verified review Amazon
Amruta Ghate Aug 25, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Rich's book goes into intricate detail on all the features and functions of Elastic's ML capabilities. If you want to learn not only how to configure jobs but understand the underlying models and settings this is the book for you. I use Elastic daily and learned a lot by reading it.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is the delivery time and cost of print book? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela
What is custom duty/charge? Chevron down icon Chevron up icon

Customs duty are charges levied on goods when they cross international borders. It is a tax that is imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order? Chevron down icon Chevron up icon

The orders shipped to the countries that are listed under EU27 will not bear custom charges. They are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea:

A custom duty or localized taxes may be applicable on the shipment and would be charged by the recipient country outside of the EU27 which should be paid by the customer and these duties are not included in the shipping charges been charged on the order.

How do I know my custom duty charges? Chevron down icon Chevron up icon

The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.

For example:

  • If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
  • Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.
How can I cancel my order? Chevron down icon Chevron up icon

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction id. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you then when you receive it, you can contact us at customercare@packt.com using the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except for the cases described in our Return Policy (i.e. Packt Publishing agrees to replace your printed book because it arrives damaged or material defect in book), Packt Publishing will not accept returns.

What is your returns and refunds policy? Chevron down icon Chevron up icon

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact Customer Relations Team on customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
  2. Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
  3. You will have a choice of replacement or refund of the problem items.(damaged, defective or incorrect)
  4. Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged, with book material defect, contact our Customer Relation Team on customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner which is on a print-on-demand basis.

What tax is charged? Chevron down icon Chevron up icon

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use? Chevron down icon Chevron up icon

You can pay with the following card types:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal
What is the delivery time and cost of print books? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela