We can now discuss how to bring everything together. To increase our effectiveness in IT operations and look more holistically at application health, we need to operationalize what we've prepared in the prior sections and configure our ML jobs accordingly. To that end, let's work through a real-life scenario in which ML helped us get to the root cause of an operational problem.
Bringing it all together for root cause analysis
Outage background
This scenario is loosely based on a real application outage, although the data was somewhat simplified and sanitized to obscure the identity of the original user. The problem was with a retail application that processed gift card transactions...