Data incident management
When an issue is detected, the team’s productivity suffers, because the resources mobilized to resolve it cannot be used to create value with new projects. To avoid working under pressure and troubleshooting in an unsustainable way, we propose the following method:
- Detect the issue.
- Evaluate its impact.
- Find the root cause.
- Troubleshoot.
- Avoid future similar issues.
Thanks to observability, each step will be supported by logs, metrics, and traces that you can use to reduce the time the team spends on resolving issues.
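As a rough illustration, the five steps can be modeled as an ordered checklist to which observability evidence is attached. The sketch below is a minimal one under stated assumptions: the `Step` and `Incident` names, the `record` and `next_step` helpers, and the sample notes are all hypothetical and made up for this example, not part of any specific tool.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Step(IntEnum):
    """The five steps of the method, in the order they are worked through."""
    DETECT = 1
    EVALUATE_IMPACT = 2
    FIND_ROOT_CAUSE = 3
    TROUBLESHOOT = 4
    PREVENT = 5


@dataclass
class Incident:
    """Hypothetical record of one data incident and the evidence gathered per step."""
    title: str
    evidence: dict = field(default_factory=dict)  # maps Step -> list of notes

    def record(self, step: Step, note: str) -> None:
        # Attach a log line, metric, or trace reference to the given step.
        self.evidence.setdefault(step, []).append(note)

    def next_step(self):
        # Return the first step that has no evidence yet, or None if all are covered.
        for step in Step:
            if step not in self.evidence:
                return step
        return None


# Illustrative values only; a real incident would reference actual observations.
incident = Incident("Row count dropped in daily orders table")
incident.record(Step.DETECT, "metric: row_count fell from 10000 to 1200")
incident.record(Step.EVALUATE_IMPACT, "trace: two dashboards read this table")
print(incident.next_step().name)  # FIND_ROOT_CAUSE
```

Keeping the evidence keyed by step makes it easy to see which part of the process still lacks supporting logs, metrics, or traces.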
Let’s explore each step in more detail.
Detecting the issue
An issue can be detected by several means. Let’s say that, before you read this book, the majority of issues were reported to you. One of your data providers or internal customers could come to you to signal that something is fishy in the data you are consuming or, even worse, that the data you...