In-practice – an example of a postmortem
In this practice section, we will build a postmortem for the outage walked through in the in-practice section of Chapter 14, Rapid Response – Outage Management Techniques. We’ll start with the all-important executive summary describing the outage and then continue to the other parts of the postmortem to complete the picture, including technical descriptions and action items.
Writing the overview
These few sentences, written in an executive summary style and giving the clearest insight into the outage, are perhaps the hardest part of any postmortem to write.
Root cause
The cause of this outage was the inability of the website to handle the increase in customers visiting the site.
Technical failure
The autoscaling limits were not set high enough, so the system could not scale out far enough to handle the high load.
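To make this failure mode concrete, here is a minimal sketch of how an upper limit on autoscaling can leave a system short of capacity. It assumes a simple proportional scaling rule (the same shape of calculation used by common horizontal autoscalers). The function name, parameters, and all of the numbers are hypothetical and are not taken from the actual outage.

```python
import math


def desired_replicas(current_replicas, load_per_replica, target_load_per_replica,
                     min_replicas, max_replicas):
    """Return (uncapped, capped) replica counts for a proportional scaling rule.

    Hypothetical helper for illustration only: it scales the replica count by
    the ratio of observed load to target load, then clamps the result to the
    configured minimum and maximum.
    """
    uncapped = math.ceil(current_replicas * load_per_replica / target_load_per_replica)
    capped = max(min_replicas, min(uncapped, max_replicas))
    return uncapped, capped


# Illustrative numbers only: 10 replicas each seeing 240 requests/s against a
# target of 80 requests/s, with the autoscaler limited to 12 replicas.
needed, allowed = desired_replicas(10, 240.0, 80.0, min_replicas=2, max_replicas=12)
shortfall = needed - allowed
print(f"needed={needed} allowed={allowed} shortfall={shortfall}")
```

In this sketch, the load calls for 30 replicas, but the configured maximum stops the scale-out at 12. That gap between what the load requires and what the limit allows is the kind of finding the technical failure statement should capture.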
Action items
Ensuring we have the proper capacity to move forward and the ability to respond faster...