Turning observability inside out
I remember when a product owner asked me, many years ago, why her product had periodic performance issues. I didn’t have a good answer because in those days it was typical to monitor only the infrastructure, with details limited to the likes of CPU and memory utilization and overall transaction volumes and latency. I could confirm that there was a periodic issue, but there was very little contextual information available to identify the root cause.
The system was also a monolith, which made it difficult to distinguish one transaction type from another. So, I went about instrumenting the code. I used aspect-oriented programming techniques to inject code between the various layers to collect and tag detailed metrics. It took a fair amount of elbow grease, but eventually, I had the information that I needed to find the root cause of various problems and improve the performance of the system.
Today, serverless turns this problem inside out. Each...