Continuous observability
Observability should not be added as an afterthought when service or feature development is over. When implementing a complex feature across several services or just adding a new external dependency, we can’t rely on users telling us when it’s broken. Tests usually don’t cover every aspect and don’t represent user behavior.
If we don’t have a reliable telemetry signal, we can’t say whether the feature works or whether customers use it.
Incorporating observability into the design process
Making sure we have telemetry in place is part of feature design work. The main questions the telemetry should answer are the following:
- Who uses this feature and how much?
- Does it work? Does it break something else?
- Does it work as expected? Does it improve things as expected?
If we can rely on the existing telemetry to answer these questions, awesome!
We should design instrumentation in a way that...