In the previous chapter, we tested, in isolation, services that interact with each other. But when something goes wrong in a real deployment, we need a global overview of what is happening. For example, when a microservice calls another one, which in turn calls a third, it can be hard to tell which one failed. We need to be able to trace all the interactions a particular user had with the system that led to the problem.
Python applications can emit logs to help you debug issues, but jumping from one server to another to gather all the information you need to understand a problem is tedious. Thankfully, we can centralize all the logs to monitor a distributed deployment.
Continuously monitoring services is also important to assess the health of the whole system and follow how everything behaves. This involves answering...