Troubleshooting problems
Troubleshooting a complex distributed system is no picnic. Abstractions, separation of concerns, information hiding, and encapsulation are great during development and testing, and when making changes to the system. But when things go wrong, you need to cross all those boundaries and layers of abstraction, following a user action from the app through the entire stack, all the way down to the infrastructure, passing through business logic, asynchronous processes, legacy systems, and third-party integrations along the way. This is a challenge even with large monolithic systems, and even more so with microservice-based distributed systems. Monitoring will assist you, but let's first talk about preparation, processes, and best practices.
Taking advantage of staging environments
When building a large system, developers work on their local machines (setting aside cloud development environments here), and eventually the code is deployed to the production environment. However...