Further Reading
One of the best sources of information is the website of Brendan Gregg (http://www.brendangregg.com), where he shares an extensive collection of Linux performance documentation, slides, videos, and more. He also publishes a number of useful utilities there. He was the one who taught me, in 2015, that it is important to identify a problem correctly:
- What makes you think that there is a problem?
- Was there a time that there wasn't a problem?
- Has something changed recently?
- Try to describe the problem in measurable technical terms, such as latency, runtime errors, and so on.
- Is it only the application, or are other resources affected as well?
- Come up with an exact description of the environment (hardware, software versions, and configuration).
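The last item in the list above, describing the environment, is easy to script so that the same facts are captured each time a problem is reported. The following is only a minimal sketch using Python's standard library; the function name `describe_environment` and the choice of fields are assumptions made for illustration, and a real report would likely add application versions and configuration as well.

```python
import json
import os
import platform


def describe_environment():
    """Collect basic host facts that belong in a problem description."""
    uname = platform.uname()
    return {
        "hostname": uname.node,
        "os": uname.system,
        "kernel": uname.release,
        "architecture": uname.machine,
        "python": platform.python_version(),
        "cpu_count": os.cpu_count(),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be pasted into a ticket.
    print(json.dumps(describe_environment(), indent=2))
```

Saving this output alongside the problem description also makes it possible to compare the environment against a time when the problem did not occur.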
You also have to consider the following:
- What is causing the load (which process, IP address, and so on)?
- Why was the load triggered (which code path or request)?
- Which resources does the load use?
- Does the load change? If so, how is it changing over time?
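As a starting point for the first question in the list above (which process is causing the load), the sketch below ranks processes by the CPU time recorded in `/proc/<pid>/stat` on Linux. It is an illustration under assumptions, not a replacement for tools such as `top`: the helper names `parse_stat_line` and `top_cpu_consumers` are hypothetical, and the figures are cumulative since each process started, so they do not show what is busy right now. Sampling twice and comparing the results would be one way to see how the load changes over time.

```python
import os


def parse_stat_line(line):
    """Parse one /proc/<pid>/stat line into (pid, command, cpu_ticks)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last ')' instead of whitespace.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    fields = tail.split()
    # fields[0] is the process state (field 3 in proc(5)); utime and stime
    # are fields 14 and 15, which places them at indexes 11 and 12 here.
    utime, stime = int(fields[11]), int(fields[12])
    return int(pid_text), comm, utime + stime


def top_cpu_consumers(limit=5, proc_root="/proc"):
    """Return the processes with the most accumulated CPU ticks."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                results.append(parse_stat_line(handle.read()))
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the line could not be parsed
    results.sort(key=lambda item: item[2], reverse=True)
    return results[:limit]


if __name__ == "__main__":
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    for pid, comm, ticks in top_cpu_consumers():
        print(f"{pid:>7}  {ticks / ticks_per_second:10.2f}s  {comm}")
```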
Last, but not least...