I introduced Valgrind in Chapter 13, Managing Memory, as a way of identifying memory problems using its memcheck tool. Valgrind has other useful tools for application profiling; the two I am going to look at here are Callgrind and Helgrind. Because Valgrind runs the code in a sandbox, it can check the code as it runs and report certain behaviors that native tracers and profilers cannot.
Using Valgrind
Callgrind
Callgrind is a call-graph-generating profiler that can also collect information about processor cache hit rates and branch prediction. It is only useful if your bottleneck is CPU-bound; it does not help when heavy I/O or multiple processes are involved.
Valgrind does not require kernel configuration...