There are two types of profilers available: instrumentation profilers and sampling profilers. One of the better-known instrumentation profilers is Callgrind, part of the Valgrind suite. Instrumentation profilers have a lot of overhead because they need to, well, instrument your code to see which functions you call and how much time each of them takes. This way, the results they produce contain even the smallest functions, but the execution times can be skewed by this overhead. They also have the drawback of not always catching input/output (I/O) slowness and jitter. Because they slow down the execution, they can tell you how often you call a particular function, but they won't tell you if the slowness is due to waiting on a disk read to finish.
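To make the idea of instrumentation more concrete, here is a minimal hand-rolled sketch of what "instrumenting" a function can look like. It is not how Callgrind works internally (Callgrind instruments the running binary rather than the source); the names `CallStats`, `profile_table`, `ScopedProbe`, `tiny_add`, and `sum_to` are hypothetical and exist only for this illustration.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-function record: number of calls and accumulated time.
struct CallStats {
  std::uint64_t calls = 0;
  std::chrono::nanoseconds total{0};
};

// Hypothetical global table of statistics, keyed by function name.
inline std::map<std::string, CallStats>& profile_table() {
  static std::map<std::string, CallStats> table;
  return table;
}

// RAII probe: counts the call on entry and adds the elapsed time on exit.
class ScopedProbe {
 public:
  explicit ScopedProbe(const char* name)
      : stats_(profile_table()[name]),
        start_(std::chrono::steady_clock::now()) {
    ++stats_.calls;
  }

  ~ScopedProbe() {
    stats_.total += std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start_);
  }

  ScopedProbe(const ScopedProbe&) = delete;
  ScopedProbe& operator=(const ScopedProbe&) = delete;

 private:
  CallStats& stats_;  // std::map references stay valid across insertions
  std::chrono::steady_clock::time_point start_;
};

// A very small function: the probe does more work than the addition itself.
int tiny_add(int a, int b) {
  ScopedProbe probe{"tiny_add"};
  return a + b;
}

// A caller whose measured time includes the probe overhead of every callee.
int sum_to(int n) {
  ScopedProbe probe{"sum_to"};
  int total = 0;
  for (int i = 1; i <= n; ++i) {
    total = tiny_add(total, i);
  }
  return total;
}
```

After calling `sum_to(4)`, the table holds exact call counts for both functions, which is the strength of this approach. The time recorded for `tiny_add`, however, is likely dominated by the two clock reads and the map lookup in the probe rather than by the addition, which is one way the overhead can skew the reported execution times.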
Due to the flaws of instrumentation profilers, it's usually better to use sampling profilers instead. Two worth mentioning are the open source perf for profiling on Linux systems and Intel's proprietary tool called...