When looking at the entire system, a good place to start is with a simple tool such as top, which gives you an overview very quickly. It shows you how much memory is being used, which processes are eating CPU cycles, and how that load is spread across the cores and over time.
If top shows that a single application is using up all the CPU cycles in user space, then you can profile that application using perf.
If two or more processes show high CPU usage, something is probably coupling them together, perhaps data communication between them. If a lot of cycles are spent in system calls or in handling interrupts, then there may be an issue with the kernel configuration or with a device driver. In either case, you need to start by taking a profile of the whole system, again using perf.
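To make the split between user space, system calls, and interrupt handling concrete, here is a minimal sketch that computes the same kind of breakdown top summarizes, using the aggregate cpu line of /proc/stat. The function names (parse_cpu_line, cpu_shares, sample_shares) are hypothetical helpers introduced for illustration, and the field order is assumed to follow proc(5); treat it as a rough aid for deciding where to point perf, not as a replacement for top.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat, as described in proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")

# Guest time is already counted inside user and nice, so only the
# first eight fields are summed to get the total elapsed ticks.
COUNTED = FIELDS[:8]


def parse_cpu_line(line):
    """Return {field: ticks} for one 'cpu ...' line from /proc/stat."""
    parts = line.split()
    if not parts or not parts[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    values = [int(v) for v in parts[1:len(FIELDS) + 1]]
    # Older kernels report fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Percentage of elapsed ticks spent in each category between two samples."""
    delta = {name: after[name] - before[name] for name in COUNTED}
    total = sum(delta.values())
    if total <= 0:
        return {name: 0.0 for name in COUNTED}
    return {name: 100.0 * ticks / total for name, ticks in delta.items()}


def sample_shares(interval=1.0, path="/proc/stat"):
    """Read the first line of /proc/stat twice, interval seconds apart."""
    with open(path) as f:
        first = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_line(f.readline())
    return cpu_shares(first, second)
```

Calling sample_shares(1.0) on the target returns a dictionary of percentages. A large user share points toward profiling a single application, whereas large system, irq, or softirq shares suggest taking a system-wide profile.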
If you want to find out more about the kernel and the sequencing of events there...