Why is Metal faster than OpenGL ES?
In late 2013, Apple announced the iPhone 5s. Built into the 5s was the A7 processor, the first 64-bit chip in the iOS device family. It provided a decent graphics boost over prior devices and reflected how quickly GPUs in mobile devices were catching up to gaming consoles released just a few years earlier. OpenGL ES, though a staple among low-level graphics APIs, didn't squeeze the most out of the A7 chip.
As the next diagram shows, the interaction between the CPU and GPU doesn't always work as efficiently as we'd want for our games.
Be it for textures, shaders, or render targets, each draw call uses its own state vector. Working through the low-level API, the CPU spends much of its time verifying the state of each draw call, and this process is very expensive for the CPU. As a result, the GPU sits idle for many cycles, waiting for the CPU to finish its previous instruction. Here's what's taking up all of that time in the API:
State validation: Confirming API usage...