Vertex and fragment shaders are the most important shaders in the whole pipeline, because they expose the basic functionality of the GPU. With vertex shaders, you can compute the geometry of the object you are going to render, as well as other important elements such as the scene's camera, the projection, or how the geometry is clipped. With fragment shaders, you can control how your geometry will look onscreen: colors, lighting, textures, and so on.
As you can see, with only vertex and fragment shaders you can control almost everything in your rendering process, but there is still room for improvement in the OpenGL machinery.
Let's look at an example: suppose that you process point primitives with a complex vertex shader. Using those processed vertices, a geometry shader can create arbitrarily shaped primitives (for instance, quads), using each point as a quad's center. You can then use those quads for a particle system.
During that process you save bandwidth, because you send points instead of quads, which have four times as many vertices. You also save processing power: once a point has been transformed, the quad's four vertices already lie in the same space, so you transform one vertex with a complex shader instead of four.
Unlike vertex and fragment shaders (it is mandatory to have one of each kind to complete the pipeline), the geometry shader is optional. If you do not want to create new geometry after the vertex shader's execution, simply do not link a geometry shader into your application, and the results of the vertex shader will pass unchanged to the clipping stage, which is perfectly fine.
The compute shader stage was the latest addition to the pipeline. It is also optional, like the geometry shader, and is intended for generic computations.
Inside the pipeline, some or all of the following shaders can exist: vertex shaders, fragment shaders, geometry shaders, tessellation shaders (meant to subdivide triangle meshes on the fly, but we are not covering them in this book), and compute shaders. OpenGL evolves every day, so don't be surprised if other shader classes appear and change the pipeline layout from time to time.
Before going deeper into the matter, there is an important concept that we have to talk about: the concept of a shader program. A shader program is nothing more than a working pipeline configuration. This means that, at a minimum, a vertex shader and a fragment shader must have been compiled without errors and linked together. Geometry and compute shaders can form part of a program too, compiled and linked together with the other two shaders into the same shader program.
In order to take your 3D model's coordinates and transform them into clip space, we usually apply the model, view, and projection matrices to the vertices. We can also perform any other kind of data transformation, such as applying noise (from a texture or computed on the fly) to the positions for a pseudorandom displacement, calculating normals, calculating texture coordinates, calculating vertex colors, preparing the data for a normal mapping shader, and so on.
You can do a lot more with this shader; however, its most important task is to output the vertex positions in clip coordinates, taking us to the next stage.
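To make this concrete, here is a minimal vertex shader sketch that applies the model, view, and projection matrices and passes a normal along. The attribute locations and the names modelMatrix, viewMatrix, projectionMatrix, inPosition, inNormal, and worldNormal are illustrative; your application must supply matching data:

```glsl
#version 330 core

// Per-vertex attributes supplied by the application.
layout(location = 0) in vec3 inPosition;
layout(location = 1) in vec3 inNormal;

// Transformation matrices, set as uniforms by the application.
uniform mat4 modelMatrix;
uniform mat4 viewMatrix;
uniform mat4 projectionMatrix;

// Passed to the fragment shader, interpolated across the primitive.
out vec3 worldNormal;

void main()
{
    // Transform the vertex from model space to clip space.
    gl_Position = projectionMatrix * viewMatrix * modelMatrix * vec4(inPosition, 1.0);
    // Rotate the normal into world space (assumes uniform scaling).
    worldNormal = mat3(modelMatrix) * inNormal;
}
```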
Tip
A vertex shader is a piece of code that is executed on the GPU's processors, and it's executed once, and only once, for each vertex you send to the graphics card. So, if you have a 3D model with 1,000 vertices, the vertex shader will be executed 1,000 times, so remember to always keep your calculations simple.
Fragment shaders are responsible for painting each primitive's area. The minimum task of a fragment shader is to output an RGBA color. You can calculate that color by any means: procedurally, from textures, or using the vertex shader's output data. But in the end, you have to output at least a color to the framebuffer.
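As an illustration, here is a minimal fragment shader that computes a simple diffuse color from the interpolated normal. The worldNormal input matches the vertex shader sketch above, and the lightDir uniform is a hypothetical name for a direction supplied by the application:

```glsl
#version 330 core

// Interpolated output from the vertex shader.
in vec3 worldNormal;

// The RGBA color written to the framebuffer.
out vec4 fragColor;

// Illustrative uniform: direction the light travels, set by the application.
uniform vec3 lightDir;

void main()
{
    // Simple Lambertian term: cosine between the normal and the light,
    // clamped so surfaces facing away receive no light.
    float diffuse = max(dot(normalize(worldNormal), -normalize(lightDir)), 0.0);
    fragColor = vec4(vec3(diffuse), 1.0);
}
```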
The execution model of a fragment shader is similar to the vertex shader's. A fragment shader is a piece of code that is executed once, and only once, per fragment. Let us elaborate on this a bit. Suppose that you have a screen with a size of 1,024 x 768. That screen contains 786,432 pixels. Now suppose you paint one quad that covers the whole screen exactly (also known as a full screen quad). This means that your fragment shader will be executed 786,432 times, but the reality is worse. What if you paint several full screen quads (something normal when doing post-processing shaders such as motion blur, glows, or screen space ambient occlusion), or simply many triangles that overlap on the screen? Each time you paint a triangle on the screen, all of its area must be rasterized, so all of the triangle's fragments must be calculated. In practice, a fragment shader is executed millions of times, so optimization in a fragment shader is even more critical than in a vertex shader.
The geometry shader stage is responsible for creating new rendering primitives starting from the output of the vertex shader. A geometry shader is executed once per primitive, which, in the worst case (when it is used to emit point primitives), is as many times as the vertex shader. The best case scenario is when it is used to emit triangles, because then it is executed a third as many times as the vertex shader, but this cost is relative: although the geometry shader's execution itself may be cheap, it always increases the scene's complexity, and that always translates into more computational time spent by the GPU to render the scene.
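As a sketch of the particle example described earlier, the following geometry shader expands each input point into a quad emitted as a triangle strip. For simplicity it offsets the corners directly in clip space, and the halfSize uniform is an illustrative name for the quad's half extent:

```glsl
#version 330 core

// One input point per invocation; four output vertices form a quad.
layout(points) in;
layout(triangle_strip, max_vertices = 4) out;

// Illustrative uniform: half the quad's side length, in clip-space units.
uniform float halfSize;

void main()
{
    // The point's position, already transformed to clip space
    // by the vertex shader.
    vec4 center = gl_in[0].gl_Position;

    // Emit the four corners around the point; the strip order
    // (bottom-left, bottom-right, top-left, top-right) yields two triangles.
    gl_Position = center + vec4(-halfSize, -halfSize, 0.0, 0.0);
    EmitVertex();
    gl_Position = center + vec4( halfSize, -halfSize, 0.0, 0.0);
    EmitVertex();
    gl_Position = center + vec4(-halfSize,  halfSize, 0.0, 0.0);
    EmitVertex();
    gl_Position = center + vec4( halfSize,  halfSize, 0.0, 0.0);
    EmitVertex();
    EndPrimitive();
}
```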
This special kind of shader does not relate directly to a particular part of the pipeline. Compute shaders can be written to affect vertex, fragment, or geometry shaders.
As compute shaders lie in some manner outside the pipeline, they do not have the same constraints as the other kinds of shaders. This makes them ideal for generic computations. Compute shaders are less specific, but have the advantage of having access to all functions (matrix, advanced texture functions, and so on) and data types (vectors, matrices, all texture formats, and vertex buffers) that exist in GLSL, while other GPGPU solutions, such as OpenCL or CUDA, have their own specific data types and do not fit easily with the rendering pipeline.
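As a minimal sketch of such a generic computation, the following compute shader integrates a buffer of particle positions over one frame. The Positions buffer layout and the velocity and deltaTime uniforms are assumptions for illustration; the application would dispatch enough work groups to cover the whole buffer:

```glsl
#version 430 core

// Each work group processes 64 elements.
layout(local_size_x = 64) in;

// Hypothetical shader storage buffer holding particle positions.
layout(std430, binding = 0) buffer Positions
{
    vec4 positions[];
};

uniform vec3 velocity;   // illustrative: one velocity shared by all particles
uniform float deltaTime; // frame time in seconds

void main()
{
    uint i = gl_GlobalInvocationID.x;
    // Guard against extra invocations when the buffer size is not a
    // multiple of the work group size.
    if (i >= uint(positions.length()))
        return;
    // Generic computation: integrate each position; no rasterization involved.
    positions[i].xyz += velocity * deltaTime;
}
```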