OpenGL - Build high performance graphics

Assimilate the ideas shared in the course to utilize the power of OpenGL to perform a wide variety of tasks.

Product type: Course
Published: May 2017
Publisher: Packt
ISBN-13: 9781788296724
Length: 982 pages
Edition: 1st Edition
Authors (3): Muhammad Mobeen Movania, Raymond Chun Hing Lo, William Lo

Chapter 6. GPU-based Alpha Blending and Global Illumination

In this chapter, we will focus on:

  • Implementing order-independent transparency using front-to-back peeling
  • Implementing order-independent transparency with dual depth peeling
  • Implementing screen space ambient occlusion (SSAO)
  • Implementing global illumination using spherical harmonics lighting
  • Implementing GPU-based ray tracing
  • Implementing GPU-based path tracing

Introduction

Even with the introduction of lighting, our virtual objects don't look and feel real. This is because our lights are a simple approximation of the reflection behavior of the surface. A specific category of algorithms, called global illumination methods, helps bridge the gap between real-world and virtual-world lighting. Although these methods have proven expensive to evaluate in real time, newer methods approximate global illumination using clever techniques. One such technique is spherical harmonics lighting, which uses HDR light probes to light a virtual scene that has no light source. The idea is to extract the lighting information from the light probe so that the virtual objects appear to be in the same environment.

In addition, rendering transparent geometry is problematic because it requires sorting the geometry in depth order. As scene complexity increases, it becomes more difficult to maintain the depth order, and the processing overhead also grows. To avoid this and handle alpha blending for order-independent transparency of 3D geometry efficiently, we implement depth peeling and the more efficient dual depth peeling on the modern GPU. All of these techniques will be implemented in the OpenGL 3.3 core profile.

Implementing order-independent transparency using front-to-back peeling

When we have to render translucent geometry, for example, a glass window in a graphics application, care has to be taken to make sure that the geometry is properly rendered in the depth order such that the opaque objects in the scene are rendered first and the transparent objects are rendered last. This unfortunately incurs additional overhead where the CPU is busy sorting objects. In addition, the blending result will be correct only from a specific viewing direction, as shown in the following figure. Note that the image on the left is the result if we view from the direction of the Z axis. There is no blending at all in the left image. If the same scene is viewed from the opposite side, we can see the correct alpha blending result.
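
The order dependence described above can be checked with a little arithmetic on the CPU. The following is a minimal sketch, not code from the recipe: the RGB struct and the blendOver helper are hypothetical names, and the formula assumes the conventional blend setup glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) applied to the color channels.

```cpp
// Hypothetical helper type for this sketch: a plain RGB color.
struct RGB { float r, g, b; };

// Conventional "over" blend for the color channels, assuming
// glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA):
// result = src * srcAlpha + dst * (1 - srcAlpha)
RGB blendOver(const RGB& src, float srcAlpha, const RGB& dst) {
    const float k = 1.0f - srcAlpha;
    return { src.r * srcAlpha + dst.r * k,
             src.g * srcAlpha + dst.g * k,
             src.b * srcAlpha + dst.b * k };
}
```

Drawing a half-transparent red surface and then a half-transparent green one over a black background gives a different color than drawing them in the opposite order, which is why unsorted alpha blending looks correct from one viewing direction only.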

Depth peeling (also called front-to-back peeling) is one technique that helps in this process. In this technique, the scene is rendered in slices in such a way that slices are rendered one after another from front to back until the whole object is processed, as shown in the following figure, which is a 2D side view of the same scene as in the previous figure.

The number of layers to use for peeling is dependent on the depth complexity of the scene. This recipe will show how to implement this technique in modern OpenGL.
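
To make the idea of peeling concrete, here is a minimal CPU sketch for a single pixel. It is an illustration under stated assumptions rather than the recipe's GPU code: peelLayers is a hypothetical function, the input is the list of depths of all fragments that cover the pixel, and each pass keeps the nearest fragment that lies strictly behind the layer peeled in the previous pass.

```cpp
#include <limits>
#include <vector>

// Returns the depths of one pixel's fragments in the order in which
// front-to-back peeling would extract them, limited to maxLayers passes.
std::vector<float> peelLayers(const std::vector<float>& fragmentDepths,
                              int maxLayers) {
    std::vector<float> layers;
    float lastPeeled = -std::numeric_limits<float>::infinity();
    for (int pass = 0; pass < maxLayers; ++pass) {
        float nearest = std::numeric_limits<float>::infinity();
        bool found = false;
        for (float z : fragmentDepths) {
            if (z <= lastPeeled) continue;  // already peeled in an earlier pass
            if (z < nearest) {              // ordinary "nearest wins" depth test
                nearest = z;
                found = true;
            }
        }
        if (!found) break;                  // nothing left behind the last layer
        layers.push_back(nearest);
        lastPeeled = nearest;
    }
    return layers;
}
```

With fragment depths 0.7, 0.2, and 0.5, six passes recover all three layers in front-to-back order, while two passes stop after 0.2 and 0.5. This is why the pass count has to match the depth complexity of the scene.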

Getting ready

The code for this recipe is contained in the Chapter6/FrontToBackPeeling directory.

How to do it…

Let us start our recipe by following these simple steps:

  1. Set up two frame buffer objects (FBOs), each with a color attachment and a depth attachment. For this recipe, we will use rectangle textures (GL_TEXTURE_RECTANGLE) since they enable easier handling of images (samplers) in the fragment shader. With rectangle textures, we can access texture values using pixel positions directly. With a normal texture (GL_TEXTURE_2D), we would have to normalize the texture coordinates.
      glGenFramebuffers(2, fbo);
      glGenTextures(2, texID);
      glGenTextures(2, depthTexID);
      for (int i = 0; i < 2; i++) {
        //depth texture of FBO i
        glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[i]);
        //set texture parameters like minification etc.
        glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_DEPTH_COMPONENT32F, WIDTH, HEIGHT, 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
        //color texture of FBO i
        glBindTexture(GL_TEXTURE_RECTANGLE, texID[i]);
        //set texture parameters like minification etc.
        glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, NULL);
        glBindFramebuffer(GL_FRAMEBUFFER, fbo[i]);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_RECTANGLE, depthTexID[i], 0);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_RECTANGLE, texID[i], 0);
      }
      //color texture of the color blending FBO
      glGenTextures(1, &colorBlenderTexID);
      glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
      //set texture parameters like minification etc.
      glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, 0);
  2. Set up another FBO for color blending and check it for completeness. The color blending FBO uses the depth texture from the first FBO as its depth attachment, as it uses the depth output from the first step during blending.
      glGenFramebuffers(1, &colorBlenderFBOID);
      glBindFramebuffer(GL_FRAMEBUFFER, colorBlenderFBOID);
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_RECTANGLE, depthTexID[0], 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_RECTANGLE, colorBlenderTexID, 0);
      GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
      if(status == GL_FRAMEBUFFER_COMPLETE )
        printf("FBO setup successful !!! \n");
      else
        printf("Problem with FBO setup");
    
      glBindFramebuffer(GL_FRAMEBUFFER, 0);
  3. In the rendering function, set the color blending FBO as the current render target and then render the scene normally with depth testing enabled.
      glBindFramebuffer(GL_FRAMEBUFFER, colorBlenderFBOID);
      glDrawBuffer(GL_COLOR_ATTACHMENT0);
      glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
      glEnable(GL_DEPTH_TEST);
      DrawScene(MVP, cubeShader);
  4. Next, bind the two peeling FBOs alternately, clear the render target, enable depth testing, and disable alpha blending. This renders the nearest remaining surface into the offscreen render target. The number of passes dictates the number of layers the given geometry is peeled into: the more passes, the more complete the depth peeling result. For the demo in this recipe, the number of passes is set to 6, but the right value depends on the depth complexity of the scene. If the bUseOQ flag is set, an occlusion query is used to find the number of samples output by the depth peeling step.
      int numLayers = (NUM_PASSES - 1) * 2;
      for (int layer = 1; bUseOQ || layer < numLayers; layer++) {
        int currId = layer % 2;
        int prevId = 1 - currId;
        glBindFramebuffer(GL_FRAMEBUFFER, fbo[currId]);
        glDrawBuffer(GL_COLOR_ATTACHMENT0);
        glClearColor(0, 0, 0, 0);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        glDisable(GL_BLEND);
        glEnable(GL_DEPTH_TEST);
        if (bUseOQ) {
          glBeginQuery(GL_SAMPLES_PASSED_ARB, queryId);
        }
        //the loop body continues in steps 5 and 6
  5. Bind the depth texture written by the previous pass (on the first iteration, this is the depth texture filled in step 3) so that the shader can compare against the nearest fragment found so far, and then render the scene with the front peeling shaders. Refer to Chapter6/FrontToBackPeeling/shaders/front_peel.{vert,frag} for details. We then end the hardware query if the query was initiated.
        glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[prevId]);
        DrawScene(MVP, frontPeelShader);
        if (bUseOQ) {
          glEndQuery(GL_SAMPLES_PASSED_ARB);
        }
  6. Bind the color blender FBO again, disable depth testing, and enable additive blending; however, specify separate blending so that the color and alpha can be blended separately. Finally, bind the rendered output from step 5 and then, using a full-screen quad and the blend shader (Chapter6/FrontToBackPeeling/shaders/blend.{vert,frag}), blend the whole scene.
        glBindFramebuffer(GL_FRAMEBUFFER, colorBlenderFBOID);
        glDrawBuffer(GL_COLOR_ATTACHMENT0);
        glDisable(GL_DEPTH_TEST);
        glEnable(GL_BLEND);
        glBlendEquation(GL_FUNC_ADD);
        glBlendFuncSeparate(GL_DST_ALPHA, GL_ONE, GL_ZERO, GL_ONE_MINUS_SRC_ALPHA);
        glBindTexture(GL_TEXTURE_RECTANGLE, texID[currId]);
        blendShader.Use();
          DrawFullScreenQuad();
        blendShader.UnUse();
        glDisable(GL_BLEND);
      } //end of the peeling loop opened in step 4
  7. In the final step, restore the default draw buffer (GL_BACK_LEFT) and disable alpha blending and depth testing. Use a full-screen quad and a final shader (Chapter6/FrontToBackPeeling/shaders/final.frag) to blend the output from the color blending FBO.
      glBindFramebuffer(GL_FRAMEBUFFER, 0);
      glDrawBuffer(GL_BACK_LEFT);
      glDisable(GL_DEPTH_TEST);	
      glDisable(GL_BLEND);
    
      glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
      finalShader.Use();
        glUniform4fv(finalShader("vBackgroundColor"), 1, &bg.x);
        DrawFullScreenQuad();
      finalShader.UnUse();
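
The ping-pong indexing in step 4 can be traced on the CPU. The sketch below is an illustration only: pingPongSchedule is a hypothetical helper that mirrors the loop header from step 4 with the occlusion query disabled (bUseOQ is assumed to be false) and records which FBO each pass renders into (currId) and which depth texture it reads (prevId).

```cpp
#include <utility>
#include <vector>

// Returns (currId, prevId) for every peeling pass, mirroring
// "for (layer = 1; layer < numLayers; layer++)" from step 4.
std::vector<std::pair<int, int>> pingPongSchedule(int numPasses) {
    std::vector<std::pair<int, int>> schedule;
    const int numLayers = (numPasses - 1) * 2;
    for (int layer = 1; layer < numLayers; ++layer) {
        const int currId = layer % 2;    // FBO rendered into on this pass
        const int prevId = 1 - currId;   // depth texture read on this pass
        schedule.emplace_back(currId, prevId);
    }
    return schedule;
}
```

With the demo's value of 6 passes, the loop runs nine times. The first iteration renders into fbo[1] and reads depthTexID[0], which is the depth texture attached to the color blending FBO in step 2 and filled by the normal scene render in step 3.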

How it works…

The front-to-back depth peeling works in three steps. First, the scene is rendered normally into the color blend FBO with depth testing enabled. This ensures that the depth values of the nearest surface are stored in the depth attachment of the FBO. In the second step, we bind one of the two peeling FBOs, bind the depth texture from the previous pass (initially, the one from the first step), and then iteratively clip parts of the geometry by using a fragment shader (see Chapter6/FrontToBackPeeling/shaders/front_peel.frag) as shown in the following code snippet:

#version 330 core
layout(location = 0) out vec4 vFragColor;
uniform vec4 vColor;
uniform sampler2DRect  depthTexture;
void main() {
  float frontDepth = texture(depthTexture, gl_FragCoord.xy).r;
  if(gl_FragCoord.z <= frontDepth)
    discard;
  vFragColor = vColor;
}

This shader simply compares the incoming fragment's depth against the depth value stored in the depth texture. If the current fragment's depth is less than or equal to the depth in the depth texture, the fragment belongs to a layer that has already been peeled, so it is discarded. Otherwise, the fragment color is output.

float frontDepth = texture(depthTexture, gl_FragCoord.xy).r;
if(gl_FragCoord.z <= frontDepth)
  discard;

After this step, we bind the color blend FBO, disable the depth test, and enable alpha blending with separate blend factors for the color and alpha channels. The glBlendFuncSeparate function is used here because it lets us set the source and destination factors for the RGB and alpha channels independently. The first parameter, the source RGB factor, is GL_DST_ALPHA, so the incoming fragment's color is scaled by the alpha value already stored in the frame buffer. The second parameter, the destination RGB factor, is GL_ONE, which keeps the color in the destination as is, so the scaled source color is added to it. The third parameter, the source alpha factor, is GL_ZERO, which removes the source alpha component, as we already applied the alpha from the destination using the first parameter. The final parameter, the destination alpha factor, is the conventional over-compositing value (GL_ONE_MINUS_SRC_ALPHA), so the stored alpha is multiplied by one minus the source alpha.
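
The effect of these four blend factors can be emulated on the CPU. This is a minimal sketch, not part of the recipe's code: the RGBA struct and the blendUnder function are hypothetical names, and the example values assume that colors are premultiplied by their alpha and that the destination alpha holds the fraction of light still passing through the layers accumulated so far.

```cpp
// Hypothetical helper type for this sketch: an RGBA color.
struct RGBA { float r, g, b, a; };

// Emulates glBlendEquation(GL_FUNC_ADD) combined with
// glBlendFuncSeparate(GL_DST_ALPHA, GL_ONE, GL_ZERO, GL_ONE_MINUS_SRC_ALPHA):
//   rgb   = src.rgb * dst.a + dst.rgb * 1
//   alpha = src.a   * 0     + dst.a   * (1 - src.a)
RGBA blendUnder(const RGBA& src, const RGBA& dst) {
    return { src.r * dst.a + dst.r,
             src.g * dst.a + dst.g,
             src.b * dst.a + dst.b,
             dst.a * (1.0f - src.a) };
}
```

For example, if the frame buffer already holds a half-transparent red layer as (0.5, 0, 0) with alpha 0.5 and the next peeled layer is half-transparent green, (0, 0.5, 0) with alpha 0.5, the result is (0.5, 0.25, 0) with alpha 0.25. The nearer red layer keeps its full contribution while the green layer behind it is attenuated.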

We then bind the texture from the previous step output and then use the blend shader (see Chapter6/FrontToBackPeeling/shaders/blend.frag) on a full-screen quad to alpha blend the current fragments with the existing fragments on the frame buffer. The blend shader is defined as follows:

#version 330 core
uniform sampler2DRect tempTexture; 
layout(location = 0) out vec4 vFragColor;
void main() {
  vFragColor = texture(tempTexture, gl_FragCoord.xy);
}

The tempTexture sampler contains the color output of the current depth peeling pass (texID[currId]), and blending it adds that layer to the colorBlenderFBO attachment. After this step, alpha blending is disabled, as shown in the code snippet in step 6 of the How to do it... section.

In the final step, the default draw buffer is restored, depth testing and alpha blending are disabled, and the final output from the color blend FBO is blended with the background color using a simple fragment shader. The code snippet is shown in step 7 of the How to do it... section. The final fragment shader is defined as follows:

#version 330 core
uniform sampler2DRect colorTexture;
uniform vec4 vBackgroundColor;
layout(location = 0) out vec4 vFragColor;
void main() {
  vec4 color = texture(colorTexture, gl_FragCoord.xy);
  vFragColor = color + vBackgroundColor*color.a;
}

The final shader takes the front peeled result and blends it with the background color using the alpha value from the front peeled result. This way, rather than taking only the nearest fragment, all fragments are taken into consideration, which gives a correctly blended result.
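
The arithmetic of the final shader can be reproduced on the CPU as a quick check. The sketch below is an illustration with hypothetical names (RGBA, compositeBackground); it applies the same component-wise formula as the shader, color + vBackgroundColor * color.a.

```cpp
// Hypothetical helper type for this sketch: an RGBA color.
struct RGBA { float r, g, b, a; };

// Same formula as the final fragment shader:
// vFragColor = color + vBackgroundColor * color.a
RGBA compositeBackground(const RGBA& color, const RGBA& background) {
    return { color.r + background.r * color.a,
             color.g + background.g * color.a,
             color.b + background.b * color.a,
             color.a + background.a * color.a };
}
```

With a peeled result of (0.5, 0.25, 0) and alpha 0.25 over an opaque white background, the output color is (0.75, 0.5, 0.25). The background shows through only in proportion to the alpha left over after all layers have been blended.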

There's more…

The output from the demo application for this recipe renders 27 translucent cubes at the origin. The camera position can be changed using the left mouse button. The front-to-back depth peeling gives the following output. Note the blended color, for example, the yellow color where the green boxes overlay the red ones.

Pressing the Space bar disables front-to-back peeling so that we can see the normal alpha blending without back-to-front sorting which gives the following output. Note that we do not see the yellow blended color where the green and red boxes overlap.

Even though the output produced by front-to-back peeling is correct, it requires multiple passes through the geometry, which incurs additional processing overhead. The next recipe details the more robust method called dual depth peeling, which tackles this problem.

See also

How to do it…

Let us start our recipe by following these simple steps:

Set up two frame buffer objects (FBOs) with two color and depth attachments. For this recipe, we will use rectangle textures (GL_TEXTURE_RECTANGLE) since they enable easier handling of images (samplers) in the fragment shader. With rectangle textures we can access texture values using pixel positions directly. In case of normal texture (GL_TEXTUR_2D), we have to normalize the texture coordinates.
glGenFramebuffers(2, fbo); 
  glGenTextures (2, texID);
  glGenTextures (2, depthTexID);
  for(int i=0;i<2;i++) {
    glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[i]);
    //set texture parameters like minification etc. 
    glTexImage2D(GL_TEXTURE_RECTANGLE , 0, GL_DEPTH_COMPONENT32F, WIDTH, HEIGHT, 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
    glBindTexture(GL_TEXTURE_RECTANGLE,texID[i]);
    //set texture parameters like minification etc. glTexImage2D(GL_TEXTURE_RECTANGLE , 0,GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, NULL);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo[i]);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_RECTANGLE, depthTexID[i], 0);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_RECTANGLE,texID[i], 0);
  }
  glGenTextures(1, &colorBlenderTexID);
  glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
  //set texture parameters like minification etc.
  glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, 0);
Set another
  1. FBO for color blending and check the FBO for completeness. The color blending FBO uses the depth texture from the first FBO as a depth attachment, as it uses the depth output from the first step during blending.
      glGenFramebuffers(1, &colorBlenderFBOID);
      glBindFramebuffer(GL_FRAMEBUFFER, colorBlenderFBOID);
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_RECTANGLE, depthTexID[0], 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_RECTANGLE, colorBlenderTexID, 0);
      GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
      if(status == GL_FRAMEBUFFER_COMPLETE )
        printf("FBO setup successful !!! \n");
      else
        printf("Problem with FBO setup");
    
      glBindFramebuffer(GL_FRAMEBUFFER, 0);
  2. In the rendering function, set the color blending FBO as the current render target and then render the scene normally with depth testing enabled.
    glBindFramebuffer(GL_FRAMEBUFFER, colorBlenderFBOID);
    glDrawBuffer(GL_COLOR_ATTACHMENT0);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );
    glEnable(GL_DEPTH_TEST);
    DrawScene(MVP, cubeShader);
  3. Next, bind the other FBO pair alternatively, clear the render target, and enable depth testing, but disable alpha blending. This is to render the nearest surface in the offscreen render target. The number of passes dictate the number of layers the given geometry is peeled into. The more the number of passes, the more continuous the depth peeling result. For the demo in this recipe, the number of passes is set as 6. The number of passes is dependent on the depth complexity of the scene. If the user wants to check the number of samples output from the depth peeling step, then based on the value of the flag (bUseOQ) an occlusion query is used to find the number of samples output from the depth peeling step.
      int numLayers = (NUM_PASSES - 1) * 2;
      for (int layer = 1; bUseOQ || layer < numLayers; layer++) {
        int currId = layer % 2;
        int prevId = 1 - currId;
        glBindFramebuffer(GL_FRAMEBUFFER, fbo[currId]);
        glDrawBuffer(GL_COLOR_ATTACHMENT0);
        glClearColor(0, 0, 0, 0);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        glDisable(GL_BLEND);
        glEnable(GL_DEPTH_TEST);
    if (bUseOQ) {
          glBeginQuery(GL_SAMPLES_PASSED_ARB, queryId);
    }
  4. Bind the depth texture from the first step so that the nearest fragment can be used with the attached shaders and then render the scene with the front peeling shaders. Refer to Chapter6/FrontToBackPeeling/shaders/front_peel.{vert,frag} for details. We then end the hardware query if the query was initiated.
      glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[prevId]);
      DrawScene(MVP, frontPeelShader);
      if (bUseOQ) {
        glEndQuery(GL_SAMPLES_PASSED_ARB);
      }
  5. Bind the color blender FBO again, disable depth testing, and enable additive blending; however, specify separate blending so that the color and alpha can be blended separately. Finally, bind the rendered output from step 5 and then using a full-screen quad and the blend shader (Chapter6/FrontToBackPeeling/shaders/blend. {vert,frag}), blend the whole scene.
      glBindFramebuffer(GL_FRAMEBUFFER, colorBlenderFBOID);
      glDrawBuffer(GL_COLOR_ATTACHMENT0);
      glDisable(GL_DEPTH_TEST);
      glEnable(GL_BLEND);
      glBlendEquation(GL_FUNC_ADD);
      glBlendFuncSeparate(GL_DST_ALPHA, GL_ONE,GL_ZERO, GL_ONE_MINUS_SRC_ALPHA);
      glBindTexture(GL_TEXTURE_RECTANGLE, texID[currId]);
      blendShader.Use();
        DrawFullScreenQuad();
      blendShader.UnUse();
      glDisable(GL_BLEND);
  6. In the final step, restore the default draw buffer (GL_BACK_LEFT) and disable alpha blending and depth testing. Use a full-screen quad and a final shader (Chapter6/FrontToBackPeeling/shaders/final.frag) to blend the output from the color blending FBO.
      glBindFramebuffer(GL_FRAMEBUFFER, 0);
      glDrawBuffer(GL_BACK_LEFT);
      glDisable(GL_DEPTH_TEST);	
      glDisable(GL_BLEND);
    
      glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
      finalShader.Use();
        glUniform4fv(finalShader("vBackgroundColor"), 1, &bg.x);
        DrawFullScreenQuad();
      finalShader.UnUse();

How it works…

The front-to-back depth peeling works in three steps. First, the scene is rendered normally on a depth FBO with depth testing enabled. This ensures that the scene depth values are stored in the depth attachment of the FBO. In the second pass, we bind the depth FBO, bind the depth texture from the first step, and then iteratively clip parts of the geometry by using a fragment shader (see Chapter6/FrontToBackPeeling/shaders/front_peel.frag) as shown in the following code snippet:

#version 330 core
layout(location = 0) out vec4 vFragColor;
uniform vec4 vColor;
uniform sampler2DRect  depthTexture;
void main() {
  float frontDepth = texture(depthTexture, gl_FragCoord.xy).r;
  if(gl_FragCoord.z <= frontDepth)
    discard;
  vFragColor = vColor;
}

This shader simply compares the incoming fragment's depth against the depth value stored in the depth texture. If the current fragment's depth is less than or equal to the depth in the depth texture, the fragment is discarded. Otherwise, the fragment color is output.

float frontDepth = texture(depthTexture, gl_FragCoord.xy).r;
if(gl_FragCoord.z <= frontDepth)
  discard;
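To see why this test peels one layer per pass, the following is a minimal CPU-side sketch for a single pixel, not the recipe's GPU code. It assumes window-space depths in [0, 1], a GL_LESS-style depth test that keeps the nearest surviving fragment, and a sentinel of -1 for "nothing peeled yet"; the function name peelFrontToBack is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Simulates front-to-back peeling for ONE pixel (illustrative only).
// Each pass discards fragments at or in front of the previously peeled
// depth (the shader's "<=" test) and keeps the nearest remaining one
// (what the depth test would do on the GPU).
std::vector<float> peelFrontToBack(const std::vector<float>& fragmentDepths,
                                   int maxPasses) {
    assert(maxPasses >= 0);
    std::vector<float> peeled;
    float previousLayerDepth = -1.0f;  // assumption: sentinel, nothing peeled yet
    for (int pass = 0; pass < maxPasses; ++pass) {
        bool found = false;
        float nearest = 0.0f;
        for (float z : fragmentDepths) {
            if (z <= previousLayerDepth)
                continue;  // corresponds to "discard" in front_peel.frag
            if (!found || z < nearest) {
                nearest = z;
                found = true;
            }
        }
        if (!found)
            break;  // no fragments left to peel
        peeled.push_back(nearest);
        previousLayerDepth = nearest;
    }
    return peeled;
}
```

Calling peelFrontToBack with the depths 0.7, 0.2, and 0.5 returns the layers in nearest-first order, regardless of the order in which the fragments were submitted.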

After this step, we bind the color blend FBO, disable depth test, and then enable alpha blending with separate blending of colors and alpha values. The glBlendFuncSeparate function is used here as it enables us to handle color and alpha channels for source and destination separately. The first parameter is the source RGB factor, which is set to GL_DST_ALPHA so that the incoming fragment color is scaled by the alpha value of the pixel already in the frame buffer. The second parameter, that is, the destination RGB factor, is set as GL_ONE, which keeps the color in the destination as is, so the scaled incoming color is added to it. The third parameter, the source alpha factor, is set as GL_ZERO, which removes the source alpha component as we already applied the alpha from the destination using the first parameter. The final parameter, that is, the destination alpha factor, is set as GL_ONE_MINUS_SRC_ALPHA, so the alpha stored in the frame buffer is multiplied by one minus the incoming alpha.
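The arithmetic performed by this blend state can be written out on the CPU. The sketch below mirrors GL_FUNC_ADD with the factors (GL_DST_ALPHA, GL_ONE, GL_ZERO, GL_ONE_MINUS_SRC_ALPHA) for one pixel. The Rgba struct and the blendUnder name are hypothetical, and the interpretation in the comments (destination alpha acting as the remaining transmittance, starting at 1) is an assumption about how the buffers are initialized.

```cpp
#include <cassert>

struct Rgba {
    float r, g, b, a;
};

// CPU equivalent of:
//   glBlendEquation(GL_FUNC_ADD);
//   glBlendFuncSeparate(GL_DST_ALPHA, GL_ONE, GL_ZERO, GL_ONE_MINUS_SRC_ALPHA);
// rgb_out = src.rgb * dst.a + dst.rgb * 1
// a_out   = src.a   * 0     + dst.a   * (1 - src.a)
Rgba blendUnder(const Rgba& src, const Rgba& dst) {
    assert(src.a >= 0.0f && src.a <= 1.0f);
    Rgba out;
    out.r = src.r * dst.a + dst.r;
    out.g = src.g * dst.a + dst.g;
    out.b = src.b * dst.a + dst.b;
    out.a = dst.a * (1.0f - src.a);  // assumed meaning: light still passing through
    return out;
}
```

With a destination that starts as (0, 0, 0, 1), blending a half-transparent red layer and then a half-transparent green layer behind it leaves (0.5, 0.25, 0, 0.25) in the buffer: the second layer contributes only through the transmittance left by the first.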

We then bind the texture from the previous step output and then use the blend shader (see Chapter6/FrontToBackPeeling/shaders/blend.frag) on a full-screen quad to alpha blend the current fragments with the existing fragments on the frame buffer. The blend shader is defined as follows:

#version 330 core
uniform sampler2DRect tempTexture; 
layout(location = 0) out vec4 vFragColor;
void main() {
  vFragColor = texture(tempTexture, gl_FragCoord.xy);
}

The tempTexture sampler contains the output from the depth peeling step, which is blended into the colorBlenderFBO attachment. After this step, alpha blending is disabled, as shown in the code snippet in step 5 of the How to do it... section.

In the final step, the default draw buffer is restored, depth testing and alpha blending are disabled, and the final output from the color blend FBO is blended with the background color using a simple fragment shader. The code snippet is as shown in step 6 of the How to do it... section. The final fragment shader is defined as follows:

#version 330 core
uniform sampler2DRect colorTexture;
uniform vec4 vBackgroundColor;
layout(location = 0) out vec4 vFragColor;
void main() {
  vec4 color = texture(colorTexture, gl_FragCoord.xy);
  vFragColor = color + vBackgroundColor*color.a;
}

The final shader takes the front peeled result and blends it with the background color using the alpha value from the front peeled result. This way, rather than taking only the fragment nearest to the camera, all fragments are taken into consideration, which gives a correctly blended result.
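As a quick check of the final shader's arithmetic, here is a small CPU sketch of the expression color + vBackgroundColor*color.a for one pixel. The Rgba struct and the compositeOverBackground name are hypothetical; the sketch simply mirrors the shader line and assumes the alpha channel lies in [0, 1].

```cpp
#include <cassert>

struct Rgba {
    float r, g, b, a;
};

// Mirrors final.frag: vFragColor = color + vBackgroundColor * color.a;
// "peeled" is the value read from the color blend texture for this pixel.
Rgba compositeOverBackground(const Rgba& peeled, const Rgba& background) {
    assert(peeled.a >= 0.0f && peeled.a <= 1.0f);  // assumption for this sketch
    Rgba out;
    out.r = peeled.r + background.r * peeled.a;
    out.g = peeled.g + background.g * peeled.a;
    out.b = peeled.b + background.b * peeled.a;
    out.a = peeled.a + background.a * peeled.a;
    return out;
}
```

For example, a peeled value of (0.5, 0.25, 0, 0.25) over a white background gives an RGB of (0.75, 0.5, 0.25), which is the same color that back-to-front blending of a 50% red layer over a 50% green layer over white would produce.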

There's more…

The output from the demo application for this recipe renders 27 translucent cubes at the origin. The camera position can be changed using the left mouse button. The front-to-back depth peeling gives the following output. Note the blended color, for example, the yellow color where the green boxes overlay the red ones.

Pressing the Space bar disables front-to-back peeling so that we can see the normal alpha blending without back-to-front sorting, which gives the following output. Note that we do not see the yellow blended color where the green and red boxes overlap.

Even though the output produced by front-to-back peeling is correct, it requires multiple passes through the geometry, which incurs additional processing overhead. The next recipe details a more efficient method called dual depth peeling, which tackles this problem.

See also

Interactive Order-Independent Transparency, Cass Everitt:

Implementing order-independent transparency using dual depth peeling

In this recipe, we will implement dual depth peeling. The main idea behind this method is to peel two depth layers in each pass: one from the front and one from the back. This produces the same output as front-to-back peeling with fewer passes over the geometry, and therefore better performance.

Getting ready

The code for this recipe is contained in the Chapter6/DualDepthPeeling folder.

How to do it…

The steps required to implement dual depth peeling are as follows:

  1. Create an FBO and attach six textures in all: two for storing the front buffer, two for storing the back buffer, and two for storing the depth buffer values.
    glGenFramebuffers(1, &dualDepthFBOID); 
    glGenTextures (2, texID);
    glGenTextures (2, backTexID);
    glGenTextures (2, depthTexID);
    for(int i=0;i<2;i++) {
      glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[i]);
      //set texture parameters
      glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_FLOAT_RG32_NV, WIDTH, HEIGHT, 0, GL_RGB, GL_FLOAT, NULL);
      glBindTexture(GL_TEXTURE_RECTANGLE, texID[i]);
      //set texture parameters
      glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, NULL);
      glBindTexture(GL_TEXTURE_RECTANGLE, backTexID[i]);
      //set texture parameters
      glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, NULL);
    }
  2. Bind the six textures to the appropriate attachment points on the FBO.
    glBindFramebuffer(GL_FRAMEBUFFER, dualDepthFBOID);	
    for(int i=0;i<2;i++) {
      glFramebufferTexture2D(GL_FRAMEBUFFER, attachID[i], GL_TEXTURE_RECTANGLE, depthTexID[i], 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, attachID[i]+1, GL_TEXTURE_RECTANGLE, texID[i], 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, attachID[i]+2, GL_TEXTURE_RECTANGLE, backTexID[i], 0);
    }
  3. Create another FBO for color blending and attach a new texture to it. Also attach this texture to the first FBO (at GL_COLOR_ATTACHMENT6) and check the FBO completeness.
      glGenTextures(1, &colorBlenderTexID);
      glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
      //set texture parameters
      glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, 0);
      glGenFramebuffers(1, &colorBlenderFBOID);
      glBindFramebuffer(GL_FRAMEBUFFER, colorBlenderFBOID);
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_RECTANGLE, colorBlenderTexID, 0);
      //bind the first FBO so that the same texture is also attached to it
      glBindFramebuffer(GL_FRAMEBUFFER, dualDepthFBOID);
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT6, GL_TEXTURE_RECTANGLE, colorBlenderTexID, 0);
      GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
      if(status == GL_FRAMEBUFFER_COMPLETE )
        printf("FBO setup successful !!! \n");
      else
        printf("Problem with FBO setup\n");
      glBindFramebuffer(GL_FRAMEBUFFER, 0);
  4. In the render function, first disable depth testing and enable blending, and then bind the dual depth peeling FBO. Set the render targets attached to GL_COLOR_ATTACHMENT1 and GL_COLOR_ATTACHMENT2 as the draw buffers and clear them.
    glDisable(GL_DEPTH_TEST);
    glEnable(GL_BLEND);
    glBindFramebuffer(GL_FRAMEBUFFER, dualDepthFBOID);
    glDrawBuffers(2, &drawBuffers[1]);
    glClearColor(0, 0, 0, 0);
    glClear(GL_COLOR_BUFFER_BIT);
  5. Next, set GL_COLOR_ATTACHMENT0 as the draw buffer, clear it, enable min/max blending (glBlendEquation(GL_MAX)), and initialize the color attachment using the initialization fragment shader (see Chapter6/DualDepthPeeling/shaders/dual_init.frag). This completes the first step of dual depth peeling, that is, initialization of the buffers.
    glDrawBuffer(drawBuffers[0]);
    glClearColor(-MAX_DEPTH, -MAX_DEPTH, 0, 0);
    glClear(GL_COLOR_BUFFER_BIT);
    glBlendEquation(GL_MAX);
    DrawScene(MVP, initShader);
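  6. Next, set GL_COLOR_ATTACHMENT6 as the draw buffer and clear it with the background color. Then, run a loop that alternates between the two sets of draw buffers. In each pass, clear the current set, enable min/max blending, bind the depth and front textures written in the previous pass, and draw the scene again with the dual peeling shader. The loop body continues in the next step.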
    glDrawBuffer(drawBuffers[6]);
    glClearColor(bg.x, bg.y, bg.z, bg.w);
    glClear(GL_COLOR_BUFFER_BIT);
    int numLayers = (NUM_PASSES - 1) * 2;
    int currId = 0;
    for (int layer = 1; bUseOQ || layer < numLayers; layer++) {
      currId = layer % 2;
      int prevId = 1 - currId;
      int bufId = currId * 3;
      glDrawBuffers(2, &drawBuffers[bufId+1]);
      glClearColor(0, 0, 0, 0);
      glClear(GL_COLOR_BUFFER_BIT);
      glDrawBuffer(drawBuffers[bufId+0]);
      glClearColor(-MAX_DEPTH, -MAX_DEPTH, 0, 0);
      glClear(GL_COLOR_BUFFER_BIT);
      glDrawBuffers(3, &drawBuffers[bufId+0]);
      glBlendEquation(GL_MAX);
      glActiveTexture(GL_TEXTURE0);
      glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[prevId]);
      glActiveTexture(GL_TEXTURE1);
      glBindTexture(GL_TEXTURE_RECTANGLE, texID[prevId]);
      DrawScene(MVP, dualPeelShader, true,true);
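  7. Next, enable additive blending (glBlendEquation(GL_FUNC_ADD)) and then draw a full-screen quad with the blend shader. Together with the previous step, this peels away fragments from the front as well as the back layer of the rendered geometry and blends the peeled back layer onto the current draw buffer. If occlusion queries are in use (bUseOQ), end the query after drawing the quad and leave the loop once no samples pass.
      glDrawBuffer(drawBuffers[6]);
      glBlendEquation(GL_FUNC_ADD);
      glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
      if (bUseOQ) {
        glBeginQuery(GL_SAMPLES_PASSED_ARB, queryId);
      }
      glActiveTexture(GL_TEXTURE0);
      glBindTexture(GL_TEXTURE_RECTANGLE, backTexID[currId]);
      blendShader.Use();
        DrawFullScreenQuad();
      blendShader.UnUse();
      if (bUseOQ) {
        //stop peeling when the blend pass wrote no samples
        glEndQuery(GL_SAMPLES_PASSED_ARB);
        GLuint sampleCount = 0;
        glGetQueryObjectuiv(queryId, GL_QUERY_RESULT, &sampleCount);
        if (sampleCount == 0)
          break;
      }
    }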
  8. In the final step, we unbind the FBO and enable rendering on the default back buffer (GL_BACK_LEFT). Next, we bind the outputs from the depth peeling and blending steps to their appropriate texture location. Finally, we use a final blending shader to combine the two peeled and blended fragments.
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glDrawBuffer(GL_BACK_LEFT);
    glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[currId]);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_RECTANGLE, texID[currId]);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
    finalShader.Use(); 
      DrawFullScreenQuad();
    finalShader.UnUse();
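The steps above index into the drawBuffers and attachID arrays without showing how they are defined. The following sketch shows one layout that is consistent with the indexing used in the steps (bufId = currId * 3, with index 6 reserved for the color blender); it is an assumption, not the recipe's actual declaration. The PassTargets struct and both function names are hypothetical, and the GL_COLOR_ATTACHMENT0 value (0x8CE0) is redefined locally so that the sketch compiles without OpenGL headers.

```cpp
#include <array>
#include <cassert>

// Value of GL_COLOR_ATTACHMENT0; attachments 1..N follow consecutively.
constexpr unsigned kColorAttachment0 = 0x8CE0;

// Assumed layout: [0..2] = depth/front/back of set 0,
//                 [3..5] = depth/front/back of set 1,
//                 [6]    = color blender texture.
std::array<unsigned, 7> makeDrawBuffers() {
    std::array<unsigned, 7> buffers{};
    for (unsigned i = 0; i < buffers.size(); ++i)
        buffers[i] = kColorAttachment0 + i;
    return buffers;
}

struct PassTargets {
    int currId;      // set written in this pass
    int prevId;      // set read (as textures) in this pass
    int depthIndex;  // index into drawBuffers for the depth blender
    int frontIndex;  // index for the front color blender
    int backIndex;   // index for the back color (temp) buffer
};

// Reproduces the ping-pong index arithmetic from the peeling loop.
PassTargets targetsForLayer(int layer) {
    assert(layer >= 1);  // the loop in the recipe starts at layer 1
    PassTargets t;
    t.currId = layer % 2;
    t.prevId = 1 - t.currId;
    const int bufId = t.currId * 3;
    t.depthIndex = bufId + 0;
    t.frontIndex = bufId + 1;
    t.backIndex = bufId + 2;
    return t;
}
```

Under this layout, attachID would hold GL_COLOR_ATTACHMENT0 and GL_COLOR_ATTACHMENT3, so odd layers write to attachments 3 to 5 while reading the textures of set 0, and even layers do the opposite.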

How it works…

Dual depth peeling works in a similar fashion to front-to-back peeling. The difference is that it peels away depths from both the front and the back layer at the same time using min/max blending. First, we initialize the fragment depth values using the fragment shader (Chapter6/DualDepthPeeling/shaders/dual_init.frag) and min/max blending.

vFragColor.xy = vec2(-gl_FragCoord.z, gl_FragCoord.z);
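The reason for writing the negated depth is that only one blend equation (GL_MAX) is active, yet both the minimum and the maximum depth are needed: the maximum of the negated depths is the negated minimum. The sketch below reproduces this for one pixel on the CPU. The DepthRange struct and initDepthRange name are hypothetical, and MAX_DEPTH is assumed to be 1.0.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

constexpr float kMaxDepth = 1.0f;  // assumption: value of MAX_DEPTH

struct DepthRange {
    float negNearest;  // x channel: max(-z) == -min(z)
    float farthest;    // y channel: max(z)
};

// Emulates the init pass for one pixel: the buffer is cleared to
// (-MAX_DEPTH, -MAX_DEPTH) and every fragment writes (-z, z) with GL_MAX.
DepthRange initDepthRange(const std::vector<float>& fragmentDepths) {
    DepthRange range = {-kMaxDepth, -kMaxDepth};
    for (float z : fragmentDepths) {
        assert(z >= 0.0f && z <= kMaxDepth);
        range.negNearest = std::max(range.negNearest, -z);
        range.farthest = std::max(range.farthest, z);
    }
    return range;
}
```

For fragments at depths 0.7, 0.2, and 0.5, the stored pair is (-0.2, 0.7), so negating the first channel recovers the nearest depth 0.2 while the second channel holds the farthest depth 0.7.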

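This initializes the blending buffers: because the blend equation is GL_MAX, the first channel ends up holding the negated nearest depth and the second channel the farthest depth. Next, a loop is run, but instead of peeling depth layers only front-to-back, each pass peels both a back layer and a front layer. This is carried out in the fragment shader (Chapter6/DualDepthPeeling/shaders/dual_peel.frag) along with max blending.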

float fragDepth = gl_FragCoord.z;
vec2 depthBlender = texture(depthBlenderTex, gl_FragCoord.xy).xy;
vec4 forwardTemp = texture(frontBlenderTex, gl_FragCoord.xy);
//initialize variables …
if (fragDepth < nearestDepth || fragDepth > farthestDepth) {
   vFragColor0.xy = vec2(-MAX_DEPTH);
   return;
}
if(fragDepth > nearestDepth && fragDepth < farthestDepth) {
  vFragColor0.xy = vec2(-fragDepth, fragDepth);
  return;
}
vFragColor0.xy = vec2(-MAX_DEPTH);

if (fragDepth == nearestDepth) {
  vFragColor1.xyz += vColor.rgb * alpha * alphaMultiplier;
  vFragColor1.w = 1.0 - alphaMultiplier * (1.0 - alpha);
} else {
  vFragColor2 += vec4(vColor.rgb,alpha);
}
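The branching in this shader can be summarized as a four-way classification of each fragment against the depth range found in the previous pass. The sketch below expresses that classification on the CPU. The enum and function names are hypothetical, and it assumes that the elided initialization sets nearestDepth to -depthBlender.x and farthestDepth to depthBlender.y.

```cpp
#include <cassert>  // for the assert-based usage shown after the block

enum class PeelAction {
    OutsideRange,  // already peeled: write -MAX_DEPTH and return
    KeepForLater,  // strictly inside: write (-z, z) for the next pass
    BlendFront,    // on the nearest layer: accumulate into the front buffer
    BlendBack      // on the farthest layer: write to the back buffer
};

// Mirrors the order of the tests in dual_peel.frag for one fragment.
PeelAction classifyFragment(float fragDepth, float nearestDepth,
                            float farthestDepth) {
    if (fragDepth < nearestDepth || fragDepth > farthestDepth)
        return PeelAction::OutsideRange;
    if (fragDepth > nearestDepth && fragDepth < farthestDepth)
        return PeelAction::KeepForLater;
    return (fragDepth == nearestDepth) ? PeelAction::BlendFront
                                       : PeelAction::BlendBack;
}
```

For a range of [0.2, 0.7], assert(classifyFragment(0.5f, 0.2f, 0.7f) == PeelAction::KeepForLater) holds, while fragments at exactly 0.2 and 0.7 are blended into the front and back buffers respectively.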

The blend shader (Chapter6/DualDepthPeeling/shaders/blend.frag) simply discards fragments whose alpha values are zero. This ensures that such fragments do not increment the occlusion query; otherwise, the query would report more samples than the number of fragments actually used in the blending.

vFragColor = texture(tempTexture, gl_FragCoord.xy);
if(vFragColor.a == 0)
  discard;
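The effect of the discard on the occlusion query can be pictured with a small CPU sketch: only texels with non-zero alpha count as passed samples, and a count of zero means there is nothing left to peel. The function names are hypothetical, and the input vector stands in for the alpha channel of the back layer texture for a handful of pixels.

```cpp
#include <cassert>
#include <vector>

// Counts the fragments that would survive "if (a == 0) discard;"
// and therefore be counted by a samples-passed occlusion query.
int countSamplesPassed(const std::vector<float>& backLayerAlpha) {
    int passed = 0;
    for (float a : backLayerAlpha) {
        assert(a >= 0.0f);  // assumption: alpha is non-negative
        if (a != 0.0f)
            ++passed;
    }
    return passed;
}

// The peeling loop can stop once a blend pass writes no samples.
bool peelingFinished(const std::vector<float>& backLayerAlpha) {
    return countSamplesPassed(backLayerAlpha) == 0;
}
```

Without the discard, every pixel of the full-screen quad would be counted, so the query result would never reach zero and could not be used as a stopping condition.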

Finally, the last blend shader (Chapter6/DualDepthPeeling/shaders/final.frag) takes the blended fragments from the front and back blend textures and blends the results to get the final fragment color.

vec4 frontColor = texture(frontBlenderTex, gl_FragCoord.xy);
vec3 backColor = texture(backBlenderTex, gl_FragCoord.xy).rgb;
vFragColor.rgb = frontColor.rgb + backColor * frontColor.a;

There's more…

The demo application for this recipe is similar to the one shown in the previous recipe. If dual depth peeling is enabled, we get the result as shown in the following figure:

Pressing the Space bar enables/disables dual depth peeling. If dual peeling is disabled, the result is as follows:

See also

Getting ready

The code for this recipe is contained in the Chapter6/DualDepthPeeling folder.

How to do it…

The steps required to implement dual depth peeling are as follows:

  1. Create an FBO and attach six textures in all: two for storing the front buffer, two for storing the back buffer, and two for storing the depth buffer values.
    glGenFramebuffers(1, &dualDepthFBOID); 
    glGenTextures (2, texID);
    glGenTextures (2, backTexID);
    glGenTextures (2, depthTexID);
    for(int i=0;i<2;i++) {
    glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[i]);
    //set texture parameters
    glTexImage2D(GL_TEXTURE_RECTANGLE , 0, GL_FLOAT_RG32_NV, WIDTH, HEIGHT, 0, GL_RGB, GL_FLOAT, NULL);
    glBindTexture(GL_TEXTURE_RECTANGLE,texID[i]);
    //set texture parameters
    glTexImage2D(GL_TEXTURE_RECTANGLE , 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, NULL);
    glBindTexture(GL_TEXTURE_RECTANGLE,backTexID[i]);
        //set texture parameters
    glTexImage2D(GL_TEXTURE_RECTANGLE , 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, NULL);
    }
  2. Bind the six textures to the appropriate attachment points on the FBO.
    glBindFramebuffer(GL_FRAMEBUFFER, dualDepthFBOID);	
    for(int i=0;i<2;i++) {
      glFramebufferTexture2D(GL_FRAMEBUFFER, attachID[i], GL_TEXTURE_RECTANGLE, depthTexID[i], 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, attachID[i]+1, GL_TEXTURE_RECTANGLE, texID[i], 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, attachID[i]+2, GL_TEXTURE_RECTANGLE, backTexID[i], 0);
    }
  3. Create another FBO for color blending and attach a new texture to it. Also attach this texture to the first FBO and check the FBO completeness.
      glGenTextures(1, &colorBlenderTexID);
      glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
      //set texture parameters
      glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, 0);
      glGenFramebuffers(1, &colorBlenderFBOID);
      glBindFramebuffer(GL_FRAMEBUFFER, colorBlenderFBOID);	
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_RECTANGLE, colorBlenderTexID, 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT6, GL_TEXTURE_RECTANGLE, colorBlenderTexID, 0);
      GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
      if(status == GL_FRAMEBUFFER_COMPLETE )
        printf("FBO setup successful !!! \n");
      else
        printf("Problem with FBO setup");
      glBindFramebuffer(GL_FRAMEBUFFER, 0);
  4. In the render function, first disable depth testing and enable blending and then bind the depth FBO. Initialize and clear DrawBuffer to write on the render target attached to GL_COLOR_ATTACHMENT1 and GL_COLOR_ATTACHMENT2.
    glDisable(GL_DEPTH_TEST);
    glEnable(GL_BLEND);
    glBindFramebuffer(GL_FRAMEBUFFER, dualDepthFBOID);
    glDrawBuffers(2, &drawBuffers[1]);
    glClearColor(0, 0, 0, 0);
    glClear(GL_COLOR_BUFFER_BIT);
  5. Next, set GL_COLOR_ATTACHMENT0 as the draw buffer, enable min/max blending (glBlendEquation(GL_MAX)), and initialize the color attachment using fragment shader (see Chapter6/DualDepthPeeling/shaders/dual_init.frag). This completes the first step of dual depth peeling, that is, initialization of the buffers.
        glDrawBuffer(drawBuffers[0]);
        glClearColor(-MAX_DEPTH, -MAX_DEPTH, 0, 0);	
        glClear(GL_COLOR_BUFFER_BIT);
        glBlendEquation(GL_MAX);
        DrawScene(MVP, initShader);
  6. Next, set GL_COLOR_ATTACHMENT6 as the draw buffer and clear it with background color. Then, run a loop that alternates two draw buffers and then uses min/max blending. Then draw the scene again.
    glDrawBuffer(drawBuffers[6]);
    glClearColor(bg.x, bg.y, bg.z, bg.w);
    glClear(GL_COLOR_BUFFER_BIT);
    int numLayers = (NUM_PASSES - 1) * 2;
    int currId = 0;
    for (int layer = 1; bUseOQ || layer < numLayers; layer++) {
      currId = layer % 2;
      int prevId = 1 - currId;
      int bufId = currId * 3;
      glDrawBuffers(2, &drawBuffers[bufId+1]);
      glClearColor(0, 0, 0, 0);
      glClear(GL_COLOR_BUFFER_BIT);
      glDrawBuffer(drawBuffers[bufId+0]);
      glClearColor(-MAX_DEPTH, -MAX_DEPTH, 0, 0);
      glClear(GL_COLOR_BUFFER_BIT);
      glDrawBuffers(3, &drawBuffers[bufId+0]);
      glBlendEquation(GL_MAX);
      glActiveTexture(GL_TEXTURE0);
      glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[prevId]);
      glActiveTexture(GL_TEXTURE1);
      glBindTexture(GL_TEXTURE_RECTANGLE, texID[prevId]);
      DrawScene(MVP, dualPeelShader, true,true);
  7. Finally, enable additive blending (glBlendFunc(GL_FUNC_ADD)) and then draw a full screen quad with the blend shader. This peels away fragments from the front as well as the back layer of the rendered geometry and blends the result on the current draw buffer.
    glDrawBuffer(drawBuffers[6]);
    glBlendEquation(GL_FUNC_ADD);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    if (bUseOQ) {
       glBeginQuery(GL_SAMPLES_PASSED_ARB, queryId);
    }
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_RECTANGLE, backTexID[currId]);
    blendShader.Use();
      DrawFullScreenQuad();
    blendShader.UnUse();
       }
  8. In the final step, we unbind the FBO and enable rendering on the default back buffer (GL_BACK_LEFT). Next, we bind the outputs from the depth peeling and blending steps to their appropriate texture location. Finally, we use a final blending shader to combine the two peeled and blended fragments.
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glDrawBuffer(GL_BACK_LEFT);
    glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[currId]);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_RECTANGLE, texID[currId]);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
    finalShader.Use(); 
      DrawFullScreenQuad();
    finalShader.UnUse();

How it works…

Dual depth peeling works in a similar fashion as the front-to-back peeling. However, the difference is in the way it operates. It peels away depths from both the front and the back layer at the same time using min/max blending. First, we initialize the fragment depth values using the fragment shader (Chapter6/DualDepthPeeling/shaders/dual_init.frag) and min/max blending.

vFragColor.xy = vec2(-gl_FragCoord.z, gl_FragCoord.z);

This initializes the blending buffers. Next, a loop is run but instead of peeling depth layers front-to-back, we first peel back depths and then the front depths. This is carried out in the fragment shader (Chapter6/DualDepthPeeling/shaders/dual_peel.frag) along with max blending.

float fragDepth = gl_FragCoord.z;
vec2 depthBlender = texture(depthBlenderTex, gl_FragCoord.xy).xy;
vec4 forwardTemp = texture(frontBlenderTex, gl_FragCoord.xy);
//initialize variables …
if (fragDepth < nearestDepth || fragDepth > farthestDepth) {
   vFragColor0.xy = vec2(-MAX_DEPTH);
   return;
}
if(fragDepth > nearestDepth && fragDepth < farthestDepth) {
  vFragColor0.xy = vec2(-fragDepth, fragDepth);
  return;
}
vFragColor0.xy = vec2(-MAX_DEPTH);

if (fragDepth == nearestDepth) {
  vFragColor1.xyz += vColor.rgb * alpha * alphaMultiplier;
  vFragColor1.w = 1.0 - alphaMultiplier * (1.0 - alpha);
} else {
  vFragColor2 += vec4(vColor.rgb,alpha);
}

The blend shader (Chapter6/DualDepthPeeling/shaders/blend.frag) simply discards fragments whose alpha values are zero. This ensures that the occlusion query is not incremented, which would give a wrong number of samples than the actual fragment used in the depth blending.

vFragColor = texture(tempTexture, gl_FragCoord.xy);
if(vFragColor.a == 0)
  discard;

Finally, the last blend shader (Chapter6/DualDepthPeeling/shaders/final.frag) takes the blended fragments from the front and back blend textures and blends the results to get the final fragment color.

vec4 frontColor = texture(frontBlenderTex, gl_FragCoord.xy);
vec3 backColor = texture(backBlenderTex, gl_FragCoord.xy).rgb;
vFragColor.rgb = frontColor.rgb + backColor * frontColor.a;

There's more…

The demo application for this demo is similar to the one shown in the previous recipe. If dual depth peeling is enabled, we get the result as shown in the following figure:

Pressing the Space bar enables/disables dual depth peeling. If dual peeling is disabled, the result is as follows:

See also

How to do it…

The steps required to implement dual depth peeling are as follows:

Create an FBO and attach six textures in all: two for storing the front buffer, two for storing the back buffer, and two for storing the depth buffer values.
glGenFramebuffers(1, &dualDepthFBOID); 
glGenTextures (2, texID);
glGenTextures (2, backTexID);
glGenTextures (2, depthTexID);
for(int i=0;i<2;i++) {
glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[i]);
//set texture parameters
glTexImage2D(GL_TEXTURE_RECTANGLE , 0, GL_FLOAT_RG32_NV, WIDTH, HEIGHT, 0, GL_RGB, GL_FLOAT, NULL);
glBindTexture(GL_TEXTURE_RECTANGLE,texID[i]);
//set texture parameters
glTexImage2D(GL_TEXTURE_RECTANGLE , 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, NULL);
glBindTexture(GL_TEXTURE_RECTANGLE,backTexID[i]);
    //set texture parameters
glTexImage2D(GL_TEXTURE_RECTANGLE , 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, NULL);
}
Bind the
  1. six textures to the appropriate attachment points on the FBO.
    glBindFramebuffer(GL_FRAMEBUFFER, dualDepthFBOID);	
    for(int i=0;i<2;i++) {
      glFramebufferTexture2D(GL_FRAMEBUFFER, attachID[i], GL_TEXTURE_RECTANGLE, depthTexID[i], 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, attachID[i]+1, GL_TEXTURE_RECTANGLE, texID[i], 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, attachID[i]+2, GL_TEXTURE_RECTANGLE, backTexID[i], 0);
    }
  2. Create another FBO for color blending and attach a new texture to it. Also attach this texture to the first FBO and check the FBO completeness.
      glGenTextures(1, &colorBlenderTexID);
      glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
      //set texture parameters
      glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_RGBA, WIDTH, HEIGHT, 0, GL_RGBA, GL_FLOAT, 0);
      glGenFramebuffers(1, &colorBlenderFBOID);
      glBindFramebuffer(GL_FRAMEBUFFER, colorBlenderFBOID);	
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_RECTANGLE, colorBlenderTexID, 0);
      glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT6, GL_TEXTURE_RECTANGLE, colorBlenderTexID, 0);
      GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
      if(status == GL_FRAMEBUFFER_COMPLETE )
        printf("FBO setup successful !!! \n");
      else
        printf("Problem with FBO setup");
      glBindFramebuffer(GL_FRAMEBUFFER, 0);
  3. In the render function, first disable depth testing and enable blending and then bind the depth FBO. Initialize and clear DrawBuffer to write on the render target attached to GL_COLOR_ATTACHMENT1 and GL_COLOR_ATTACHMENT2.
    glDisable(GL_DEPTH_TEST);
    glEnable(GL_BLEND);
    glBindFramebuffer(GL_FRAMEBUFFER, dualDepthFBOID);
    glDrawBuffers(2, &drawBuffers[1]);
    glClearColor(0, 0, 0, 0);
    glClear(GL_COLOR_BUFFER_BIT);
  4. Next, set GL_COLOR_ATTACHMENT0 as the draw buffer, enable min/max blending (glBlendEquation(GL_MAX)), and initialize the color attachment using fragment shader (see Chapter6/DualDepthPeeling/shaders/dual_init.frag). This completes the first step of dual depth peeling, that is, initialization of the buffers.
        glDrawBuffer(drawBuffers[0]);
        glClearColor(-MAX_DEPTH, -MAX_DEPTH, 0, 0);	
        glClear(GL_COLOR_BUFFER_BIT);
        glBlendEquation(GL_MAX);
        DrawScene(MVP, initShader);
  5. Next, set GL_COLOR_ATTACHMENT6 as the draw buffer and clear it with background color. Then, run a loop that alternates two draw buffers and then uses min/max blending. Then draw the scene again.
    glDrawBuffer(drawBuffers[6]);
    glClearColor(bg.x, bg.y, bg.z, bg.w);
    glClear(GL_COLOR_BUFFER_BIT);
    int numLayers = (NUM_PASSES - 1) * 2;
    int currId = 0;
    for (int layer = 1; bUseOQ || layer < numLayers; layer++) {
      currId = layer % 2;
      int prevId = 1 - currId;
      int bufId = currId * 3;
      glDrawBuffers(2, &drawBuffers[bufId+1]);
      glClearColor(0, 0, 0, 0);
      glClear(GL_COLOR_BUFFER_BIT);
      glDrawBuffer(drawBuffers[bufId+0]);
      glClearColor(-MAX_DEPTH, -MAX_DEPTH, 0, 0);
      glClear(GL_COLOR_BUFFER_BIT);
      glDrawBuffers(3, &drawBuffers[bufId+0]);
      glBlendEquation(GL_MAX);
      glActiveTexture(GL_TEXTURE0);
      glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[prevId]);
      glActiveTexture(GL_TEXTURE1);
      glBindTexture(GL_TEXTURE_RECTANGLE, texID[prevId]);
      DrawScene(MVP, dualPeelShader, true,true);
  6. Finally, enable additive blending (glBlendFunc(GL_FUNC_ADD)) and then draw a full screen quad with the blend shader. This peels away fragments from the front as well as the back layer of the rendered geometry and blends the result on the current draw buffer.
    glDrawBuffer(drawBuffers[6]);
    glBlendEquation(GL_FUNC_ADD);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    if (bUseOQ) {
       glBeginQuery(GL_SAMPLES_PASSED_ARB, queryId);
    }
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_RECTANGLE, backTexID[currId]);
    blendShader.Use();
      DrawFullScreenQuad();
    blendShader.UnUse();
       }
  7. In the final step, we unbind the FBO and enable rendering on the default back buffer (GL_BACK_LEFT). Next, we bind the outputs from the depth peeling and blending steps to their appropriate texture location. Finally, we use a final blending shader to combine the two peeled and blended fragments.
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glDrawBuffer(GL_BACK_LEFT);
    glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_RECTANGLE, depthTexID[currId]);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_RECTANGLE, texID[currId]);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture(GL_TEXTURE_RECTANGLE, colorBlenderTexID);
    finalShader.Use(); 
      DrawFullScreenQuad();
    finalShader.UnUse();

How it works…

Dual depth peeling works in a similar fashion as the front-to-back peeling. However, the difference is in the way it operates. It peels away depths from both the front and the back layer at the same time using min/max blending. First, we initialize the fragment depth values using the fragment shader (Chapter6/DualDepthPeeling/shaders/dual_init.frag) and min/max blending.

vFragColor.xy = vec2(-gl_FragCoord.z, gl_FragCoord.z);
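
To see why a single max blend is enough to track both extremes, here is a minimal CPU-side C++ sketch; it is not the recipe's code. It assumes depths in the range 0 to 1 and a buffer cleared to (-MAX_DEPTH, -MAX_DEPTH), and the helper names blendMax and depthExtents are hypothetical stand-ins for what glBlendEquation(GL_MAX) would do for one pixel.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Two-channel value stored per pixel: (-nearestDepth, farthestDepth).
using DepthRange = std::array<float, 2>;

// Hypothetical stand-in for component-wise GL_MAX blending on one pixel.
DepthRange blendMax(const DepthRange& dst, const DepthRange& src) {
    return { std::max(dst[0], src[0]), std::max(dst[1], src[1]) };
}

// Feeds every fragment depth of a pixel through vec2(-z, z) and
// returns {nearest, farthest}.
DepthRange depthExtents(const std::vector<float>& fragmentDepths) {
    const float MAX_DEPTH = 1.0f;                    // assumed depth range [0, 1]
    DepthRange buffer = { -MAX_DEPTH, -MAX_DEPTH };  // assumed clear value
    for (float z : fragmentDepths) {
        // max(-z) over all fragments equals -(min z), so one max blend
        // tracks the nearest and the farthest depth at the same time.
        buffer = blendMax(buffer, { -z, z });
    }
    return { -buffer[0], buffer[1] };
}
```

For fragments at depths 0.7, 0.2, and 0.5, the sketch reports 0.2 as the nearest depth and 0.7 as the farthest, regardless of the order in which the fragments arrive.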

This initializes the blending buffers. Next, a loop is run but instead of peeling depth layers front-to-back, we first peel back depths and then the front depths. This is carried out in the fragment shader (Chapter6/DualDepthPeeling/shaders/dual_peel.frag) along with max blending.

float fragDepth = gl_FragCoord.z;
vec2 depthBlender = texture(depthBlenderTex, gl_FragCoord.xy).xy;
vec4 forwardTemp = texture(frontBlenderTex, gl_FragCoord.xy);
//initialize variables …
if (fragDepth < nearestDepth || fragDepth > farthestDepth) {
   vFragColor0.xy = vec2(-MAX_DEPTH);
   return;
}
if(fragDepth > nearestDepth && fragDepth < farthestDepth) {
  vFragColor0.xy = vec2(-fragDepth, fragDepth);
  return;
}
vFragColor0.xy = vec2(-MAX_DEPTH);

if (fragDepth == nearestDepth) {
  vFragColor1.xyz += vColor.rgb * alpha * alphaMultiplier;
  vFragColor1.w = 1.0 - alphaMultiplier * (1.0 - alpha);
} else {
  vFragColor2 += vec4(vColor.rgb,alpha);
}

The blend shader (Chapter6/DualDepthPeeling/shaders/blend.frag) simply discards fragments whose alpha value is zero. Discarded fragments are not counted by the occlusion query, so the reported sample count reflects only the fragments that were actually used in the blending.

vFragColor = texture(tempTexture, gl_FragCoord.xy);
if(vFragColor.a == 0)
  discard;

Finally, the last blend shader (Chapter6/DualDepthPeeling/shaders/final.frag) takes the blended fragments from the front and back blend textures and blends the results to get the final fragment color.

vec4 frontColor = texture(frontBlenderTex, gl_FragCoord.xy);
vec3 backColor = texture(backBlenderTex, gl_FragCoord.xy).rgb;
vFragColor.rgb = frontColor.rgb + backColor * frontColor.a;

There's more…

The demo application for this recipe is similar to the one shown in the previous recipe. If dual depth peeling is enabled, we get the result as shown in the following figure:

Pressing the Space bar enables/disables dual depth peeling. If dual peeling is disabled, the result is as follows:

See also

Louis Bavoil and Kevin Myers, Order Independent Transparency with Dual Depth Peeling demo in NVIDIA OpenGL SDK 10:

Implementing screen space ambient occlusion (SSAO)

We have implemented simple lighting recipes in previous chapters. Unfortunately, these only approximate some aspects of lighting; as discussed earlier, the basic lights do not handle effects such as global illumination. Several techniques have been developed over the years to fake global illumination effects. One such technique is Screen Space Ambient Occlusion (SSAO), which we will implement in this recipe.

As the name suggests, this method works in screen space. For any given pixel onscreen, the amount of occlusion due to its neighboring pixels can be obtained by looking at the difference in their depth values. In order to reduce sampling artefacts, the neighbor coordinates are randomly offset. Pixels whose depth values are close to one another belong to geometry that lies close together in space. Based on the difference of the depth values, an occlusion value is determined. In pseudocode, the algorithm is as follows:

Get the position (p), normal (n) and depth (d) value at current pixel position 
For each pixel in the neighborhood of current pixel
    Get the position (p0) of the neighborhood pixel 
    Call proc. CalcAO(p, p0, n)
End for
Return the ambient occlusion amount as color

The ambient occlusion procedure is defined as follows:

const float DEPTH_TOLERANCE = 0.00001;
proc CalcAO(p,p0,n)
   diff = p0-p-DEPTH_TOLERANCE;
   v = normalize(diff);
   d = length(diff)*scale;
   return max(0.1, dot(n,v)-bias)*(1.0/(1.0+d))*intensity;
end proc

Note that we have three artist control parameters: scale, bias, and intensity. The scale parameter controls the size of the occlusion area, bias shifts the occlusion, and intensity controls the strength of the occlusion. The DEPTH_TOLERANCE constant is added to remove depth-fighting artefacts.
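
The procedure above can be sketched on the CPU in C++ as follows. This is only an illustration under a few assumptions: the Vec3 and AOParams types are hypothetical, the tolerance is subtracted from each component (as a GLSL vector-minus-scalar expression would do), and the zero-length guard is an addition that the pseudocode does not contain.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// The three artist control parameters described in the text.
struct AOParams { float scale; float bias; float intensity; };

// p: position at the current pixel, p0: position at a neighbor pixel,
// n: unit normal at the current pixel.
float calcAO(const Vec3& p, const Vec3& p0, const Vec3& n, const AOParams& params) {
    const float DEPTH_TOLERANCE = 0.00001f;
    const Vec3 diff = { p0.x - p.x - DEPTH_TOLERANCE,
                        p0.y - p.y - DEPTH_TOLERANCE,
                        p0.z - p.z - DEPTH_TOLERANCE };
    const float len = std::sqrt(dot(diff, diff));
    if (len == 0.0f) return 0.0f;  // guard added for this sketch only
    const Vec3 v = { diff.x / len, diff.y / len, diff.z / len };
    const float d = len * params.scale;
    // Occlusion grows when the neighbor lies along the normal and
    // falls off with distance.
    return std::max(0.1f, dot(n, v) - params.bias) * (1.0f / (1.0f + d)) * params.intensity;
}
```

With scale = 1, bias = 0, and intensity = 1, a neighbor one unit along the normal yields an occlusion of roughly 0.5, whereas a neighbor one unit behind the surface is clamped to the 0.1 floor and yields roughly 0.05.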

The whole recipe proceeds as follows. We load our 3D model and render it into an offscreen texture using an FBO. We use two FBOs: one for storing the eye space normals and depth, and another for filtering the intermediate results. For both the color attachment and the depth attachment of the first FBO, floating point texture formats are used. For the color attachment, GL_RGBA32F is used, whereas for the depth texture, the GL_DEPTH_COMPONENT32F floating point format is used. Floating point texture formats are used because we require more precision; otherwise, truncation errors will show up in the rendering result. The second FBO is used for separable Gaussian smoothing, as was carried out in the Implementing variance shadow mapping recipe in Chapter 4, Lights and Shadows. This FBO has two color attachments with the floating point texture format GL_RGBA32F.

In the rendering function, the scene is first rendered normally. Then, the first shader is used to output the eye space normals. These are stored in the color attachment, and the depth values are stored in the depth attachment of the first FBO. After this step, the filtering FBO is bound and the second shader is used, which uses the depth and normal textures from the first FBO to calculate the ambient occlusion result. Since the neighbor points are randomly offset, noise is introduced. The noisy result is then smoothed by applying separable Gaussian smoothing. Finally, the filtered result is blended with the existing rendering by using conventional alpha blending.

Getting ready

The code for this recipe is contained in the Chapter6/SSAO folder. We will be using the Obj model viewer from Chapter 5, Mesh Model Formats and Particle Systems. We will add SSAO to the Obj model.

How to do it…

Let us start the recipe by following these simple steps:

  1. Create a global reference of the ObjLoader object. Call the ObjLoader::Load function passing it the name of the OBJ file. Pass vectors to store the meshes, vertices, indices, and materials contained in the OBJ file.
  2. Create a framebuffer object (FBO) with two attachments: the first to store the scene normals and the second to store the depth. We use floating point texture formats for both: GL_RGBA32F for the normals and GL_DEPTH_COMPONENT32F for the depth. In addition, we create a second FBO for Gaussian smoothing of the SSAO output. We are using multiple texture units here as the second shader expects the normal and depth textures to be bound to texture units 1 and 3 respectively.
    glGenFramebuffers(1, &fboID);
    glBindFramebuffer(GL_FRAMEBUFFER, fboID);
    glGenTextures(1, &normalTextureID);
    glGenTextures(1, &depthTextureID);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D, normalTextureID);
    //set texture parameters
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, WIDTH, HEIGHT, 0, GL_BGRA, GL_FLOAT, NULL);
    glActiveTexture(GL_TEXTURE3);
    glBindTexture(GL_TEXTURE_2D, depthTextureID);
    //set texture parameters
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT32F, WIDTH, HEIGHT, 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, normalTextureID, 0);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,GL_TEXTURE_2D, depthTextureID, 0);
    glGenFramebuffers(1,&filterFBOID);
    glBindFramebuffer(GL_FRAMEBUFFER,filterFBOID);
    glGenTextures(2, blurTexID);
    for(int i=0;i<2;i++) {
        glActiveTexture(GL_TEXTURE4+i);
        glBindTexture(GL_TEXTURE_2D, blurTexID[i]);
        //set texture parameters
        glTexImage2D(GL_TEXTURE_2D,0,GL_RGBA32F,RTT_WIDTH, RTT_HEIGHT,0,GL_RGBA,GL_FLOAT,NULL);
        glFramebufferTexture2D(GL_FRAMEBUFFER,GL_COLOR_ATTACHMENT0+i,GL_TEXTURE_2D,blurTexID[i],0);
    }
  3. In the render function, render the scene meshes normally. After this step, bind the first FBO and then use the first shader program. This program takes the per-vertex positions/normals of the mesh and outputs the view space normals from the fragment shader.
    glBindFramebuffer(GL_FRAMEBUFFER, fboID);
    glViewport(0,0,RTT_WIDTH, RTT_HEIGHT);
    glDrawBuffer(GL_COLOR_ATTACHMENT0); 
    glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT);
    glBindVertexArray(vaoID); {
      ssaoFirstShader.Use();
      glUniformMatrix4fv(ssaoFirstShader("MVP"), 1, GL_FALSE, glm::value_ptr(P*MV));
      glUniformMatrix3fv(ssaoFirstShader("N"), 1, GL_FALSE, glm::value_ptr(glm::inverseTranspose(glm::mat3(MV))));
      for(size_t i=0;i<materials.size();i++) {
        Material* pMat = materials[i];
        if(materials.size()==1)
          glDrawElements(GL_TRIANGLES, indices.size(), GL_UNSIGNED_SHORT, 0);
        else
          //assuming the indices live in the element array buffer bound to the VAO,
          //the last argument is a byte offset into that buffer
          glDrawElements(GL_TRIANGLES, pMat->count, GL_UNSIGNED_SHORT, (const GLvoid*)(pMat->offset*sizeof(GLushort)));
      }
      ssaoFirstShader.UnUse();
    }

    The first vertex shader (Chapter6/SSAO/shaders/SSAO_FirstStep.vert) outputs the eye space normal as shown in the following code snippet:

    #version 330 core
    layout(location = 0) in vec3 vVertex;
    layout(location = 1) in vec3 vNormal; 
    uniform mat4 MVP;
    uniform mat3 N;
    smooth out vec3 vEyeSpaceNormal;
    void main() {
        vEyeSpaceNormal = N*vNormal;
        gl_Position = MVP*vec4(vVertex,1);
    }

    The fragment shader (Chapter6/SSAO/shaders/SSAO_FirstStep.frag) returns the interpolated normal, as the fragment color, shown as follows:

    #version 330 core
    smooth in vec3 vEyeSpaceNormal;
    layout(location=0) out vec4 vFragColor;
    void main() {
        vFragColor = vec4(normalize(vEyeSpaceNormal)*0.5 + 0.5, 1);
    }
  4. Bind the filtering FBO and use the second shader (Chapter6/SSAO/shaders/SSAO_SecondStep.frag). This shader does the actual SSAO calculation. The inputs to the shader are the normal and depth textures from step 3. This shader is invoked on a full-screen quad.
    glBindFramebuffer(GL_FRAMEBUFFER,filterFBOID);
    glDrawBuffer(GL_COLOR_ATTACHMENT0);
    glBindVertexArray(quadVAOID);
    ssaoSecondShader.Use();
    glUniform1f(ssaoSecondShader("radius"), sampling_radius);
    glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, 0);
    ssaoSecondShader.UnUse();
  5. Filter the output from step 4 with a separable Gaussian convolution, using two fragment shaders (Chapter6/SSAO/shaders/GaussH.frag and Chapter6/SSAO/shaders/GaussV.frag). The smoothing removes the noise from the ambient occlusion result.
    glDrawBuffer(GL_COLOR_ATTACHMENT1);
    glBindVertexArray(quadVAOID);
    gaussianV_shader.Use();
    glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, 0);
    glDrawBuffer(GL_COLOR_ATTACHMENT0);
    gaussianH_shader.Use();
    glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, 0);
  6. Unbind the filtering FBO, reset the default viewport, and then the default draw buffer. Enable alpha blending and then use the final shader (Chapter6/SSAO/shaders/final.frag) to blend the output from steps 3 and 5. This shader simply renders the final output from the filtering stage using a full-screen quad.
    glBindFramebuffer(GL_FRAMEBUFFER,0);  
    glViewport(0,0,WIDTH, HEIGHT);
    glDrawBuffer(GL_BACK_LEFT);	
    glEnable(GL_BLEND);  
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);  
    finalShader.Use();
    glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, 0); 
    finalShader.UnUse();
    glDisable(GL_BLEND);

How it works…

There are three steps in the SSAO calculation. The first step is the preparation of inputs, that is, the view space normals and depth. The normals are transformed to eye space by the first-step vertex shader (Chapter6/SSAO/shaders/SSAO_FirstStep.vert).

vEyeSpaceNormal = N*vNormal;
gl_Position = MVP*vec4(vVertex,1);

The fragment shader (Chapter6/SSAO/shaders/SSAO_FirstStep.frag) then normalizes the interpolated normal, remaps it from the [-1,1] range to the [0,1] range, and writes it as the fragment color. The depth is read from the depth attachment of the FBO.
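
The normal storage round trip can be sketched in C++ as follows. The first-step fragment shader writes normalize(n)*0.5 + 0.5, and the second-step shader recovers the normal with *2.0 - 1.0 followed by a normalize. The Vec3 type and the function names below are hypothetical, and the sketch assumes a non-zero input normal.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Assumes v is not the zero vector.
Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// Mirrors the first-step fragment shader: map [-1, 1] to [0, 1].
Vec3 encodeNormal(const Vec3& n) {
    const Vec3 u = normalize(n);
    return { u.x * 0.5f + 0.5f, u.y * 0.5f + 0.5f, u.z * 0.5f + 0.5f };
}

// Mirrors the read in the second-step shader: map [0, 1] back to [-1, 1].
Vec3 decodeNormal(const Vec3& c) {
    return normalize(Vec3{ c.x * 2.0f - 1.0f, c.y * 2.0f - 1.0f, c.z * 2.0f - 1.0f });
}
```

For example, a normal pointing along the negative z axis is stored as the color (0.5, 0.5, 0.0) and decodes back to (0, 0, -1).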

The second step is the actual SSAO calculation. We use a fragment shader (Chapter6/SSAO/shaders/SSAO_SecondStep.frag) to perform this by first rendering a screen-aligned quad. Then, for each fragment, the corresponding normal and depth values are obtained from the render target, from the first step. Next, a loop is run to compare the depth values of the neighboring fragments and then an occlusion value is estimated.

float depth = texture(depthTex, vUV).r; 
if(depth<1.0)
{

    vec3 n = normalize(texture(normalTex, vUV).xyz*2.0 - 1.0);
    vec4 p = invP*vec4(vUV,depth,1);
    p.xyz /= p.w;

    vec2 random = normalize(texture(noiseTex, viewportSize/random_size * vUV).rg * 2.0 - 1.0);
    float ao = 0.0;

    for(int i = 0; i < NUM_SAMPLES; i++)
    {
      float npw = (pw + radius * samples[i].x * random.x);
      float nph = (ph + radius * samples[i].y * random.y);

      vec2 uv = vUV + vec2(npw, nph);
      vec4 p0 = invP * vec4(vUV,texture(depthTex, uv ).r, 1.0);
      p0.xyz /= p0.w;
      ao += calcAO(p0, p, n);
      //calculate similar depth points from the neighborhood 
      //and calculate ambient occlusion amount
    }
    ao *= INV_NUM_SAMPLES/8.0;

    vFragColor = vec4(vec3(0), ao);
}

After the second shader, we filter the SSAO output using separable Gaussian convolution. The default draw buffer is then restored, and the Gaussian-filtered SSAO output is alpha blended with the normal rendering.

There's more…

The demo application implementing this recipe shows the scene with three blocks on a planar quad. When run, the output is as shown in the following screenshot:

Pressing the Space bar disables SSAO to produce the following output. As can be seen, ambient occlusion helps in giving shaded cues that approximate how near or far objects are. We can also change the sampling radius by using the + and - keys.

See also

A Simple and Practical Approach to SSAO by Jose Maria Mendez:

Implementing global illumination using spherical harmonics lighting

In this recipe, we will learn about implementing simple global illumination using spherical harmonics. Spherical harmonics are a set of basis functions defined on the sphere; a function is approximated as a weighted sum of these basis functions, with one coefficient per basis function. Rather than calculating the lighting contribution by evaluating the bi-directional reflectance distribution function (BRDF), this method uses special HDR/RGBE images that store the lighting information. The only vertex attribute required for this method is the per-vertex normal. Its components are combined with the spherical harmonics coefficients that are extracted from the HDR/RGBE images.

The RGBE image format was invented by Greg Ward. These images store three bytes for the RGB value (that is, the red, green, and blue channel) and an additional byte which stores a shared exponent. This enables these files to have an extended range and precision of floating point values. For details about the theory behind the spherical harmonics method and the RGBE format, refer to the references in the See also section of this recipe.
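
As a small illustration of the shared-exponent idea, the following C++ sketch decodes one RGBE pixel into floating point values. It follows the commonly used conversion for this format; the names RGBf and decodeRGBE are our own and are not part of the recipe's code, which does not read the probe image at all.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct RGBf { float r, g, b; };

// Converts one 4-byte RGBE pixel (red, green, blue mantissas plus a shared
// exponent) to linear floating point values.
RGBf decodeRGBE(const std::uint8_t px[4]) {
    assert(px != nullptr);
    if (px[3] == 0) return {0.0f, 0.0f, 0.0f};  // zero exponent encodes black
    // The exponent is biased by 128; the extra 8 accounts for the 8-bit mantissas.
    float scale = std::ldexp(1.0f, int(px[3]) - (128 + 8));
    return {px[0] * scale, px[1] * scale, px[2] * scale};
}
```

For example, the bytes (128, 64, 32, 129) decode to (1.0, 0.5, 0.25), since the scale is 2^(129-136) = 1/128.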

To give an overview of the recipe, using the probe image, the SH lighting coefficients (L00 to L22) are estimated by projection. Details of the projection method are given in the references in the See also section. For most of the common lighting HDR probes, the spherical harmonic coefficients are documented. We use these values, together with the fixed constants C1 to C5 of the irradiance formula, as constants in our vertex shader.
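
To give a rough idea of what projection means here, the following C++ sketch numerically integrates a radiance function against the first nine real spherical harmonics basis functions. This is a simplified stand-in, not the tool used to produce the coefficients in this recipe: it samples a latitude-longitude grid instead of a light probe image, it handles a single scalar channel, and the names shBasis and projectSH are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// The nine real SH basis functions (bands 0 to 2) for a unit direction
// (x, y, z), ordered L00, L1m1, L10, L11, L2m2, L2m1, L20, L21, L22.
std::array<double, 9> shBasis(double x, double y, double z) {
    return {0.282095,
            0.488603 * y, 0.488603 * z, 0.488603 * x,
            1.092548 * x * y, 1.092548 * y * z,
            0.315392 * (3.0 * z * z - 1.0),
            1.092548 * x * z,
            0.546274 * (x * x - y * y)};
}

// Projects radiance(x, y, z) onto the basis with a midpoint rule over a
// latitude-longitude grid of nTheta x nPhi cells (an assumed sampling scheme).
template <typename F>
std::array<double, 9> projectSH(F radiance, int nTheta, int nPhi) {
    assert(nTheta > 0 && nPhi > 0);
    const double PI = 3.14159265358979323846;
    const double dTheta = PI / nTheta;
    const double dPhi = 2.0 * PI / nPhi;
    std::array<double, 9> coeff{};
    for (int i = 0; i < nTheta; ++i) {
        double theta = (i + 0.5) * dTheta;
        for (int j = 0; j < nPhi; ++j) {
            double phi = (j + 0.5) * dPhi;
            double x = std::sin(theta) * std::cos(phi);
            double y = std::sin(theta) * std::sin(phi);
            double z = std::cos(theta);
            double dOmega = std::sin(theta) * dTheta * dPhi;  // solid angle of the cell
            std::array<double, 9> Y = shBasis(x, y, z);
            double L = radiance(x, y, z);
            for (int k = 0; k < 9; ++k) coeff[k] += L * Y[k] * dOmega;
        }
    }
    return coeff;
}
```

Running the projection once per color channel would give vec3 coefficients of the kind stored as L00 to L22 in the vertex shader.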

Getting ready

The code for this recipe is contained in the Chapter6/SphericalHarmonics directory. For this recipe, we will be using the Obj mesh loader discussed in the previous chapter.

How to do it…

Let us start this recipe by following these simple steps:

  1. Load an obj mesh using the ObjLoader class and fill the OpenGL buffer objects and the OpenGL textures, using the material information loaded from the file, as in the previous recipes.
  2. In the vertex shader that is used for the mesh, perform the lighting calculation using spherical harmonics. The vertex shader is detailed as follows:
    #version 330 core
    layout(location = 0) in vec3 vVertex;
    layout(location = 1) in vec3 vNormal;
    layout(location = 2) in vec2 vUV;
    
    smooth out vec2 vUVout;
    smooth out vec4 diffuse;
    
    uniform mat4 P; 
    uniform mat4 MV;
    uniform mat3 N;
    
    const float C1 = 0.429043;
    const float C2 = 0.511664;
    const float C3 = 0.743125;
    const float C4 = 0.886227;
    const float C5 = 0.247708;
    const float PI = 3.1415926535897932384626433832795;
    
    //Old town square probe
    const vec3 L00 = vec3( 0.871297, 0.875222, 0.864470);
    const vec3 L1m1 = vec3( 0.175058, 0.245335, 0.312891);
    const vec3 L10 = vec3( 0.034675, 0.036107, 0.037362);
    const vec3 L11 = vec3(-0.004629, -0.029448, -0.048028);
    const vec3 L2m2 = vec3(-0.120535, -0.121160, -0.117507);
    const vec3 L2m1 = vec3( 0.003242, 0.003624, 0.007511);
    const vec3 L20 = vec3(-0.028667, -0.024926, -0.020998);
    const vec3 L21 = vec3(-0.077539, -0.086325, -0.091591);
    const vec3 L22 = vec3(-0.161784, -0.191783, -0.219152);
    const vec3 scaleFactor = vec3(0.161784/ (0.871297+0.161784), 0.191783/(0.875222+0.191783), 0.219152/(0.864470+0.219152));
    
    void main()
    {
        vUVout=vUV;
        vec3 tmpN = normalize(N*vNormal);
        vec3 diff = C1 * L22 * (tmpN.x*tmpN.x - tmpN.y*tmpN.y) + C3 * L20 * tmpN.z*tmpN.z + C4 * L00 - C5 * L20 + 2.0 * C1 * L2m2*tmpN.x*tmpN.y + 2.0 * C1 * L21*tmpN.x*tmpN.z + 2.0 * C1 * L2m1*tmpN.y*tmpN.z + 2.0 * C2 * L11*tmpN.x +2.0 * C2 * L1m1*tmpN.y +2.0 * C2 * L10*tmpN.z;
        diff *= scaleFactor;
        diffuse = vec4(diff, 1);
        gl_Position = P*(MV*vec4(vVertex,1));
    }
  3. The per-vertex color calculated by the vertex shader is interpolated by the rasterizer and then the fragment shader sets the color as the current fragment color.
    #version 330 core
    uniform sampler2D textureMap;
    uniform float useDefault;
    smooth in vec4 diffuse;
    smooth in vec2 vUVout;
    layout(location=0) out vec4 vFragColor;
    void main() {
        vFragColor = mix(texture(textureMap, vUVout)*diffuse, diffuse, useDefault);
    }

How it works…

Spherical harmonics lighting approximates the incoming light using a small set of coefficients and the spherical harmonics basis functions. The coefficients are obtained beforehand from an HDR/RGBE image file that contains information about lighting. This allows us to approximate the same light so the graphical scene feels more immersive.

The method reproduces accurate diffuse reflection using information extracted from an HDR/RGBE light probe. The light probe itself is not accessed in the code. The spherical harmonics basis and coefficients are extracted from the original light probe using projection. Since this is a mathematically involved process, we refer the interested readers to the references in the See also section. The code for generating the spherical harmonics coefficients is available online. We used this code to generate the spherical harmonics coefficients for the shader.

Spherical harmonics provide a frequency space representation of an image on a sphere. As was shown by Ramamoorthi and Hanrahan, the first nine spherical harmonic coefficients are enough to give a reasonable approximation of the diffuse reflection component of a surface. The diffuse component is obtained by weighting these coefficients with constant, linear, and quadratic polynomials of the surface normal components. The result is then multiplied by a per-channel scale factor (scaleFactor, which the shader defines from the L00 and L22 coefficients), as shown in the following code snippet:

vec3 tmpN = normalize(N*vNormal);
vec3 diff = C1 * L22 * (tmpN.x*tmpN.x - tmpN.y*tmpN.y) + C3 * L20 * tmpN.z*tmpN.z + C4 * L00 - C5 * L20 + 2.0 * C1 * L2m2*tmpN.x*tmpN.y + 2.0 * C1 * L21*tmpN.x*tmpN.z + 2.0 * C1 * L2m1*tmpN.y*tmpN.z + 2.0 * C2 * L11*tmpN.x + 2.0 * C2 * L1m1*tmpN.y + 2.0 * C2 * L10*tmpN.z;
diff *= scaleFactor;

The obtained per-vertex diffuse component is then forwarded through the rasterizer to the fragment shader where it is directly multiplied by the texture of the surface.

vFragColor = mix(texture(textureMap, vUVout)*diffuse, diffuse, useDefault);
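
The same polynomial can be evaluated on the CPU, which is handy for checking shader output against known values. The following C++ sketch mirrors the vertex shader expression term by term; the Vec3 and SHCoeffs types and the shIrradiance name are our own helpers, and the final scaleFactor multiplication is deliberately left out.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 operator*(float s, const Vec3& v) { return {s * v.x, s * v.y, s * v.z}; }
Vec3 operator+(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

// Nine RGB lighting coefficients, named as in the vertex shader.
struct SHCoeffs { Vec3 L00, L1m1, L10, L11, L2m2, L2m1, L20, L21, L22; };

// Evaluates the quadratic polynomial in the unit normal n, term by term
// as in the shader (without the scaleFactor multiplication).
Vec3 shIrradiance(const Vec3& n, const SHCoeffs& c) {
    assert(std::fabs(n.x * n.x + n.y * n.y + n.z * n.z - 1.0f) < 1e-3f);
    const float C1 = 0.429043f, C2 = 0.511664f, C3 = 0.743125f,
                C4 = 0.886227f, C5 = 0.247708f;
    return (C1 * (n.x * n.x - n.y * n.y)) * c.L22
         + (C3 * n.z * n.z) * c.L20
         + C4 * c.L00
         - C5 * c.L20
         + (2.0f * C1 * n.x * n.y) * c.L2m2
         + (2.0f * C1 * n.x * n.z) * c.L21
         + (2.0f * C1 * n.y * n.z) * c.L2m1
         + (2.0f * C2 * n.x) * c.L11
         + (2.0f * C2 * n.y) * c.L1m1
         + (2.0f * C2 * n.z) * c.L10;
}
```

With only L00 set, the result is C4 times L00 for every normal, which corresponds to a constant ambient term; the remaining coefficients add the directional variation.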

There's more…

The demo application implementing this recipe renders the same scene as in the previous recipes, as shown in the following figure. We can rotate the camera view using the left mouse button, and the point light source using the right mouse button. Pressing the Space bar toggles the use of spherical harmonics. When spherical harmonics lighting is on, we get the following result:

Without the spherical harmonics lighting, the result is as follows:

The probe image used for this rendering is shown in the following figure:

Note that this method approximates global illumination by modifying the diffuse component using the spherical harmonics coefficients. We can also add the conventional Blinn-Phong lighting model as we did in the earlier recipes. For that, we would only need to evaluate the Blinn-Phong lighting model using the normal and light position.

See also

Ravi Ramamoorthi and Pat Hanrahan, An Efficient Representation for Irradiance Environment Maps:

Implementing GPU-based ray tracing

To this point, all of the recipes rendered 3D geometry using rasterization. In this recipe, we will implement another method for rendering geometry, which is called ray tracing. Simply put, ray tracing casts a probing ray from the camera position through each pixel into the graphical scene. The ray is tested for intersection against each piece of geometry, and the nearest intersection determines what is visible at that pixel. An advantage of this method is that only the visible surfaces are shaded.

The ray tracing algorithm can be given in pseudocode as follows:

For each pixel on screen
  Get the eye ray origin and direction using camera position
  For the amount of traces required
    Cast the ray into scene
    For each object in the scene
      Check eye ray for intersection 
      If intersection found
        Determine the hit point and surface normal 
        For each light source 
          Calculate diffuse and specular comp. at hit point
          Cast shadow ray from hit point to light 
        End For
        Darken diffuse component based on shadow result
        Set the hit point as the new ray origin
        Reflect the eye ray direction at surface normal
      End If
    End For
  End For
End For
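
The step "Reflect the eye ray direction at surface normal" in the pseudocode can be sketched in C++ as follows. The Vec3 type and the reflectDir name are assumptions for illustration; in GLSL, the built-in reflect function performs the same computation.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Mirrors the incoming direction d about the unit surface normal n:
// r = d - 2 (d . n) n
Vec3 reflectDir(const Vec3& d, const Vec3& n) {
    assert(std::fabs(dot(n, n) - 1.0f) < 1e-3f);  // n is expected to be normalized
    float k = 2.0f * dot(d, n);
    return {d.x - k * n.x, d.y - k * n.y, d.z - k * n.z};
}
```

For a ray travelling along (1, -1, 0) that hits a floor with normal (0, 1, 0), the reflected direction is (1, 1, 0).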

Getting ready

The code for this recipe is contained in the Chapter6/GPURaytracing directory.

How to do it…

Let us start with this recipe by following these simple steps:

  1. Load the Obj mesh model using the Obj loader and store mesh geometry in vectors. Note that for the GPU ray tracer we use the original vertices and indices lists stored in the OBJ file.
    vector<unsigned short> indices2;
    vector<glm::vec3> vertices2;
    if(!obj.Load(mesh_filename.c_str(), meshes, vertices, indices, materials, aabb, vertices2, indices2)) { 
      cout<<"Cannot load the OBJ mesh"<<endl;
      exit(EXIT_FAILURE);
    }
  2. Load the material texture maps into an OpenGL texture array instead of loading each texture separately, as in previous recipes. We opted for texture arrays because they simplify the shader code: the number of samplers we would otherwise require depends on the material textures in the model, so it cannot be known in advance. In previous recipes, there was a single texture sampler which was modified for each sub-mesh.
    for(size_t k=0;k<materials.size();k++) {
      if(materials[k]->map_Kd != "") { 
        if(k==0) {
          glGenTextures(1, &textureID);
          glBindTexture(GL_TEXTURE_2D_ARRAY, textureID);
          glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
          //set other texture parameters
        }
        //set image name
        GLubyte* pData = SOIL_load_image(full_filename.c_str(), &texture_width, &texture_height, &channels, SOIL_LOAD_AUTO);
        if(pData == NULL) {
          cerr<<"Cannot load image: "<<full_filename.c_str()<<endl;
          exit(EXIT_FAILURE);
        } 
        //flip the image and set the image format
        if(k==0) {
          glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, format, texture_width, texture_height, total, 0, format, GL_UNSIGNED_BYTE, NULL);
        }
        glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0,0,0,k, texture_width, texture_height, 1, format, GL_UNSIGNED_BYTE, pData);
        SOIL_free_image_data(pData);
      }
    }
  3. Store the vertex positions into a texture for the ray tracing shader. We use a floating point texture with the GL_RGBA32F internal format.
    glGenTextures(1, &texVerticesID);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture( GL_TEXTURE_2D, texVerticesID);
    //set the texture formats
    GLfloat* pData = new GLfloat[vertices2.size()*4]; 
    int count = 0;
    for(size_t i=0;i<vertices2.size();i++) { 
      pData[count++] = vertices2[i].x;
      pData[count++] = vertices2[i].y;
      pData[count++] = vertices2[i].z; 
      pData[count++] = 0; 
    }
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, vertices2.size(),1, 0, GL_RGBA, GL_FLOAT, pData);
    delete [] pData;
  4. Store the list of indices into an integral texture for the ray tracing shader. Note that for this texture, the internal format is GL_RGBA16I and the format is GL_RGBA_INTEGER.
    glGenTextures(1, &texTrianglesID);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture( GL_TEXTURE_2D, texTrianglesID);
    //set the texture formats
    GLushort* pData2 = new GLushort[indices2.size()];
    count = 0;
    for(size_t i=0;i<indices2.size();i+=4) {
      pData2[count++] = (indices2[i]);
      pData2[count++] = (indices2[i+1]);
      pData2[count++] = (indices2[i+2]); 
      pData2[count++] = (indices2[i+3]);
    }
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16I, indices2.size()/4, 1, 0, GL_RGBA_INTEGER, GL_UNSIGNED_SHORT, pData2); 
    delete [] pData2;
    
  5. In the render function, bind the ray tracing shader and then draw a full-screen quad to invoke the fragment shader for the entire screen.

How it works…

The main code for ray tracing is the ray tracing fragment shader (Chapter6/GPURaytracing/shaders/raytracer.frag). We first set up the camera ray origin and direction using the parameters passed to the shader as shader uniforms.

eyeRay.origin = eyePos;
cam.U = (invMVP*vec4(1,0,0,0)).xyz; 
cam.V = (invMVP*vec4(0,1,0,0)).xyz; 
cam.W = (invMVP*vec4(0,0,1,0)).xyz; 
cam.d = 1;
eyeRay.dir = get_direction(uv , cam);
eyeRay.dir += cam.U*uv.x;
eyeRay.dir += cam.V*uv.y;

After the eye ray is set up, we check the ray against the axis-aligned bounding box (AABB) of the scene. If there is an intersection, we continue further. For this simple example, we use a brute force method of looping through all of the triangles and testing each of them in turn for ray intersection.
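
The bounding box check is not listed in this section, so the following C++ sketch shows one common way to do it, the slab method. We are assuming this technique for illustration; the recipe's shader may implement the test differently, and the Vec3 type and intersectAABB name are our own.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <utility>

struct Vec3 { float x, y, z; };

// Slab test: intersects the ray origin + t*dir with the box [bmin, bmax].
// On a hit, tNear and tFar hold the entry and exit parameters.
bool intersectAABB(const Vec3& origin, const Vec3& dir,
                   const Vec3& bmin, const Vec3& bmax,
                   float& tNear, float& tFar) {
    assert(bmin.x <= bmax.x && bmin.y <= bmax.y && bmin.z <= bmax.z);
    const float o[3]  = {origin.x, origin.y, origin.z};
    const float d[3]  = {dir.x, dir.y, dir.z};
    const float lo[3] = {bmin.x, bmin.y, bmin.z};
    const float hi[3] = {bmax.x, bmax.y, bmax.z};
    tNear = -std::numeric_limits<float>::infinity();
    tFar  =  std::numeric_limits<float>::infinity();
    for (int a = 0; a < 3; ++a) {
        if (std::fabs(d[a]) < 1e-8f) {
            // Ray is parallel to this slab: it must start between the planes.
            if (o[a] < lo[a] || o[a] > hi[a]) return false;
            continue;
        }
        float t0 = (lo[a] - o[a]) / d[a];
        float t1 = (hi[a] - o[a]) / d[a];
        if (t0 > t1) std::swap(t0, t1);
        tNear = std::max(tNear, t0);
        tFar  = std::min(tFar, t1);
        if (tNear > tFar) return false;  // slab intervals do not overlap
    }
    return tFar >= 0.0f;  // reject boxes entirely behind the ray origin
}
```

Because this test is cheap, rays that miss the scene entirely can skip the triangle loop altogether.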

In ray tracing, we try to find the nearest intersection of a parametric ray with the scene triangles. Any point along the ray is obtained by using a parameter t, and the nearest intersection is the one with the smallest positive t value. If there is an intersection and it is the closest so far, we store the collision information and the normal at the intersection point. The t parameter gives us the exact position where the intersection occurs.

vec4 val=vec4(t,0,0,0);
vec3 N;
for(int i=0;i<int(TRIANGLE_TEXTURE_SIZE);i++)
{
  vec3 normal;
  vec4 res = intersectTriangle(eyeRay.origin, eyeRay.dir, i, normal); 
  if(res.x>0 && res.x <= val.x) { 
    val = res;
    N = normal;
  }
}
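
The loop above calls intersectTriangle, whose body is not shown in this section. As a hedged illustration, the following C++ sketch uses the Möller-Trumbore algorithm, a widely used ray-triangle test that we assume here; it is not necessarily what the recipe's shader does. The signature differs from the shader version (it takes the three vertices directly instead of a triangle index), and all helper names are our own.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Moller-Trumbore ray-triangle test. On a hit, t is the ray parameter and
// (u, v) are the barycentric coordinates of the hit point.
bool intersectTriangle(const Vec3& o, const Vec3& d,
                       const Vec3& v0, const Vec3& v1, const Vec3& v2,
                       float& t, float& u, float& v) {
    assert(dot(d, d) > 0.0f);  // the ray direction must not be the zero vector
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(d, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < 1e-8f) return false;  // ray is parallel to the triangle
    float inv = 1.0f / det;
    Vec3 tv = sub(o, v0);
    u = dot(tv, p) * inv;
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(tv, e1);
    v = dot(d, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return false;
    t = dot(e2, q) * inv;
    return t > 0.0f;  // only accept hits in front of the ray origin
}
```

The barycentric coordinates returned alongside t can be used to interpolate per-vertex attributes, such as texture coordinates, at the hit point.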

When we plug the stored t value into the parametric equation of the ray, we get the hit point. Then, we calculate a vector to the light from the hit point. This vector is then used to estimate the diffuse component and the attenuation amount.

if(val.x != t) {
  vec3 hit = eyeRay.origin + eyeRay.dir*val.x;
  vec3  jitteredLight  =  light_position + uniformlyRandomVector(gl_FragCoord.x);
  vec3 L = (jitteredLight.xyz-hit);
  float d = length(L);
  L = normalize(L);
  float diffuse = max(0, dot(N, L));
  float attenuationAmount = 1.0/(k0 + (k1*d) + (k2*d*d));
  diffuse *= attenuationAmount;

With ray tracing, shadows are very easy to calculate. We simply cast another ray, but this time we only check whether the ray intersects any object on its way to the light source. If it does, we darken the final color; otherwise, we leave the color as is. Note that to prevent shadow acne, we add a slight offset to the ray start position.

  float inShadow = shadow(hit+ N*0.0001, L);
  vFragColor = inShadow*diffuse*mix(texture(textureMaps, val.yzw), vec4(1), (val.w==255) );
  return;
}
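
To make the diffuse and attenuation arithmetic in the hit branch above concrete, here is a small CPU-side sketch of the same formula. The helper name attenuatedDiffuse and the coefficient values used in the usage note are assumptions chosen for illustration; the shader's actual k0, k1, and k2 values are not listed in this recipe.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Lambertian diffuse term scaled by constant/linear/quadratic attenuation,
// following the shader lines: diffuse = max(0, N.L) / (k0 + k1*d + k2*d*d).
// N is assumed to be unit length; k0, k1, k2 are caller-supplied coefficients.
float attenuatedDiffuse(Vec3 N, Vec3 hit, Vec3 lightPos,
                        float k0, float k1, float k2) {
  Vec3 L = {lightPos.x - hit.x, lightPos.y - hit.y, lightPos.z - hit.z};
  float d = std::sqrt(dot(L, L));           // distance from hit point to light
  assert(d > 0.0f);
  L = {L.x / d, L.y / d, L.z / d};          // normalize the light vector
  float diffuse = std::max(0.0f, dot(N, L));
  return diffuse / (k0 + k1 * d + k2 * d * d);
}
```

For example, with a light two units directly above an upward-facing surface and the assumed coefficients k0 = 1, k1 = 0, k2 = 0.25, the diffuse term is 1 and the attenuation is 1/(1 + 0 + 1) = 0.5.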

There's more…

The demo application for this recipe renders the same scene as in previous recipes. The scene can be toggled between rasterization and GPU ray tracing by pressing the Space bar. We can see that the shadows are clearly visible in the ray traced scene. Note that the performance of GPU ray tracing is directly related to how close or far the object is from the camera, as well as to how many triangles there are in the rendered mesh. For better performance, an acceleration structure, such as a uniform grid or a kd-tree, should be employed. Also note that soft shadows require us to cast more shadow rays, which puts additional strain on the ray tracing fragment shader.
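
As a rough illustration of the uniform grid idea mentioned above, the sketch below bins triangle indices into the grid cells overlapped by each triangle's bounding box, so that a query only has to test the triangles stored in one cell. The UniformGrid class and its members are hypothetical and are not part of the recipe's code; a complete ray tracer would additionally walk the ray through the cells (for example with a 3D DDA), which is omitted here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical uniform grid over the scene bounds [lo, hi] with res^3 cells.
class UniformGrid {
 public:
  UniformGrid(Vec3 lo, Vec3 hi, int res)
      : lo_(lo), hi_(hi), res_(res),
        cells_(static_cast<std::size_t>(res) * res * res) {
    assert(res > 0);
  }

  // Register triangle `tri` in every cell overlapped by its bounding box.
  void insert(int tri, Vec3 a, Vec3 b, Vec3 c) {
    int x0 = coord(std::min({a.x, b.x, c.x}), lo_.x, hi_.x);
    int x1 = coord(std::max({a.x, b.x, c.x}), lo_.x, hi_.x);
    int y0 = coord(std::min({a.y, b.y, c.y}), lo_.y, hi_.y);
    int y1 = coord(std::max({a.y, b.y, c.y}), lo_.y, hi_.y);
    int z0 = coord(std::min({a.z, b.z, c.z}), lo_.z, hi_.z);
    int z1 = coord(std::max({a.z, b.z, c.z}), lo_.z, hi_.z);
    for (int z = z0; z <= z1; ++z)
      for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x)
          cells_[index(x, y, z)].push_back(tri);
  }

  // Triangles that may contain point p: only these need an exact test.
  const std::vector<int>& candidates(Vec3 p) const {
    return cells_[index(coord(p.x, lo_.x, hi_.x), coord(p.y, lo_.y, hi_.y),
                        coord(p.z, lo_.z, hi_.z))];
  }

 private:
  // Map a coordinate to a cell index along one axis, clamped to the grid.
  int coord(float v, float lo, float hi) const {
    int c = static_cast<int>((v - lo) / (hi - lo) * res_);
    return std::min(std::max(c, 0), res_ - 1);
  }
  std::size_t index(int x, int y, int z) const {
    return (static_cast<std::size_t>(z) * res_ + y) * res_ + x;
  }

  Vec3 lo_, hi_;
  int res_;
  std::vector<std::vector<int>> cells_;
};
```

The trade-off is memory and build time against per-ray work: with a well-chosen resolution, each ray tests only the triangles in the cells it passes through instead of the whole mesh.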

See also

Timothy Purcell, Ian Buck, William R. Mark, and Pat Hanrahan, Ray Tracing on Programmable Graphics Hardware, ACM Transactions on Graphics 21 (3), pages 703-712

Implementing GPU-based path tracing

In this recipe, we will implement another method for rendering geometry, called path tracing. Like ray tracing, path tracing casts rays, but it scatters them in random directions as they travel through the scene. The pseudocode below starts the rays at the light position(s), whereas the shader in this recipe starts each path at the eye. Since the lighting integral is usually too difficult to evaluate exactly, we approximate it using Monte Carlo integration. These methods use random sampling and, given enough samples, the result converges to the true solution.
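
To illustrate the Monte Carlo idea in isolation, the following sketch estimates a simple lighting-style integral: the integral of cos(theta) over the hemisphere, whose exact value is pi. It samples directions uniformly over the hemisphere (probability density 1/(2*pi)) and averages f/pdf. The function name estimateCosineIntegral, the seed, and the sample counts are assumptions for illustration and do not come from the recipe's shader.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Monte Carlo estimate of the integral of cos(theta) over the hemisphere.
// For directions sampled uniformly over the hemisphere, cos(theta) is itself
// uniformly distributed in [0, 1), so one random number per sample suffices.
double estimateCosineIntegral(int samples, unsigned seed) {
  assert(samples > 0);
  const double kPi = 3.14159265358979323846;
  const double pdf = 1.0 / (2.0 * kPi);     // uniform density over the hemisphere
  std::mt19937 rng(seed);
  std::uniform_real_distribution<double> uni(0.0, 1.0);
  double sum = 0.0;
  for (int i = 0; i < samples; ++i) {
    double cosTheta = uni(rng);             // integrand value for this sample
    sum += cosTheta / pdf;                  // weight each sample by 1/pdf
  }
  return sum / samples;                     // the average converges to pi
}
```

The statistical error of such an estimator shrinks roughly with the square root of the sample count, which is why a path tracer needs many samples per pixel before the noise becomes acceptable.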

We can give the path tracing pseudocode as follows:

For each pixel on screen
  Create a light ray from light position in a random direction 
  For the amount of traces required           
    For each object in the scene
      Check light ray for intersection 
      If intersection found
        Determine the hit point and surface normal 
        Calculate diffuse and specular comp. at hit point
        Cast shadow ray in random direction from hit point  
        Darken diffuse component based on shadow result
        Set the randomly jittered hit point as new ray origin
        Reflect the light ray direction at surface normal
      End If
    End For
  End For
End For

Getting ready

The code for this recipe is contained in the Chapter6/GPUPathtracing directory.

How to do it…

Let us start with this recipe by following these simple steps:

  1. Load the Obj mesh model using the Obj loader and store the mesh geometry in vectors. Note that for the GPU path tracer we use the original vertices and indices lists stored in the OBJ file, as in the previous recipe.
  2. Load the material texture maps into an OpenGL texture array instead of loading each texture separately as in the previous recipe.
  3. Store the vertex positions into a texture for the path tracing shader, similar to how we stored them for ray tracing in the previous recipe. We use a floating point texture with the GL_RGBA32F internal format.
    glGenTextures(1, &texVerticesID);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture( GL_TEXTURE_2D, texVerticesID);
    //set the texture formats
    GLfloat* pData = new GLfloat[vertices2.size()*4];
    int count = 0;
    for(size_t i=0;i<vertices2.size();i++) { 
      pData[count++] = vertices2[i].x;
      pData[count++] = vertices2[i].y;
      pData[count++] = vertices2[i].z; 
      pData[count++] = 0;
    }
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, vertices2.size(),1, 0, GL_RGBA, GL_FLOAT, pData);
    delete [] pData;
  4. Store the indices list into an integral texture for the path tracing shader, as was done for the ray tracing recipe. Note that for this texture, the internal format is GL_RGBA16I and format is GL_RGBA_INTEGER.
    glGenTextures(1, &texTrianglesID);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture( GL_TEXTURE_2D, texTrianglesID);
    //set the texture formats
    GLushort* pData2 = new GLushort[indices2.size()];
    count = 0;
    for(size_t i=0;i<indices2.size();i+=4) {
      pData2[count++] = (indices2[i]);
      pData2[count++] = (indices2[i+1]);
      pData2[count++] = (indices2[i+2]); 
      pData2[count++] = (indices2[i+3]);
    }
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16I, indices2.size()/4, 1, 0, GL_RGBA_INTEGER, GL_UNSIGNED_SHORT, pData2);
    delete [] pData2;
    
  5. In the render function, bind the path tracing shader and then draw a full-screen quad to invoke the fragment shader for the entire screen.
    pathtraceShader.Use();	
      glUniform3fv(pathtraceShader("eyePos"), 1, glm::value_ptr(eyePos));
      glUniform1f(pathtraceShader("time"), current);
      glUniform3fv(pathtraceShader("light_position"),1, &(lightPosOS.x));
      glUniformMatrix4fv(pathtraceShader("invMVP"), 1, GL_FALSE, glm::value_ptr(invMVP));
      DrawFullScreenQuad();
    pathtraceShader.UnUse();

How it works…

The main code for path tracing is carried out in the path tracing fragment shader (Chapter6/GPUPathtracing/shaders/pathtracer.frag). We first set up the camera ray origin and direction using the parameters passed to the shader as shader uniforms.

eyeRay.origin = eyePos;
cam.U = (invMVP*vec4(1,0,0,0)).xyz;
cam.V = (invMVP*vec4(0,1,0,0)).xyz;
cam.W = (invMVP*vec4(0,0,1,0)).xyz;
cam.d = 1;
eyeRay.dir = get_direction(uv , cam);
eyeRay.dir += cam.U*uv.x;
eyeRay.dir += cam.V*uv.y;

After the eye ray is set up, we check the ray against the scene's axis-aligned bounding box. If there is an intersection, we call our path trace function.

vec2 tNearFar = intersectCube(eyeRay.origin, eyeRay.dir,  aabb);
if(tNearFar.x<tNearFar.y  ) {
  t = tNearFar.y+1; 
  vec3 light = light_position + uniformlyRandomVector(time) * 0.1;
  vFragColor = vec4(pathtrace(eyeRay.origin, eyeRay.dir, light, t),1);
}
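
The body of intersectCube is not listed here. A common way to implement such a ray/box test is the slab method, sketched below on the CPU. The names AABB and intersectCubeCPU are hypothetical. The function returns the entry and exit parameters as a pair, and the box is hit when the first value is smaller than the second, which matches the tNearFar.x < tNearFar.y check used above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <utility>

struct Vec3 { float x, y, z; };
struct AABB { Vec3 lo, hi; };

// Slab test: intersect the ray with the three pairs of axis-aligned planes and
// keep the latest entry (tNear) and the earliest exit (tFar).
// A miss is reported as a pair with tNear > tFar.
std::pair<float, float> intersectCubeCPU(const Vec3& origin, const Vec3& dir,
                                         const AABB& box) {
  assert(box.lo.x <= box.hi.x && box.lo.y <= box.hi.y && box.lo.z <= box.hi.z);
  const float o[3] = {origin.x, origin.y, origin.z};
  const float d[3] = {dir.x, dir.y, dir.z};
  const float lo[3] = {box.lo.x, box.lo.y, box.lo.z};
  const float hi[3] = {box.hi.x, box.hi.y, box.hi.z};
  float tNear = -std::numeric_limits<float>::infinity();
  float tFar = std::numeric_limits<float>::infinity();
  for (int a = 0; a < 3; ++a) {
    if (std::fabs(d[a]) < 1e-8f) {
      // Ray is parallel to this slab: it misses unless the origin lies inside.
      if (o[a] < lo[a] || o[a] > hi[a]) return {1.0f, 0.0f};
      continue;
    }
    float t1 = (lo[a] - o[a]) / d[a];
    float t2 = (hi[a] - o[a]) / d[a];
    tNear = std::max(tNear, std::min(t1, t2));
    tFar = std::min(tFar, std::max(t1, t2));
  }
  return {tNear, tFar};
}
```

Because the exit parameter is an upper bound for any hit inside the box, it is a natural initial value for the nearest-hit search, which is how t is seeded in the shader excerpt.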

In the path trace function, we run a loop that iterates for a number of passes. In each pass, we check the scene geometry for an intersection with the ray. We use a brute force method of looping through all of the triangles and testing each of them in turn for collision. If we have an intersection, we check to see if this is the nearest intersection. If it is, we store the normal and the texture coordinates at the intersection point.

for(int bounce = 0; bounce < MAX_BOUNCES; bounce++) {
  vec2 tNearFar = intersectCube(origin, ray,  aabb);
  if(  tNearFar.x > tNearFar.y)
    continue;
  if(tNearFar.y<t)
    t = tNearFar.y+1;
  vec3 N;
  vec4 val=vec4(t,0,0,0);
  for(int i=0;i<int(TRIANGLE_TEXTURE_SIZE);i++)
  {
    vec3 normal;
    vec4 res = intersectTriangle(origin, ray, i, normal);
    if(res.x>0.001 && res.x <  val.x) {
      val = res;
      N = normal;
    }
  }

We then compare the stored t value against the current limit to see whether a nearest intersection was found. If so, we sample the texture array to obtain the surface color at the hit point. We then move the ray origin to the hit point and set the ray direction to a uniform random direction in the hemisphere above the intersection point.

if(val.x < t) {
  surfaceColor = mix(texture(textureMaps, val.yzw), vec4(1), (val.w==255) ).xyz;
  vec3 hit = origin + ray * val.x;
  origin = hit;
  ray = uniformlyRandomDirection(time + float(bounce));

The diffuse component is then estimated and the color is accumulated. After the loop ends, the final accumulated color is returned.

  vec3  jitteredLight  =  light + ray;
  vec3 L = normalize(jitteredLight - hit);
  diffuse = max(0.0, dot(L, N));
  colorMask *= surfaceColor;	
  float inShadow = shadow(hit+ N*0.0001, L);
  accumulatedColor += colorMask * diffuse * inShadow; 
  t = val.x;
}
  }
  if(accumulatedColor == vec3(0))
    return surfaceColor*diffuse;
  else
    return accumulatedColor/float(MAX_BOUNCES-1);
}

Note that the path tracing output is noisy and a large number of samples are needed to get a less noisy result.
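
The bounce step described above needs a random direction in the hemisphere around the surface normal. The shader's uniformlyRandomDirection function is not listed in this recipe, so the following CPU-side sketch shows one common approach under stated assumptions: pick a uniformly distributed direction on the unit sphere and flip it if it points below the surface. The name randomHemisphereDirection and the use of std::mt19937 are illustrative choices, not the recipe's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Uniform random unit vector in the hemisphere around the unit normal N.
Vec3 randomHemisphereDirection(const Vec3& N, std::mt19937& rng) {
  std::uniform_real_distribution<float> uni(0.0f, 1.0f);
  const float kTwoPi = 6.28318530718f;
  // Uniform point on the unit sphere: z uniform in [-1, 1], angle uniform.
  float z = 1.0f - 2.0f * uni(rng);
  float phi = kTwoPi * uni(rng);
  float r = std::sqrt(std::max(0.0f, 1.0f - z * z));
  Vec3 d = {r * std::cos(phi), r * std::sin(phi), z};
  // Mirror directions that point into the surface back to the upper side.
  if (dot(d, N) < 0.0f) d = {-d.x, -d.y, -d.z};
  return d;
}
```

Flipping preserves uniformity because the sphere distribution is symmetric about the plane perpendicular to N.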

There's more…

The demo application for this recipe renders the same scene as in previous recipes. The scene can be toggled between rasterization and GPU path tracing by pressing the Space bar.

Note that the performance of GPU path tracing is directly related to how close or far the object is from the camera, as well as to how many triangles there are in the rendered mesh. To reduce the number of intersection tests, an acceleration structure, such as a uniform grid or a kd-tree, should be employed. In addition, since path tracing results are generally noisier than ray tracing results, a noise removal filter, such as Gaussian smoothing, could be applied to the path traced result.
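
As a minimal sketch of the Gaussian smoothing suggested above, the function below applies a 3x3 Gaussian kernel to a grayscale image stored row by row in a std::vector. It runs on the CPU purely for illustration; the function name gaussianBlur3x3, the single-channel layout, and the clamp-to-edge border handling are assumptions, and in the demo such a filter would more likely run as a post-processing fragment shader.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 3x3 Gaussian blur with the kernel 1-2-1 / 2-4-2 / 1-2-1 divided by 16.
// src is a row-major grayscale image; coordinates outside the image are
// clamped to the nearest edge pixel.
std::vector<float> gaussianBlur3x3(const std::vector<float>& src, int width,
                                   int height) {
  assert(width > 0 && height > 0);
  assert(src.size() == static_cast<std::size_t>(width) * height);
  static const float kernel[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
  std::vector<float> dst(src.size());
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      float sum = 0.0f;
      for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
          int sx = std::min(std::max(x + dx, 0), width - 1);
          int sy = std::min(std::max(y + dy, 0), height - 1);
          sum += kernel[dy + 1][dx + 1] * src[sy * width + sx];
        }
      }
      dst[y * width + x] = sum / 16.0f;
    }
  }
  return dst;
}
```

Smoothing trades noise for blur, so it hides residual noise at the cost of softening sharp texture detail and shadow edges.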

Ray tracing is poor at approximating global illumination and soft shadows. Path tracing, on the other hand, handles global illumination and soft shadows well, but it suffers from noise. To get a good result, it requires a large number of random samples. There are other techniques, such as Metropolis light transport, which uses heuristics to accept good sample points and reject bad ones. As a result, it converges to a less noisy result faster than naïve path tracing.

See also

Getting ready

The code for this recipe is contained in the Chapter6/GPUPathtracing directory.

How to do it…

Let us start with this recipe by following these simple steps:

  1. Load the Obj mesh model using the Obj loader and store the mesh geometry in vectors. Note that for the GPU path tracer we use the original vertices and indices lists stored in the OBJ file, as in the previous recipe.
  2. Load the material texture maps into an OpenGL texture array instead of loading each texture separately as in the previous recipe.
  3. Store the vertex positions into a texture for the path tracing shader, similar to how we stored them for ray tracing in the previous recipe. We use a floating point texture with the GL_RGBA32F internal format.
    glGenTextures(1, &texVerticesID);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture( GL_TEXTURE_2D, texVerticesID);
    //set the texture formats
    GLfloat* pData = new GLfloat[vertices2.size()*4];
    int count = 0;
    for(size_t i=0;i<vertices2.size();i++) { 
      pData[count++] = vertices2[i].x;
      pData[count++] = vertices2[i].y;
      pData[count++] = vertices2[i].z; 
      pData[count++] = 0;
    }
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, vertices2.size(),1, 0, GL_RGBA, GL_FLOAT, pData);
    delete [] pData;
  4. Store the indices list into an integral texture for the path tracing shader, as was done for the ray tracing recipe. Note that for this texture, the internal format is GL_RGBA16I and format is GL_RGBA_INTEGER.
    glGenTextures(1, &texTrianglesID);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture( GL_TEXTURE_2D, texTrianglesID);
    //set the texture formats
    GLushort* pData2 = new GLushort[indices2.size()];
    count = 0;
    for(size_t i=0;i<indices2.size();i+=4) {
      pData2[count++] = (indices2[i]);
      pData2[count++] = (indices2[i+1]);
      pData2[count++] = (indices2[i+2]); 
      pData2[count++] = (indices2[i+3]);
    }
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16I, indices2.size()/4, 1, 0, GL_RGBA_INTEGER, GL_UNSIGNED_SHORT, pData2);
    delete [] pData2;
    
  5. In the render function, bind the path tracing shader and then draw a full-screen quad to invoke the fragment shader for the entire screen.
    pathtraceShader.Use();	
      glUniform3fv(pathtraceShader("eyePos"), 1, glm::value_ptr(eyePos));
      glUniform1f(pathtraceShader("time"), current);
      glUniform3fv(pathtraceShader("light_position"),1, &(lightPosOS.x));
      glUniformMatrix4fv(pathtraceShader("invMVP"), 1, GL_FALSE, glm::value_ptr(invMVP));
      DrawFullScreenQuad();
    pathtraceShader.UnUse();

How it works…

The main code for path tracing is carried out in the path tracing fragment shader (Chapter6/GPUPathtracing/shaders/pathtracer.frag). We first set up the camera ray origin and direction using the parameters passed to the shader as shader uniforms.

eyeRay.origin = eyePos;
cam.U = (invMVP*vec4(1,0,0,0)).xyz;
cam.V = (invMVP*vec4(0,1,0,0)).xyz;
cam.W = (invMVP*vec4(0,0,1,0)).xyz;
cam.d = 1;
eyeRay.dir = get_direction(uv , cam);
eyeRay.dir += cam.U*uv.x;
eyeRay.dir += cam.V*uv.y;

After the eye ray is set up, we check the ray against the scene's axially aligned bounding box. If there is an intersection, we call our path trace function.

vec2 tNearFar = intersectCube(eyeRay.origin, eyeRay.dir,  aabb);
if(tNearFar.x<tNearFar.y  ) {
  t = tNearFar.y+1; 
  vec3 light = light_position + uniformlyRandomVector(time) * 0.1;
  vFragColor = vec4(pathtrace(eyeRay.origin, eyeRay.dir, light, t),1);
}

In the path trace function, we run a loop that iterates for a number of passes. In each pass, we check the scene geometry for an intersection with the ray. We use a brute force method of looping through all of the triangles and testing each of them in turn for collision. If we have an intersection, we check to see if this is the nearest intersection. If it is, we store the normal and the texture coordinates at the intersection point.

for(int bounce = 0; bounce < MAX_BOUNCES; bounce++) {
  vec2 tNearFar = intersectCube(origin, ray,  aabb);
  if(  tNearFar.x > tNearFar.y)
    continue;
  if(tNearFar.y<t)
    t = tNearFar.y+1;
  vec3 N;
  vec4 val=vec4(t,0,0,0);
  for(int i=0;i<int(TRIANGLE_TEXTURE_SIZE);i++)
  {
    vec3 normal;
    vec4 res = intersectTriangle(origin, ray, i, normal);
    if(res.x>0.001 && res.x <  val.x) {
      val = res;
      N = normal;
    }
  }

We then check the t parameter value to find the nearest intersection and then use the texture array to sample the appropriate texture for the output color value for the current fragment. We then change the current ray origin to the current hit point and then change the current ray direction to a uniform random direction in the hemisphere above the intersection point.

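if(val.x < t) {
  // mix() needs a float (or a matching bvec4) as its third argument, so convert the bool
  surfaceColor = mix(texture(textureMaps, val.yzw), vec4(1), float(val.w==255)).xyz;
  // continue the path from the hit point in a new random direction
  vec3 hit = origin + ray * val.x;
  origin = hit;
  ray = uniformlyRandomDirection(time + float(bounce));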
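
The implementation of uniformlyRandomDirection is not shown in this section, and its name suggests that it may sample the full sphere rather than a hemisphere. As a hedged illustration of one way to obtain a uniformly distributed direction in the hemisphere above a surface, the C++ sketch below samples the unit sphere from two random numbers in [0, 1) and flips any sample that points below the surface. The names `uniformSphere` and `uniformHemisphere` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

using Vec3 = std::array<float, 3>;  // hypothetical stand-in for vec3

// Uniform direction on the unit sphere from two numbers u, v in [0, 1).
Vec3 uniformSphere(float u, float v) {
    const float z = 1.0f - 2.0f * u;                         // z uniform in (-1, 1]
    const float r = std::sqrt(std::max(0.0f, 1.0f - z * z)); // radius of the z-slice
    const float phi = 6.2831853f * v;                        // angle around the z axis
    return {r * std::cos(phi), r * std::sin(phi), z};
}

// Uniform direction in the hemisphere around the unit normal n:
// sample the sphere and mirror samples that point away from n.
Vec3 uniformHemisphere(const Vec3& n, float u, float v) {
    Vec3 d = uniformSphere(u, v);
    const float cosTheta = d[0] * n[0] + d[1] * n[1] + d[2] * n[2];
    if (cosTheta < 0.0f) {
        for (float& c : d) c = -c;
    }
    return d;
}
```

Mirroring keeps the distribution uniform because the sphere sample is symmetric about the plane of the surface. In a shader, the two random numbers would typically come from a hash of the fragment position and the time uniform.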

The diffuse component is then estimated and the color is accumulated. After the loop ends, the final accumulated color is returned.

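  // offset the light by the new random direction
  vec3 jitteredLight = light + ray;
  vec3 L = normalize(jitteredLight - hit);
  diffuse = max(0.0, dot(L, N));
  colorMask *= surfaceColor;
  // start the shadow ray slightly above the surface to avoid self-intersection
  float inShadow = shadow(hit + N*0.0001, L);
  accumulatedColor += colorMask * diffuse * inShadow;
  t = val.x;
} // end of if(val.x < t)
} // end of the bounce loop
  if(accumulatedColor == vec3(0))
    return surfaceColor*diffuse;
  else
    return accumulatedColor/float(MAX_BOUNCES-1);
}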

Note that the path tracing output is noisy, and a large number of samples is needed to reduce the noise.
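
One common way to gather many samples is to average successive frames, each rendered with a different random seed. This is not part of the recipe as listed; the C++ sketch below only illustrates the idea with a running mean per pixel. The `Accumulator` type and the `Color` alias are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

using Color = std::array<float, 3>;  // hypothetical RGB pixel

// Keeps the running average of all frames added so far.
struct Accumulator {
    std::vector<Color> mean;
    std::size_t frames = 0;

    explicit Accumulator(std::size_t pixels)
        : mean(pixels, Color{0.0f, 0.0f, 0.0f}) {}

    // Blend one new noisy frame into the average:
    // mean += (sample - mean) / frameCount.
    void add(const std::vector<Color>& frame) {
        ++frames;
        const float w = 1.0f / static_cast<float>(frames);
        const std::size_t n = std::min(mean.size(), frame.size());
        for (std::size_t i = 0; i < n; ++i) {
            for (int c = 0; c < 3; ++c) {
                mean[i][c] += (frame[i][c] - mean[i][c]) * w;
            }
        }
    }
};
```

For independent samples, the noise in the average falls roughly with the square root of the number of frames, so halving the noise takes about four times as many frames. The average has to be reset whenever the camera or the scene changes.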

There's more…

The demo application for this recipe renders the same scene as in previous recipes. The scene can be toggled between rasterization and GPU path tracing by pressing the Space bar.

Note that the performance of GPU path tracing depends on how close the object is to the camera, as well as on how many triangles the rendered mesh contains. To reduce the number of intersection tests, an acceleration structure, such as a uniform grid or a kd-tree, should be employed. In addition, since path tracing results are generally noisier than ray tracing results, a noise removal filter, such as Gaussian smoothing, could be applied to the path traced result.
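
As a minimal illustration of the Gaussian smoothing mentioned above, the following C++ sketch applies a 3x3 binomial kernel (weights 1-2-1 in each direction, normalized by 16) to a single-channel image stored row by row. The function name, the image layout, and the clamped border handling are assumptions made for this example; a real implementation would more likely run as a post-processing fragment shader on the path traced texture.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// 3x3 Gaussian (binomial) blur of a w*h single-channel image, row-major.
// Pixels outside the image are replaced by the nearest edge pixel.
std::vector<float> gaussianBlur3x3(const std::vector<float>& img, int w, int h) {
    static const int k[3] = {1, 2, 1};  // separable 1-2-1 weights
    std::vector<float> out(img.size(), 0.0f);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float sum = 0.0f;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int sx = std::min(std::max(x + dx, 0), w - 1);
                    const int sy = std::min(std::max(y + dy, 0), h - 1);
                    const float weight = static_cast<float>(k[dx + 1] * k[dy + 1]);
                    sum += img[static_cast<std::size_t>(sy) * w + sx] * weight;
                }
            }
            out[static_cast<std::size_t>(y) * w + x] = sum / 16.0f;
        }
    }
    return out;
}
```

Smoothing trades noise for blur: it hides isolated bright pixels but also softens texture detail and shadow edges, so it complements a higher sample count rather than replacing it.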

Ray tracing on its own is poor at approximating global illumination and soft shadows. Path tracing, on the other hand, handles both well, but it suffers from noise and requires a large number of random samples to produce a good result. Other techniques, such as Metropolis light transport, mutate existing light paths and accept or reject each mutation probabilistically, so that more samples are spent on paths that contribute more light. As a result, they generally converge to a less noisy result faster than naïve path tracing.

How to do it…

Let us start with this recipe by following these simple steps:

  1. Load the OBJ mesh model using the OBJ loader and store the mesh geometry in vectors. Note that for the GPU path tracer we use the original vertex and index lists stored in the OBJ file, as in the previous recipe.
  2. Load the material texture maps into an OpenGL texture array instead of loading each texture separately as in the previous recipe.
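  3. Store the vertex positions into a texture for the path tracing shader, similar to how we stored them for ray tracing in the previous recipe. We use a floating-point texture with the GL_RGBA32F internal format.
    glGenTextures(1, &texVerticesID);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D, texVerticesID);
    // set the texture parameters (assumed; the listing omits these calls):
    // no mipmaps are uploaded, so nearest filtering keeps the texture complete
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    // pack each vertex position into one RGBA texel; the alpha channel is unused
    GLfloat* pData = new GLfloat[vertices2.size()*4];
    int count = 0;
    for(size_t i=0;i<vertices2.size();i++) {
      pData[count++] = vertices2[i].x;
      pData[count++] = vertices2[i].y;
      pData[count++] = vertices2[i].z;
      pData[count++] = 0;
    }
    // the texture is one texel high and as wide as the number of vertices
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, GLsizei(vertices2.size()), 1, 0, GL_RGBA, GL_FLOAT, pData);
    delete [] pData;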
  4. Store the index list into an integer texture for the path tracing shader, as was done for the ray tracing recipe. Note that for this texture, the internal format is GL_RGBA16I and the format is GL_RGBA_INTEGER.
    glGenTextures(1, &texTrianglesID);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture(GL_TEXTURE_2D, texTrianglesID);
    // set the texture parameters (assumed; the listing omits these calls):
    // integer textures must use nearest filtering
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    // copy the index list, four values per RGBA texel
    // (indices2.size() is expected to be a multiple of 4)
    GLushort* pData2 = new GLushort[indices2.size()];
    count = 0;
    for(size_t i=0;i<indices2.size();i+=4) {
      pData2[count++] = (indices2[i]);
      pData2[count++] = (indices2[i+1]);
      pData2[count++] = (indices2[i+2]);
      pData2[count++] = (indices2[i+3]);
    }
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16I, GLsizei(indices2.size()/4), 1, 0, GL_RGBA_INTEGER, GL_UNSIGNED_SHORT, pData2);
    delete [] pData2;
    
  5. In the render function, bind the path tracing shader, pass the shader uniforms, and then draw a full-screen quad to invoke the fragment shader for the entire screen.
    pathtraceShader.Use();
      // camera position, current time (random seed), light position, and inverse MVP matrix
      glUniform3fv(pathtraceShader("eyePos"), 1, glm::value_ptr(eyePos));
      glUniform1f(pathtraceShader("time"), current);
      glUniform3fv(pathtraceShader("light_position"), 1, &(lightPosOS.x));
      glUniformMatrix4fv(pathtraceShader("invMVP"), 1, GL_FALSE, glm::value_ptr(invMVP));
      DrawFullScreenQuad();
    pathtraceShader.UnUse();
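
The packing in steps 3 and 4 can also be written with std::vector instead of new and delete, which removes the manual memory management. The C++ sketch below shows only the CPU-side packing, without any OpenGL calls. The `Vec3` alias stands in for the vertex type, and `packPositions` and `indexTexelCount` are hypothetical helpers; the 32767 limit reflects the assumption that values must fit the signed 16-bit channels of GL_RGBA16I.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Vec3 = std::array<float, 3>;  // hypothetical stand-in for the vertex type

// One RGBA texel per vertex: x, y, z, and an unused alpha of 0.
std::vector<float> packPositions(const std::vector<Vec3>& vertices) {
    std::vector<float> texels;
    texels.reserve(vertices.size() * 4);
    for (const Vec3& p : vertices) {
        texels.insert(texels.end(), {p[0], p[1], p[2], 0.0f});
    }
    return texels;
}

// Checks that an index list can be uploaded as RGBA texels and returns the
// number of texels. Assumes four values per texel, each within 0..32767.
std::size_t indexTexelCount(const std::vector<std::uint16_t>& indices) {
    if (indices.size() % 4 != 0) {
        throw std::invalid_argument("index list must hold 4 values per texel");
    }
    for (std::uint16_t value : indices) {
        if (value > 32767) {
            throw std::out_of_range("value does not fit a signed 16-bit channel");
        }
    }
    return indices.size() / 4;
}
```

With this approach, the pointer returned by the vector's data() member would be passed to glTexImage2D in place of pData, and the texel count would be used as the texture width.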

See also

Tim Purcell, Ian Buck, William Mark, and Pat Hanrahan, Ray Tracing on Programmable Graphics Hardware, ACM Transactions on Graphics 21(3), pp: 703-712, 2002. Available online: