Implementing skeletal animation using matrix palette skinning

When working with games and simulation systems, virtual characters are often used to give a detailed depiction of scenarios. Such characters are typically represented using a combination of bones and skin. The vertices of the 3D model are assigned influence weights (called blend weights) that control how much a bone influences that vertex. Up to four bones can influence a vertex. The process whereby bone weights are assigned to the vertices of a 3D model is called skinning. Each bone stores its transformation. These stored sequences of transformations are applied to every frame and every bone in the model and in the end, we get an animated character on the screen. This representation of animation is called skeletal animation. There are several methods for skeletal animation. One popular method is matrix palette skinning, which is also known as linear blend skinning (LBS). This method will be implemented in this recipe.

Getting ready

The code for this recipe is contained in the Chapter8/MatrixPaletteSkinning directory. This recipe will be using the Implementing EZMesh model loading recipe from Chapter 5, Mesh Model Formats and Particle Systems and it will augment it with skeletal animation. The EZMesh format was developed by John Ratcliff and it is an easy-to-understand format for storing skeletal animation. Typical skeletal animation formats like COLLADA and FBX are needlessly complicated, where dozens of segments have to be parsed before the real content can be loaded. On the other hand, the EZMesh format stores all of the information in an XML-based format, which is easier to parse. It is the default skeletal animation format used in the NVIDIA PhysX sdk. More information about the EZMesh model format and loaders can be obtained from the references in the See also section of this recipe.

How to do it…

Let us start our recipe by following these simple steps:

Load the EZMesh model as we did in the Implementing EZMesh loader recipe from Chapter 5, Mesh Model Formats and Particle System. In addition to the model submeshes, vertices, normals, texture coordinates, and materials, we also load the skeleton information from the EZMesh file.
```
EzmLoader ezm;
if(!ezm.Load(mesh_filename.c_str(), skeleton, animations,  
             submeshes, vertices, indices, material2ImageMap,  
             min, max)) {
  cout<<"Cannot load the EZMesh file"<<endl;
  exit(EXIT_FAILURE);
} 
```

Get the MeshSystem object from the meshImportLibrary object. Then load the bone transformations contained in the EZMesh file using the MeshSystem::mSkeletons array. This is carried out in the EzmLoader::Load function. Also generate absolute bone transforms from the relative transforms. This is done so that the transform of the child bone is influenced by the transform of the parent bone. This is continued up the hierarchy until the root bone. If the mesh is modeled in a positive Z axis system, we need to modify the orientation, positions, and scale by swapping Y and Z axes and changing the sign of one of them. This is done because we are using a positive Y axis system in OpenGL; otherwise, our mesh will be lying in the XZ plane rather than the XY plane. We obtain a combined matrix from the position orientation and scale of the bone. This is stored in the xform field, which is the relative transform of the bone.

if(ms->mSkeletonCount>0) {
  NVSHARE::MeshSkeleton* pSkel = ms->mSkeletons[0];
  Bone b;
  for(int i=0;i<pSkel->GetBoneCount();i++) {
      const NVSHARE::MeshBone pBone = pSkel->mBones[i];
      const int s = strlen(pBone.mName);
      b.name = new char[s+1];
      memset(b.name, 0, sizeof(char)*(s+1));
      strncpy_s(b.name,sizeof(char)*(s+1), pBone.mName, s);
      b.orientation = glm::quat( 
      pBone.mOrientation[3],pBone.mOrientation[0],  
      pBone.mOrientation[1],pBone.mOrientation[2]);
      b.position = glm::vec3( pBone.mPosition[0], 
      pBone.mPosition[1],pBone.mPosition[2]);
      b.scale  = glm::vec3(pBone.mScale[0], pBone.mScale[1],   
      pBone.mScale[2]);

      if(!bYup) {
         float tmp = b.position.y;
         b.position.y = b.position.z;
         b.position.z = -tmp;
         tmp = b.orientation.y;
         b.orientation.y = b.orientation.z;
         b.orientation.z = -tmp;
         tmp = b.scale.y;
         b.scale.y = b.scale.z;
         
b.scale.z = -tmp;
      }      

      glm::mat4 S = glm::scale(glm::mat4(1), b.scale);
      glm::mat4 R = glm::toMat4(b.orientation);              
      glm::mat4 T = glm::translate(glm::mat4(1), b.position);

      b.xform = T*R*S;         
      b.parent = pBone.mParentIndex;   skeleton.push_back(b);
   }
   UpdateCombinedMatrices();
   bindPose.resize(skeleton.size()); 
   invBindPose.resize(skeleton.size());
   animatedXform.resize(skeleton.size());

Generate the bind pose and inverse bind pose arrays from the stored bone transformations:

   for(size_t i=0;i<skeleton.size();i++) {
      bindPose[i] = (skeleton[i].comb);              
      invBindPose[i] = glm::inverse(bindPose[i]);
   }}

Store the blend weights and blend indices of each vertex in the mesh:

 mesh.vertices[j].blendWeights.x = pMesh->mVertices[j].mWeight[0];
 mesh.vertices[j].blendWeights.y = pMesh->mVertices[j].mWeight[1];
 mesh.vertices[j].blendWeights.z = pMesh->mVertices[j].mWeight[2];
 mesh.vertices[j].blendWeights.w = pMesh->mVertices[j].mWeight[3];
 mesh.vertices[j].blendIndices[0] = pMesh->mVertices[j].mBone[0];
 mesh.vertices[j].blendIndices[1] = pMesh->mVertices[j].mBone[1];
 mesh.vertices[j].blendIndices[2] = pMesh->mVertices[j].mBone[2];
 mesh.vertices[j].blendIndices[3] = pMesh->mVertices[j].mBone[3];

In the idle callback function, calculate the amount of time to spend on the current frame. If the amount has elapsed, move to the next frame and reset the time. After this, calculate the new bone transformations as well as new skinning matrices, and pass them to the shader:

   QueryPerformanceCounter(&current);
   dt = (double)(current.QuadPart - last.QuadPart) /  
   (double)freq.QuadPart;
   last = current;
   static double t  = 0;
   t+=dt;
   NVSHARE::MeshAnimation* pAnim = &animations[0];
   float framesPerSecond = pAnim->GetFrameCount()/
                           pAnim->GetDuration();
   if( t > 1.0f/ framesPerSecond) {
      currentFrame++; 
      t=0;
   }  
   if(bLoop) {
      currentFrame = currentFrame%pAnim->mFrameCount;
   } else {
      currentFrame=max(-1,min(currentFrame,pAnim->mFrameCount-1));
   }
  if(currentFrame == -1) {
      for(size_t i=0;i<skeleton.size();i++) {
         skeleton[i].comb = bindPose[i];          
         animatedXform[i] = skeleton[i].comb*invBindPose[i];
      }
   } 
   else {
      for(int j=0;j<pAnim->mTrackCount;j++) {
         NVSHARE::MeshAnimTrack* pTrack = pAnim->mTracks[j]; 
         NVSHARE::MeshAnimPose* pPose = 
         pTrack->GetPose(currentFrame);
         skeleton[j].position.x = pPose->mPos[0];
         skeleton[j].position.y = pPose->mPos[1];
         skeleton[j].position.z = pPose->mPos[2];

         glm::quat q;
         q.x = pPose->mQuat[0];
         q.y = pPose->mQuat[1];
         q.z = pPose->mQuat[2];
         q.w = pPose->mQuat[3];

         skeleton[j].scale  = glm::vec3(pPose->mScale[0], 
                                        pPose->mScale[1], 
                                        pPose->mScale[2]);
         if(!bYup) {
            skeleton[j].position.y = pPose->mPos[2];
            skeleton[j].position.z = -pPose->mPos[1];
            q.y = pPose->mQuat[2];
            q.z = -pPose->mQuat[1];
            skeleton[j].scale.y = pPose->mScale[2];
            skeleton[j].scale.z = -pPose->mScale[1];
         }

         skeleton[j].orientation = q;

         glm::mat4 S =glm::scale(glm::mat4(1),skeleton[j].scale);
         glm::mat4 R = glm::toMat4(q);
         glm::mat4 T = glm::translate(glm::mat4(1), skeleton[j].position);

         skeleton[j].xform = T*R*S;
         Bone& b = skeleton[j];

         if(b.parent==-1)
            b.comb = b.xform;
         else
            b.comb = skeleton[b.parent].comb * b.xform;

         animatedXform[j] = b.comb * invBindPose[j] ;
      }
   }
   shader.Use();
      glUniformMatrix4fv(shader("Bones"),animatedXform.size(), 
      GL_FALSE,glm::value_ptr(animatedXform[0]));
   shader.UnUse();
   shader.UnUse();

How it works…

There are two parts of this recipe: generation of skinning matrices and the calculation of GPU skinning in the vertex shader. To understand the first step, we will start with the different transforms that will be used in skinning. Typically, in a simulation or game, a transform is represented as a 4×4 matrix. For skeletal animation, we have a collection of bones. Each bone has a local transform (also called relative transform), which tells how the bone is positioned and oriented with respect to its parent bone. If the bone's local transform is multiplied to the global transform of its parent, we get the global transform (also called absolute transform) of the bone. Typically, the animation formats store the local transforms of the bones in the file. The user application uses this information to generate the global transforms.

We define our bone structure as follows:

struct Bone {
   glm::quat orientation;
   glm::vec3 position;
   glm::mat4 xform, comb;
   glm::vec3 scale;
   char* name; 
   int parent;
};

The first field is orientation, which is a quaternion storing the orientation of bone in space relative to its parent. The position field stores its position relative to its parent. The xform field is the local (relative) transform, and the comb field is the global (absolute) transform. The scale field contains the scaling transformation of the bone. In the big picture, the scale field gives the scaling matrix (S), the orientation field gives the rotation matrix (R), and the position field gives the translation matrix (T). The combined matrix T*R*S gives us the relative transform that is calculated when we load the bone information from the EZMesh file in the second step.

The name field is the unique name of the bone in the skeleton. Finally, the parent field stores the index of the parent of the current bone in the skeleton array. For the root bone, the parent is -1. For all of the other bones, it will be a number starting from 0 to N-1, where N is the total number of bones in the skeleton.

After we have loaded and stored the relative transforms of each bone in the skeleton, we iterate through each bone to obtain its absolute transform. This is carried out in the UpdateCombinedMatrices function in Chapter8/MatrixPaletteSkinning/main.cpp.

for(size_t i=0;i<skeleton.size();i++) {
   Bone& b = skeleton[i];
   if(b.parent==-1)
      b.comb = b.xform;
   else
      b.comb = skeleton[b.parent].comb * b.xform;
}

After generating the absolute transforms of each bone, we store the bind pose and inverse bind pose matrices of the skeleton. Simply put, bind pose is the absolute transforms of the bones in the non-animated state (that is, when no animation is applied). This is usually when the skeleton is attached (skinned) to the geometry. In other words, it is the default pose of the skeletal animated mesh. Typically, bones can be in any bind pose (usually, for humanoid characters, the character may be in A pose, T pose, and so on based on the convention used). Typically, the inverse bind pose is stored at the time of initialization. So, continuing to the previous skeleton example, we can get the bind pose and inverse bind pose matrices as follows:

for(size_t i=0;i < skeleton.size(); i++) {
   bindPose[i] = skeleton[i].comb;                       
   invBindPose[i] = glm::inverse(bindPose[i]);
}

Note that we do this once at initialization, so that we do not have to calculate the bind pose's inverse every frame, as it is required during animation.

When we apply any new transformation (an animation sequence for example) to the skeleton, we have to first undo the bind pose transformation. This is done by multiplying the animated transformation with the inverse of the bind pose transformation. This is required because the given relative transformations will add to the existing transformations of the bone and, if the bind pose transformation is not undone, the animation output will be wrong.

The final matrix that we get from this process is called the skinning matrix (also called the final bone matrix). Continuing from the example given in the previous paragraph, let's say we have modified the relative transforms of bone using the animation sequence. We can then generate the skinning matrix as follows:

for(size_t i=0;i < skeleton.size(); i++) {
   Bone& b = skeleton[i];
   if(b.parent==-1)
      b.comb = b.xform;
   else
      b.comb = skeleton[b.parent].comb * b.xform;
   animatedXForm[i] = b.comb*invBindPose[i];
}

One thing to note here is the order of the different matrices. As you can see, we right multiply the inverse bind pose matrix with the combined bone transform. We put it this way because OpenGL and glm matrices work right to left. So the inverse of bind pose matrix will be multiplied by the given vertex first. Then it will be multiplied with the local transform (xform) of the current bone. Finally, it will be multiplied with the global transform (comb) of its parent to get the final transformation matrix.

After we have calculated the skinning matrices, we pass these to the GPU in a single call:

shader.Use();
   glUniformMatrix4fv(shader("Bones"), animatedXForm.size(), 
   GL_FALSE, glm::value_ptr(animatedXForm[0]));       
shader.UnUse();

To make sure that the size of the bones array is correct in the vertex shader, we append text to the shader dynamically by using the overloaded GLSLShader::LoadFromFile function.

stringstream str( ios_base::app | ios_base::out);
str<<"\nconst int NUM_BONES="<<skeleton.size()<<";"<<endl;
str<<"uniform mat4 Bones[NUM_BONES];"<<endl;
shader.LoadFromFile(GL_VERTEX_SHADER, "shaders/shader.vert", str.str());
shader.LoadFromFile(GL_FRAGMENT_SHADER, "shaders/shader.frag");

This ensures that our vertex shader has the same number of bones as our mesh file.

Tip

For this simple recipe we have modified the shader code at loading time before compilation. Such shader modification must not be done at runtime as this will require a recompilation of the shader and will likely hamper application performance.

The Vertex struct storing all of our per-vertex attributes is defined as follows:

struct Vertex  {  
  glm::vec3 pos, 
        normal;
  glm::vec2 uv;
  glm::vec4 blendWeights;
  glm::ivec4 blendIndices;
};

The vertices array is filled in by the EzmLoader::Load function. We generate a vertex array object with a vertex buffer object to store our interleaved per-vertex attributes:

glGenVertexArrays(1, &vaoID);
glGenBuffers(1, &vboVerticesID);
glGenBuffers(1, &vboIndicesID);

glBindVertexArray(vaoID);
glBindBuffer (GL_ARRAY_BUFFER, vboVerticesID);
glBufferData (GL_ARRAY_BUFFER, sizeof(Vertex)*vertices.size(), &(vertices[0].pos.x), GL_DYNAMIC_DRAW);

glEnableVertexAttribArray(shader["vVertex"]);
glVertexAttribPointer(shader["vVertex"], 3, GL_FLOAT, GL_FALSE,sizeof(Vertex),0);

glEnableVertexAttribArray(shader["vNormal"]);
glVertexAttribPointer(shader["vNormal"], 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, normal)) );

glEnableVertexAttribArray(shader["vUV"]);
glVertexAttribPointer(shader["vUV"], 2, GL_FLOAT, GL_FALSE, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, uv)) );

glEnableVertexAttribArray(shader["vBlendWeights"]);
glVertexAttribPointer(shader["vBlendWeights"], 4, GL_FLOAT, 
GL_FALSE, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, blendWeights)) );

glEnableVertexAttribArray(shader["viBlendIndices"]);
glVertexAttribIPointer(shader["viBlendIndices"], 4, GL_INT, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, blendIndices)) );

Tip

Note that for blend indices we use the glVertexAttribIPointer function, as the attribute (viBlendIndices) is defined as ivec4 in the vertex shader.

Finally, in the rendering code, we set the vertex array object and use the shader program. Then we iterate through all of the submeshes. We then set the material texture for the current submesh and set the shader uniforms. Finally, we issue a glDrawElements call:

glBindVertexArray(vaoID); {
shader.Use();
glUniformMatrix4fv(shader("MV"), 1, GL_FALSE, glm::value_ptr(MV));
glUniformMatrix3fv(shader("N"), 1, GL_FALSE, glm::value_ptr(glm::inverseTranspose(glm::mat3(MV))));
glUniformMatrix4fv(shader("P"), 1, GL_FALSE, glm::value_ptr(P));
glUniform3fv(shader("light_position"),1, &(lightPosOS.x));
   for(size_t i=0;i<submeshes.size();i++) {
   if(strlen(submeshes[i].materialName)>0) {
        GLuint id = materialMap[ material2ImageMap[ 
                  submeshes[i].materialName]];
    GLint whichID[1];
    glGetIntegerv(GL_TEXTURE_BINDING_2D, whichID);
    if(whichID[0] != id)
      glBindTexture(GL_TEXTURE_2D, id);
    glUniform1f(shader("useDefault"), 0.0);
   } else {
    glUniform1f(shader("useDefault"), 1.0);
   }
    glDrawElements(GL_TRIANGLES, submeshes[i].indices.size(), 
                   GL_UNSIGNED_INT, &submeshes[i].indices[0]);
  }//end for
shader.UnUse();
}

The matrix palette skinning is carried out on the GPU using the vertex shader (Chapter8/MatrixPaletteSkinning/shaders/shader.vert). We simply use the blend indices and blend weights to calculate the correct vertex position and normal based on the combined influence of all of the effecting bones. The Bones array contains the skinning matrices that we generated earlier. The complete vertex shader is as follows:

Tip

Note that the Bones uniform array is not declared in the shader, as it is filled in the shader code dynamically as was shown earlier.

#version 330 core
layout(location = 0) in vec3 vVertex;
layout(location = 1) in vec3 vNormal;
layout(location = 2) in vec2 vUV;
layout(location = 3) in vec4 vBlendWeights;
layout(location = 4) in ivec4 viBlendIndices;
smooth out vec2 vUVout;
uniform mat4 P; 
uniform mat4 MV;
uniform mat3 N;
smooth out vec3 vEyeSpaceNormal;    
smooth out vec3 vEyeSpacePosition;
void main() {
  vec4 blendVertex=vec4(0);
  vec3 blendNormal=vec3(0);
  vec4 vVertex4 = vec4(vVertex,1);
   
   int index = viBlendIndices.x;
   blendVertex = (Bones[index] * vVertex4) *  vBlendWeights.x;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz *    
                 vBlendWeights.x;
   
   index = viBlendIndices.y;        
   blendVertex = ((Bones[index] * vVertex4) * vBlendWeights.y) + 
                 blendVertex;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz * 
                 vBlendWeights.y  + blendNormal;
   
      index = viBlendIndices.z;
      blendVertex = ((Bones[index] * vVertex4) *  vBlendWeights.z)  
                    + blendVertex;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz *  
                 vBlendWeights.z  + blendNormal;
   
   index = viBlendIndices.w;        
   blendVertex = ((Bones[index] * vVertex4) *  vBlendWeights.w)   
                 + blendVertex;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz *  
                 vBlendWeights.w  + blendNormal;

      vEyeSpacePosition = (MV*blendVertex).xyz; 
      vEyeSpaceNormal   = normalize(N*blendNormal);  
      vUVout=vUV; 
   gl_Position = P*vec4(vEyeSpacePosition,1);
}

The fragment shader uses the attenuated point light source for illumination as we have seen in the Implementing per-fragment point light with attenuation recipe in Chapter 4, Lights and Shadows.

There's more…

The output from the demo application for this recipe shows the dude.ezm model animating using the matrix palette skinning technique as shown in the following figure. The light source can be rotated by right-clicking on it and dragging. Pressing the l key stops the loop playback.

How to do it…

Let us start our recipe by following these simple steps:

Load the EZMesh model as we did in the Implementing EZMesh loader recipe from Chapter 5, Mesh Model Formats and Particle System. In addition to the model submeshes, vertices, normals, texture coordinates, and materials, we also load the skeleton information from the EZMesh file.
```
EzmLoader ezm;
if(!ezm.Load(mesh_filename.c_str(), skeleton, animations,  
             submeshes, vertices, indices, material2ImageMap,  
             min, max)) {
  cout<<"Cannot load the EZMesh file"<<endl;
  exit(EXIT_FAILURE);
} 
```

Get the MeshSystem object from the meshImportLibrary object. Then load the bone transformations contained in the EZMesh file using the MeshSystem::mSkeletons array. This is carried out in the EzmLoader::Load function. Also generate absolute bone transforms from the relative transforms. This is done so that the transform of the child bone is influenced by the transform of the parent bone. This is continued up the hierarchy until the root bone. If the mesh is modeled in a positive Z axis system, we need to modify the orientation, positions, and scale by swapping Y and Z axes and changing the sign of one of them. This is done because we are using a positive Y axis system in OpenGL; otherwise, our mesh will be lying in the XZ plane rather than the XY plane. We obtain a combined matrix from the position orientation and scale of the bone. This is stored in the xform field, which is the relative transform of the bone.

if(ms->mSkeletonCount>0) {
  NVSHARE::MeshSkeleton* pSkel = ms->mSkeletons[0];
  Bone b;
  for(int i=0;i<pSkel->GetBoneCount();i++) {
      const NVSHARE::MeshBone pBone = pSkel->mBones[i];
      const int s = strlen(pBone.mName);
      b.name = new char[s+1];
      memset(b.name, 0, sizeof(char)*(s+1));
      strncpy_s(b.name,sizeof(char)*(s+1), pBone.mName, s);
      b.orientation = glm::quat( 
      pBone.mOrientation[3],pBone.mOrientation[0],  
      pBone.mOrientation[1],pBone.mOrientation[2]);
      b.position = glm::vec3( pBone.mPosition[0], 
      pBone.mPosition[1],pBone.mPosition[2]);
      b.scale  = glm::vec3(pBone.mScale[0], pBone.mScale[1],   
      pBone.mScale[2]);

      if(!bYup) {
         float tmp = b.position.y;
         b.position.y = b.position.z;
         b.position.z = -tmp;
         tmp = b.orientation.y;
         b.orientation.y = b.orientation.z;
         b.orientation.z = -tmp;
         tmp = b.scale.y;
         b.scale.y = b.scale.z;
         
b.scale.z = -tmp;
      }      

      glm::mat4 S = glm::scale(glm::mat4(1), b.scale);
      glm::mat4 R = glm::toMat4(b.orientation);              
      glm::mat4 T = glm::translate(glm::mat4(1), b.position);

      b.xform = T*R*S;         
      b.parent = pBone.mParentIndex;   skeleton.push_back(b);
   }
   UpdateCombinedMatrices();
   bindPose.resize(skeleton.size()); 
   invBindPose.resize(skeleton.size());
   animatedXform.resize(skeleton.size());

Generate the bind pose and inverse bind pose arrays from the stored bone transformations:

   for(size_t i=0;i<skeleton.size();i++) {
      bindPose[i] = (skeleton[i].comb);              
      invBindPose[i] = glm::inverse(bindPose[i]);
   }}

Store the blend weights and blend indices of each vertex in the mesh:

 mesh.vertices[j].blendWeights.x = pMesh->mVertices[j].mWeight[0];
 mesh.vertices[j].blendWeights.y = pMesh->mVertices[j].mWeight[1];
 mesh.vertices[j].blendWeights.z = pMesh->mVertices[j].mWeight[2];
 mesh.vertices[j].blendWeights.w = pMesh->mVertices[j].mWeight[3];
 mesh.vertices[j].blendIndices[0] = pMesh->mVertices[j].mBone[0];
 mesh.vertices[j].blendIndices[1] = pMesh->mVertices[j].mBone[1];
 mesh.vertices[j].blendIndices[2] = pMesh->mVertices[j].mBone[2];
 mesh.vertices[j].blendIndices[3] = pMesh->mVertices[j].mBone[3];

In the idle callback function, calculate the amount of time to spend on the current frame. If the amount has elapsed, move to the next frame and reset the time. After this, calculate the new bone transformations as well as new skinning matrices, and pass them to the shader:

   QueryPerformanceCounter(&current);
   dt = (double)(current.QuadPart - last.QuadPart) /  
   (double)freq.QuadPart;
   last = current;
   static double t  = 0;
   t+=dt;
   NVSHARE::MeshAnimation* pAnim = &animations[0];
   float framesPerSecond = pAnim->GetFrameCount()/
                           pAnim->GetDuration();
   if( t > 1.0f/ framesPerSecond) {
      currentFrame++; 
      t=0;
   }  
   if(bLoop) {
      currentFrame = currentFrame%pAnim->mFrameCount;
   } else {
      currentFrame=max(-1,min(currentFrame,pAnim->mFrameCount-1));
   }
  if(currentFrame == -1) {
      for(size_t i=0;i<skeleton.size();i++) {
         skeleton[i].comb = bindPose[i];          
         animatedXform[i] = skeleton[i].comb*invBindPose[i];
      }
   } 
   else {
      for(int j=0;j<pAnim->mTrackCount;j++) {
         NVSHARE::MeshAnimTrack* pTrack = pAnim->mTracks[j]; 
         NVSHARE::MeshAnimPose* pPose = 
         pTrack->GetPose(currentFrame);
         skeleton[j].position.x = pPose->mPos[0];
         skeleton[j].position.y = pPose->mPos[1];
         skeleton[j].position.z = pPose->mPos[2];

         glm::quat q;
         q.x = pPose->mQuat[0];
         q.y = pPose->mQuat[1];
         q.z = pPose->mQuat[2];
         q.w = pPose->mQuat[3];

         skeleton[j].scale  = glm::vec3(pPose->mScale[0], 
                                        pPose->mScale[1], 
                                        pPose->mScale[2]);
         if(!bYup) {
            skeleton[j].position.y = pPose->mPos[2];
            skeleton[j].position.z = -pPose->mPos[1];
            q.y = pPose->mQuat[2];
            q.z = -pPose->mQuat[1];
            skeleton[j].scale.y = pPose->mScale[2];
            skeleton[j].scale.z = -pPose->mScale[1];
         }

         skeleton[j].orientation = q;

         glm::mat4 S =glm::scale(glm::mat4(1),skeleton[j].scale);
         glm::mat4 R = glm::toMat4(q);
         glm::mat4 T = glm::translate(glm::mat4(1), skeleton[j].position);

         skeleton[j].xform = T*R*S;
         Bone& b = skeleton[j];

         if(b.parent==-1)
            b.comb = b.xform;
         else
            b.comb = skeleton[b.parent].comb * b.xform;

         animatedXform[j] = b.comb * invBindPose[j] ;
      }
   }
   shader.Use();
      glUniformMatrix4fv(shader("Bones"),animatedXform.size(), 
      GL_FALSE,glm::value_ptr(animatedXform[0]));
   shader.UnUse();
   shader.UnUse();

How it works…

There are two parts of this recipe: generation of skinning matrices and the calculation of GPU skinning in the vertex shader. To understand the first step, we will start with the different transforms that will be used in skinning. Typically, in a simulation or game, a transform is represented as a 4×4 matrix. For skeletal animation, we have a collection of bones. Each bone has a local transform (also called relative transform), which tells how the bone is positioned and oriented with respect to its parent bone. If the bone's local transform is multiplied to the global transform of its parent, we get the global transform (also called absolute transform) of the bone. Typically, the animation formats store the local transforms of the bones in the file. The user application uses this information to generate the global transforms.

We define our bone structure as follows:

struct Bone {
   glm::quat orientation;
   glm::vec3 position;
   glm::mat4 xform, comb;
   glm::vec3 scale;
   char* name; 
   int parent;
};

The first field is orientation, which is a quaternion storing the orientation of bone in space relative to its parent. The position field stores its position relative to its parent. The xform field is the local (relative) transform, and the comb field is the global (absolute) transform. The scale field contains the scaling transformation of the bone. In the big picture, the scale field gives the scaling matrix (S), the orientation field gives the rotation matrix (R), and the position field gives the translation matrix (T). The combined matrix T*R*S gives us the relative transform that is calculated when we load the bone information from the EZMesh file in the second step.

The name field is the unique name of the bone in the skeleton. Finally, the parent field stores the index of the parent of the current bone in the skeleton array. For the root bone, the parent is -1. For all of the other bones, it will be a number starting from 0 to N-1, where N is the total number of bones in the skeleton.

After we have loaded and stored the relative transforms of each bone in the skeleton, we iterate through each bone to obtain its absolute transform. This is carried out in the UpdateCombinedMatrices function in Chapter8/MatrixPaletteSkinning/main.cpp.

for(size_t i=0;i<skeleton.size();i++) {
   Bone& b = skeleton[i];
   if(b.parent==-1)
      b.comb = b.xform;
   else
      b.comb = skeleton[b.parent].comb * b.xform;
}

After generating the absolute transforms of each bone, we store the bind pose and inverse bind pose matrices of the skeleton. Simply put, bind pose is the absolute transforms of the bones in the non-animated state (that is, when no animation is applied). This is usually when the skeleton is attached (skinned) to the geometry. In other words, it is the default pose of the skeletal animated mesh. Typically, bones can be in any bind pose (usually, for humanoid characters, the character may be in A pose, T pose, and so on based on the convention used). Typically, the inverse bind pose is stored at the time of initialization. So, continuing to the previous skeleton example, we can get the bind pose and inverse bind pose matrices as follows:

for(size_t i=0;i < skeleton.size(); i++) {
   bindPose[i] = skeleton[i].comb;                       
   invBindPose[i] = glm::inverse(bindPose[i]);
}

Note that we do this once at initialization, so that we do not have to calculate the bind pose's inverse every frame, as it is required during animation.

When we apply any new transformation (an animation sequence for example) to the skeleton, we have to first undo the bind pose transformation. This is done by multiplying the animated transformation with the inverse of the bind pose transformation. This is required because the given relative transformations will add to the existing transformations of the bone and, if the bind pose transformation is not undone, the animation output will be wrong.

The final matrix that we get from this process is called the skinning matrix (also called the final bone matrix). Continuing from the example given in the previous paragraph, let's say we have modified the relative transforms of bone using the animation sequence. We can then generate the skinning matrix as follows:

for(size_t i=0;i < skeleton.size(); i++) {
   Bone& b = skeleton[i];
   if(b.parent==-1)
      b.comb = b.xform;
   else
      b.comb = skeleton[b.parent].comb * b.xform;
   animatedXForm[i] = b.comb*invBindPose[i];
}

One thing to note here is the order of the different matrices. As you can see, we right multiply the inverse bind pose matrix with the combined bone transform. We put it this way because OpenGL and glm matrices work right to left. So the inverse of bind pose matrix will be multiplied by the given vertex first. Then it will be multiplied with the local transform (xform) of the current bone. Finally, it will be multiplied with the global transform (comb) of its parent to get the final transformation matrix.

After we have calculated the skinning matrices, we pass these to the GPU in a single call:

shader.Use();
   glUniformMatrix4fv(shader("Bones"), animatedXForm.size(), 
   GL_FALSE, glm::value_ptr(animatedXForm[0]));       
shader.UnUse();

To make sure that the size of the bones array is correct in the vertex shader, we append text to the shader dynamically by using the overloaded GLSLShader::LoadFromFile function.

stringstream str( ios_base::app | ios_base::out);
str<<"\nconst int NUM_BONES="<<skeleton.size()<<";"<<endl;
str<<"uniform mat4 Bones[NUM_BONES];"<<endl;
shader.LoadFromFile(GL_VERTEX_SHADER, "shaders/shader.vert", str.str());
shader.LoadFromFile(GL_FRAGMENT_SHADER, "shaders/shader.frag");

This ensures that our vertex shader has the same number of bones as our mesh file.

Tip

For this simple recipe we have modified the shader code at loading time before compilation. Such shader modification must not be done at runtime as this will require a recompilation of the shader and will likely hamper application performance.

The Vertex struct storing all of our per-vertex attributes is defined as follows:

struct Vertex  {  
  glm::vec3 pos, 
        normal;
  glm::vec2 uv;
  glm::vec4 blendWeights;
  glm::ivec4 blendIndices;
};

The vertices array is filled in by the EzmLoader::Load function. We generate a vertex array object with a vertex buffer object to store our interleaved per-vertex attributes:

glGenVertexArrays(1, &vaoID);
glGenBuffers(1, &vboVerticesID);
glGenBuffers(1, &vboIndicesID);

glBindVertexArray(vaoID);
glBindBuffer (GL_ARRAY_BUFFER, vboVerticesID);
glBufferData (GL_ARRAY_BUFFER, sizeof(Vertex)*vertices.size(), &(vertices[0].pos.x), GL_DYNAMIC_DRAW);

glEnableVertexAttribArray(shader["vVertex"]);
glVertexAttribPointer(shader["vVertex"], 3, GL_FLOAT, GL_FALSE,sizeof(Vertex),0);

glEnableVertexAttribArray(shader["vNormal"]);
glVertexAttribPointer(shader["vNormal"], 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, normal)) );

glEnableVertexAttribArray(shader["vUV"]);
glVertexAttribPointer(shader["vUV"], 2, GL_FLOAT, GL_FALSE, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, uv)) );

glEnableVertexAttribArray(shader["vBlendWeights"]);
glVertexAttribPointer(shader["vBlendWeights"], 4, GL_FLOAT, 
GL_FALSE, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, blendWeights)) );

glEnableVertexAttribArray(shader["viBlendIndices"]);
glVertexAttribIPointer(shader["viBlendIndices"], 4, GL_INT, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, blendIndices)) );

Tip

Note that for blend indices we use the glVertexAttribIPointer function, as the attribute (viBlendIndices) is defined as ivec4 in the vertex shader.

Finally, in the rendering code, we set the vertex array object and use the shader program. Then we iterate through all of the submeshes. We then set the material texture for the current submesh and set the shader uniforms. Finally, we issue a glDrawElements call:

glBindVertexArray(vaoID); {
shader.Use();
glUniformMatrix4fv(shader("MV"), 1, GL_FALSE, glm::value_ptr(MV));
glUniformMatrix3fv(shader("N"), 1, GL_FALSE, glm::value_ptr(glm::inverseTranspose(glm::mat3(MV))));
glUniformMatrix4fv(shader("P"), 1, GL_FALSE, glm::value_ptr(P));
glUniform3fv(shader("light_position"),1, &(lightPosOS.x));
   for(size_t i=0;i<submeshes.size();i++) {
   if(strlen(submeshes[i].materialName)>0) {
        GLuint id = materialMap[ material2ImageMap[ 
                  submeshes[i].materialName]];
    GLint whichID[1];
    glGetIntegerv(GL_TEXTURE_BINDING_2D, whichID);
    if(whichID[0] != id)
      glBindTexture(GL_TEXTURE_2D, id);
    glUniform1f(shader("useDefault"), 0.0);
   } else {
    glUniform1f(shader("useDefault"), 1.0);
   }
    glDrawElements(GL_TRIANGLES, submeshes[i].indices.size(), 
                   GL_UNSIGNED_INT, &submeshes[i].indices[0]);
  }//end for
shader.UnUse();
}

The matrix palette skinning is carried out on the GPU using the vertex shader (Chapter8/MatrixPaletteSkinning/shaders/shader.vert). We simply use the blend indices and blend weights to calculate the correct vertex position and normal based on the combined influence of all of the effecting bones. The Bones array contains the skinning matrices that we generated earlier. The complete vertex shader is as follows:

Tip

Note that the Bones uniform array is not declared in the shader, as it is filled in the shader code dynamically as was shown earlier.

#version 330 core
layout(location = 0) in vec3 vVertex;
layout(location = 1) in vec3 vNormal;
layout(location = 2) in vec2 vUV;
layout(location = 3) in vec4 vBlendWeights;
layout(location = 4) in ivec4 viBlendIndices;
smooth out vec2 vUVout;
uniform mat4 P; 
uniform mat4 MV;
uniform mat3 N;
smooth out vec3 vEyeSpaceNormal;    
smooth out vec3 vEyeSpacePosition;
void main() {
  vec4 blendVertex=vec4(0);
  vec3 blendNormal=vec3(0);
  vec4 vVertex4 = vec4(vVertex,1);
   
   int index = viBlendIndices.x;
   blendVertex = (Bones[index] * vVertex4) *  vBlendWeights.x;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz *    
                 vBlendWeights.x;
   
   index = viBlendIndices.y;        
   blendVertex = ((Bones[index] * vVertex4) * vBlendWeights.y) + 
                 blendVertex;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz * 
                 vBlendWeights.y  + blendNormal;
   
      index = viBlendIndices.z;
      blendVertex = ((Bones[index] * vVertex4) *  vBlendWeights.z)  
                    + blendVertex;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz *  
                 vBlendWeights.z  + blendNormal;
   
   index = viBlendIndices.w;        
   blendVertex = ((Bones[index] * vVertex4) *  vBlendWeights.w)   
                 + blendVertex;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz *  
                 vBlendWeights.w  + blendNormal;

      vEyeSpacePosition = (MV*blendVertex).xyz; 
      vEyeSpaceNormal   = normalize(N*blendNormal);  
      vUVout=vUV; 
   gl_Position = P*vec4(vEyeSpacePosition,1);
}

The fragment shader uses the attenuated point light source for illumination as we have seen in the Implementing per-fragment point light with attenuation recipe in Chapter 4, Lights and Shadows.

There's more…

The output from the demo application for this recipe shows the dude.ezm model animating using the matrix palette skinning technique as shown in the following figure. The light source can be rotated by right-clicking on it and dragging. Pressing the l key stops the loop playback.

How it works…

There are two parts of this recipe: generation of skinning matrices and the calculation of GPU skinning in the vertex shader. To understand the first step, we will start with the different transforms that will be used in skinning. Typically, in a simulation or game, a transform is represented as a 4×4 matrix. For skeletal animation, we have a collection of bones. Each bone has a local transform (also called relative transform), which tells how the bone is positioned and oriented with respect to its parent bone. If the bone's local transform is multiplied to the global transform of its parent, we get the global transform (also called absolute transform) of the bone. Typically, the animation formats store the local transforms of the bones in the file. The user application uses this information to generate the global transforms.

We define our bone structure as follows:

struct Bone {
   glm::quat orientation;
   glm::vec3 position;
   glm::mat4 xform, comb;
   glm::vec3 scale;
   char* name; 
   int parent;
};

The first field is orientation, which is a quaternion storing the orientation of bone in space relative to its parent. The position field stores its position relative to its parent. The xform field is the local (relative) transform, and the comb field is the global (absolute) transform. The scale field contains the scaling transformation of the bone. In the big picture, the scale field gives the scaling matrix (S), the orientation field gives the rotation matrix (R), and the position field gives the translation matrix (T). The combined matrix T*R*S gives us the relative transform that is calculated when we load the bone information from the EZMesh file in the second step.

The name field is the unique name of the bone in the skeleton. Finally, the parent field stores the index of the parent of the current bone in the skeleton array. For the root bone, the parent is -1. For all of the other bones, it will be a number starting from 0 to N-1, where N is the total number of bones in the skeleton.

After we have loaded and stored the relative transforms of each bone in the skeleton, we iterate through each bone to obtain its absolute transform. This is carried out in the UpdateCombinedMatrices function in Chapter8/MatrixPaletteSkinning/main.cpp.

for(size_t i=0;i<skeleton.size();i++) {
   Bone& b = skeleton[i];
   if(b.parent==-1)
      b.comb = b.xform;
   else
      b.comb = skeleton[b.parent].comb * b.xform;
}

After generating the absolute transforms of each bone, we store the bind pose and inverse bind pose matrices of the skeleton. Simply put, bind pose is the absolute transforms of the bones in the non-animated state (that is, when no animation is applied). This is usually when the skeleton is attached (skinned) to the geometry. In other words, it is the default pose of the skeletal animated mesh. Typically, bones can be in any bind pose (usually, for humanoid characters, the character may be in A pose, T pose, and so on based on the convention used). Typically, the inverse bind pose is stored at the time of initialization. So, continuing to the previous skeleton example, we can get the bind pose and inverse bind pose matrices as follows:

for(size_t i=0;i < skeleton.size(); i++) {
   bindPose[i] = skeleton[i].comb;                       
   invBindPose[i] = glm::inverse(bindPose[i]);
}

Note that we do this once at initialization, so that we do not have to calculate the bind pose's inverse every frame, as it is required during animation.

When we apply any new transformation (an animation sequence for example) to the skeleton, we have to first undo the bind pose transformation. This is done by multiplying the animated transformation with the inverse of the bind pose transformation. This is required because the given relative transformations will add to the existing transformations of the bone and, if the bind pose transformation is not undone, the animation output will be wrong.

The final matrix that we get from this process is called the skinning matrix (also called the final bone matrix). Continuing from the example given in the previous paragraph, let's say we have modified the relative transforms of bone using the animation sequence. We can then generate the skinning matrix as follows:

for(size_t i=0;i < skeleton.size(); i++) {
   Bone& b = skeleton[i];
   if(b.parent==-1)
      b.comb = b.xform;
   else
      b.comb = skeleton[b.parent].comb * b.xform;
   animatedXForm[i] = b.comb*invBindPose[i];
}

One thing to note here is the order of the different matrices. As you can see, we right multiply the inverse bind pose matrix with the combined bone transform. We put it this way because OpenGL and glm matrices work right to left. So the inverse of bind pose matrix will be multiplied by the given vertex first. Then it will be multiplied with the local transform (xform) of the current bone. Finally, it will be multiplied with the global transform (comb) of its parent to get the final transformation matrix.

After we have calculated the skinning matrices, we pass these to the GPU in a single call:

shader.Use();
   glUniformMatrix4fv(shader("Bones"), animatedXForm.size(), 
   GL_FALSE, glm::value_ptr(animatedXForm[0]));       
shader.UnUse();

To make sure that the size of the bones array is correct in the vertex shader, we append text to the shader dynamically by using the overloaded GLSLShader::LoadFromFile function.

stringstream str( ios_base::app | ios_base::out);
str<<"\nconst int NUM_BONES="<<skeleton.size()<<";"<<endl;
str<<"uniform mat4 Bones[NUM_BONES];"<<endl;
shader.LoadFromFile(GL_VERTEX_SHADER, "shaders/shader.vert", str.str());
shader.LoadFromFile(GL_FRAGMENT_SHADER, "shaders/shader.frag");

This ensures that our vertex shader has the same number of bones as our mesh file.

Tip

For this simple recipe we have modified the shader code at loading time before compilation. Such shader modification must not be done at runtime as this will require a recompilation of the shader and will likely hamper application performance.

The Vertex struct storing all of our per-vertex attributes is defined as follows:

struct Vertex  {  
  glm::vec3 pos, 
        normal;
  glm::vec2 uv;
  glm::vec4 blendWeights;
  glm::ivec4 blendIndices;
};

The vertices array is filled in by the EzmLoader::Load function. We generate a vertex array object with a vertex buffer object to store our interleaved per-vertex attributes:

glGenVertexArrays(1, &vaoID);
glGenBuffers(1, &vboVerticesID);
glGenBuffers(1, &vboIndicesID);

glBindVertexArray(vaoID);
glBindBuffer (GL_ARRAY_BUFFER, vboVerticesID);
glBufferData (GL_ARRAY_BUFFER, sizeof(Vertex)*vertices.size(), &(vertices[0].pos.x), GL_DYNAMIC_DRAW);

glEnableVertexAttribArray(shader["vVertex"]);
glVertexAttribPointer(shader["vVertex"], 3, GL_FLOAT, GL_FALSE,sizeof(Vertex),0);

glEnableVertexAttribArray(shader["vNormal"]);
glVertexAttribPointer(shader["vNormal"], 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, normal)) );

glEnableVertexAttribArray(shader["vUV"]);
glVertexAttribPointer(shader["vUV"], 2, GL_FLOAT, GL_FALSE, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, uv)) );

glEnableVertexAttribArray(shader["vBlendWeights"]);
glVertexAttribPointer(shader["vBlendWeights"], 4, GL_FLOAT, 
GL_FALSE, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, blendWeights)) );

glEnableVertexAttribArray(shader["viBlendIndices"]);
glVertexAttribIPointer(shader["viBlendIndices"], 4, GL_INT, sizeof(Vertex), (const GLvoid*)(offsetof(Vertex, blendIndices)) );

Tip

Note that for blend indices we use the glVertexAttribIPointer function, as the attribute (viBlendIndices) is defined as ivec4 in the vertex shader.

Finally, in the rendering code, we set the vertex array object and use the shader program. Then we iterate through all of the submeshes. We then set the material texture for the current submesh and set the shader uniforms. Finally, we issue a glDrawElements call:

glBindVertexArray(vaoID); {
shader.Use();
glUniformMatrix4fv(shader("MV"), 1, GL_FALSE, glm::value_ptr(MV));
glUniformMatrix3fv(shader("N"), 1, GL_FALSE, glm::value_ptr(glm::inverseTranspose(glm::mat3(MV))));
glUniformMatrix4fv(shader("P"), 1, GL_FALSE, glm::value_ptr(P));
glUniform3fv(shader("light_position"),1, &(lightPosOS.x));
   for(size_t i=0;i<submeshes.size();i++) {
   if(strlen(submeshes[i].materialName)>0) {
        GLuint id = materialMap[ material2ImageMap[ 
                  submeshes[i].materialName]];
    GLint whichID[1];
    glGetIntegerv(GL_TEXTURE_BINDING_2D, whichID);
    if(whichID[0] != id)
      glBindTexture(GL_TEXTURE_2D, id);
    glUniform1f(shader("useDefault"), 0.0);
   } else {
    glUniform1f(shader("useDefault"), 1.0);
   }
    glDrawElements(GL_TRIANGLES, submeshes[i].indices.size(), 
                   GL_UNSIGNED_INT, &submeshes[i].indices[0]);
  }//end for
shader.UnUse();
}

The matrix palette skinning is carried out on the GPU using the vertex shader (Chapter8/MatrixPaletteSkinning/shaders/shader.vert). We simply use the blend indices and blend weights to calculate the correct vertex position and normal based on the combined influence of all of the effecting bones. The Bones array contains the skinning matrices that we generated earlier. The complete vertex shader is as follows:

Tip

Note that the Bones uniform array is not declared in the shader, as it is filled in the shader code dynamically as was shown earlier.

#version 330 core
layout(location = 0) in vec3 vVertex;
layout(location = 1) in vec3 vNormal;
layout(location = 2) in vec2 vUV;
layout(location = 3) in vec4 vBlendWeights;
layout(location = 4) in ivec4 viBlendIndices;
smooth out vec2 vUVout;
uniform mat4 P; 
uniform mat4 MV;
uniform mat3 N;
smooth out vec3 vEyeSpaceNormal;    
smooth out vec3 vEyeSpacePosition;
void main() {
  vec4 blendVertex=vec4(0);
  vec3 blendNormal=vec3(0);
  vec4 vVertex4 = vec4(vVertex,1);
   
   int index = viBlendIndices.x;
   blendVertex = (Bones[index] * vVertex4) *  vBlendWeights.x;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz *    
                 vBlendWeights.x;
   
   index = viBlendIndices.y;        
   blendVertex = ((Bones[index] * vVertex4) * vBlendWeights.y) + 
                 blendVertex;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz * 
                 vBlendWeights.y  + blendNormal;
   
      index = viBlendIndices.z;
      blendVertex = ((Bones[index] * vVertex4) *  vBlendWeights.z)  
                    + blendVertex;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz *  
                 vBlendWeights.z  + blendNormal;
   
   index = viBlendIndices.w;        
   blendVertex = ((Bones[index] * vVertex4) *  vBlendWeights.w)   
                 + blendVertex;
   blendNormal = (Bones[index] * vec4(vNormal, 0.0)).xyz *  
                 vBlendWeights.w  + blendNormal;

      vEyeSpacePosition = (MV*blendVertex).xyz; 
      vEyeSpaceNormal   = normalize(N*blendNormal);  
      vUVout=vUV; 
   gl_Position = P*vec4(vEyeSpacePosition,1);
}

The fragment shader uses the attenuated point light source for illumination as we have seen in the Implementing per-fragment point light with attenuation recipe in Chapter 4, Lights and Shadows.

There's more…

The output from the demo application for this recipe shows the dude.ezm model animating using the matrix palette skinning technique as shown in the following figure. The light source can be rotated by right-clicking on it and dragging. Pressing the l key stops the loop playback.

There's more…

The output from the demo application for this recipe shows the dude.ezm model animating using the matrix palette skinning technique as shown in the following figure. The light source can be rotated by right-clicking on it and dragging. Pressing the l key stops the loop playback.

Implementing skeletal animation using dual quaternion skinning

Matrix palette skinning suffers from candy wrapping artefacts, especially in regions like shoulder and elbow, where there are several rotations across various axes. If dual quaternion skinning is employed, these artefacts are minimized. In this recipe we will implement skeletal animation using dual quaternion skinning.

Before understanding dual quaternions, let us first see what quaternions are. Quaternions are a mathematical entity containing three imaginary dimensions (which specify the axis of rotation) and a real dimension (which specifies the angle of rotation). Quaternions are used in 3D graphics to represent rotation, since they do not suffer from gimbal lock, as Euler angles do. In order to store translation with rotation simultaneously, dual quaternions are used to store dual number coefficients instead of real ones. Instead of four components, as in a quaternion, dual quaternions have eight components.

Even in dual quaternion skinning, the linear blend method is used. However, due to the nature of transformation in dual quaternion, spherical blending is preferred. After linear blending, the resulting dual quaternion is renormalized, which generates a spherical blending result, which is a better approximation as compared to the linear blend skinning. This whole process is illustrated by the following figure:

Getting ready

The code for this recipe is contained in the Chapter8/DualQuaternionSkinning folder. We will be building on top of the previous recipe and replace the skinning matrices with dual quaternions.

How to do it…

Converting linear blend skinning to dual quaternion skinning requires the following steps:

Load the EZMesh model as we did in the Implementing EZMesh loader recipe from Chapter 5, Mesh Model Formats and Particle System:

   if(!ezm.Load(mesh_filename.c_str(), skeleton, animations,    
   submeshes, vertices, indices, material2ImageMap, min, max))   { 
      cout<<"Cannot load the EZMesh file"<<endl;
      exit(EXIT_FAILURE); }

After loading
up the mesh, materials, and textures, load the bone transformations contained in the EZMesh file using the MeshSystem::mSkeletons array as we did in the previous recipe. In addition to the bone matrices, also store the bind pose and inverse bind pose matrices as we did in the previous recipe. Instead of storing the skinning matrices, we initialize a vector of dual quaternions. Dual quaternions are a different representation of the skinning matrices.
```
   UpdateCombinedMatrices();
   bindPose.resize(skeleton.size()); 
   invBindPose.resize(skeleton.size());
   animatedXform.resize(skeleton.size());
   dualQuaternions.resize(skeleton.size());
   for(size_t i=0;i<skeleton.size();i++) {
      bindPose[i] = (skeleton[i].comb);
      invBindPose[i] = glm::inverse(bindPose[i]);
   }
```

Implement the idle callback function as in the previous recipe. Here, in addition to calculating the skinning matrix, also calculate the dual quaternion for the given skinning matrix. After all of the joints are done, pass the dual quaternion to the shader:

   glm::mat4 S = glm::scale(glm::mat4(1),skeleton[j].scale);
   glm::mat4 R = glm::toMat4(q);      
   glm::mat4 T = glm::translate(glm::mat4(1), 
   skeleton[j].position);
   skeleton[j].xform = T*R*S;
   Bone& b = skeleton[j];
   if(b.parent==-1)
      b.comb = b.xform;
   else
      b.comb = skeleton[b.parent].comb * b.xform;
   animatedXform[j] = b.comb * invBindPose[j];
   glm::vec3 t = glm::vec3( animatedXform[j][3][0],   
   animatedXform[j][3][1], animatedXform[j][3][2]);
   dualQuaternions[j].QuatTrans2UDQ( 
   glm::toQuat(animatedXform[j]),  t);
   …
   shader.Use(); 
   glUniform4fv(shader("Bones"), skeleton.size()*2, 
   &(dualQuaternions[0].ordinary.x));  
   shader.UnUse();

In the vertex shader (Chapter8/DualQuaternionSkinning/shaders/shader.vert), calculate the skinning matrix from the passed dual quaternion and blend weights of the given vertices. Then proceed with the skinning matrix as we did in the previous recipe:

#version 330 core
layout(location = 0) in vec3 vVertex;
layout(location = 1) in vec3 vNormal;
layout(location = 2) in vec2 vUV;
layout(location = 3) in vec4 vBlendWeights;
layout(location = 4) in ivec4 viBlendIndices;
smooth out vec2 vUVout;
uniform mat4 P; 
uniform mat4 MV;
uniform mat3 N;
smooth out vec3 vEyeSpaceNormal;    
smooth out vec3 vEyeSpacePosition;

 void main() {
   vec4 blendVertex=vec4(0);
   vec3 blendNormal=vec3(0); 
   vec4 blendDQ[2];
   float yc = 1.0, zc = 1.0, wc = 1.0;
   if (dot(Bones[viBlendIndices.x * 2], 
           Bones[viBlendIndices.y * 2]) < 0.0)
           yc = -1.0;
   if (dot(Bones[viBlendIndices.x * 2], 
           Bones[viBlendIndices.z * 2]) < 0.0)
           zc = -1.0;
   if (dot(Bones[viBlendIndices.x * 2], 
           Bones[viBlendIndices.w * 2]) < 0.0)
           wc = -1.0;
   blendDQ[0] = Bones[viBlendIndices.x * 2] * vBlendWeights.x;
   blendDQ[1] = Bones[viBlendIndices.x * 2 + 1] * 
                vBlendWeights.x;
   blendDQ[0] += yc*Bones[viBlendIndices.y * 2] * 
                 vBlendWeights.y;
   blendDQ[1] += yc*Bones[viBlendIndices.y * 2 + 1] *  
                 vBlendWeights.y;
   blendDQ[0] += zc*Bones[viBlendIndices.z * 2] *  
                 vBlendWeights.z; 
   blendDQ[1] += zc*Bones[viBlendIndices.z * 2 + 1] *   
                 vBlendWeights.z;
   blendDQ[0] += wc*Bones[viBlendIndices.w * 2] *    
                 vBlendWeights.w;
   blendDQ[1] += wc*Bones[viBlendIndices.w * 2 + 1] *  
                 vBlendWeights.w;
   mat4 skinTransform = dualQuatToMatrix(blendDQ[0], 
                        blendDQ[1]);
   blendVertex = skinTransform*vec4(vVertex,1);
   blendNormal = (skinTransform*vec4(vNormal,0)).xyz;
   vEyeSpacePosition = (MV*blendVertex).xyz; 
   vEyeSpaceNormal   = N*blendNormal;  
   vUVout=vUV; 
   gl_Position = P*vec4(vEyeSpacePosition,1);
}

To convert the given dual quaternion to a matrix, we define a function dualQuatToMatrix. This gives us a matrix, which we can then multiply with the vertex to obtain the transformed result.

How it works…

The only difference in this recipe and the previous recipe is the creation of a dual quaternion from the skinning matrix on the CPU, and its conversion back to a matrix in the vertex shader. After we have obtained the skinning matrices, we convert them into a dual quaternion array by using the dual_quat::QuatTrans2UDQ function that gets a dual quaternion from a rotation quaternion and a translation vector. This function is defined as follows in the dual_quat class (in Chapter8/DualQuaternionSkinning/main.cpp):

void QuatTrans2UDQ(const glm::quat& q0, const glm::vec3& t) {
  ordinary = q0;
  dual.w = -0.5f * ( t.x * q0.x + t.y * q0.y + t.z * q0.z);
  dual.x =  0.5f * ( t.x * q0.w + t.y * q0.z - t.z * q0.y);
  dual.y =  0.5f * (-t.x * q0.z + t.y * q0.w + t.z * q0.x);
  dual.z =  0.5f * ( t.x * q0.y - t.y * q0.x + t.z * q0.w);
}

The dual quaternion array is then passed to the shader instead of the bone matrices. In the vertex shader, we first do a dot product of the ordinary quaternion with the dual quaternion. If the dot product of the two quaternions is less than zero, it means they are both facing in opposite direction. We thus subtract the quaternion from the blended dual quaternion, otherwise we add it to the blended dual quaternion:

float yc = 1.0, zc = 1.0, wc = 1.0;

if (dot(Bones[viBlendIndices.x * 2], 
        Bones[viBlendIndices.y * 2]) < 0.0)
   yc = -1.0;

if (dot(Bones[viBlendIndices.x * 2], 
        Bones[viBlendIndices.z * 2]) < 0.0)
   zc = -1.0;

if (dot(Bones[viBlendIndices.x * 2], 
        Bones[viBlendIndices.w * 2]) < 0.0)
   wc = -1.0;

blendDQ[0] = Bones[viBlendIndices.x * 2] * vBlendWeights.x;
blendDQ[1] = Bones[viBlendIndices.x * 2 + 1] * vBlendWeights.x;

blendDQ[0] += yc*Bones[viBlendIndices.y * 2] * vBlendWeights.y;
blendDQ[1] += yc*Bones[viBlendIndices.y * 2 +1] * vBlendWeights.y;

blendDQ[0] += zc*Bones[viBlendIndices.z * 2] * vBlendWeights.z;
blendDQ[1] += zc*Bones[viBlendIndices.z * 2 +1] * vBlendWeights.z;

blendDQ[0] += wc*Bones[viBlendIndices.w * 2] * vBlendWeights.w;
blendDQ[1] += wc*Bones[viBlendIndices.w * 2 +1] * vBlendWeights.w;

The blended dual quaternion (blendDQ) is then converted to a matrix by the dualQuatToMatrix function, which is defined as follows:

mat4 dualQuatToMatrix(vec4 Qn, vec4 Qd) {  
  mat4 M;
  float len2 = dot(Qn, Qn);
  float w = Qn.w, x = Qn.x, y = Qn.y, z = Qn.z;
  float t0 = Qd.w, t1 = Qd.x, t2 = Qd.y, t3 = Qd.z;

  M[0][0] = w*w + x*x - y*y - z*z;
  M[0][1] = 2 * x * y + 2 * w * z;
  M[0][2] = 2 * x * z - 2 * w * y;
  M[0][3] = 0;

  M[1][0] = 2 * x * y - 2 * w * z;
  M[1][1] = w * w + y * y - x * x - z * z;
  M[1][2] = 2 * y * z + 2 * w * x;
  M[1][3] = 0;

  M[2][0] = 2 * x * z + 2 * w * y;
  M[2][1] = 2 * y * z - 2 * w * x;
  M[2][2] = w * w + z * z - x * x - y * y;
  M[2][3] = 0;

  M[3][0] = -2 * t0 * x + 2 * w * t1 - 2 * t2 * z + 2 * y * t3;
  M[3][1] = -2 * t0 * y + 2 * t1 * z - 2 * x * t3 + 2 * w * t2;
  M[3][2] = -2 * t0 * z + 2 * x * t2 + 2 * w * t3 - 2 * t1 * y;
  M[3][3] = len2;

  M /= len2;

  return M;  
}

The returned matrix is then multiplied with the given vertex/normal, and then the eye space position/normal and texture coordinates are obtained. Finally, the clip space position is calculated:

mat4 skinTransform = dualQuatToMatrix(blendDQ[0], blendDQ[1]);
blendVertex = skinTransform*vec4(vVertex,1);
blendNormal = (skinTransform*vec4(vNormal,0)).xyz;
vEyeSpacePosition = (MV*blendVertex).xyz; 
vEyeSpaceNormal   = N*blendNormal;  
vUVout=vUV; 
gl_Position = P*vec4(vEyeSpacePosition,1);

The fragment shader follows the similar steps as it did in the previous recipe to output the lit textured fragments.

There's more…

The demo application for this recipe renders the dwarf_anim.ezm skeletal model. Even with extreme rotation at the shoulder joint, the output does not suffer from candy wrapper artefacts as shown in the following figure:

On the other hand, if we use the matrix palette skinning, we get the following output, which clearly shows the candy wrapper artefacts:

How to do it…

Converting linear blend skinning to dual quaternion skinning requires the following steps:

Load the EZMesh model as we did in the Implementing EZMesh loader recipe from Chapter 5, Mesh Model Formats and Particle System:

   if(!ezm.Load(mesh_filename.c_str(), skeleton, animations,    
   submeshes, vertices, indices, material2ImageMap, min, max))   { 
      cout<<"Cannot load the EZMesh file"<<endl;
      exit(EXIT_FAILURE); }

After loading
up the mesh, materials, and textures, load the bone transformations contained in the EZMesh file using the MeshSystem::mSkeletons array as we did in the previous recipe. In addition to the bone matrices, also store the bind pose and inverse bind pose matrices as we did in the previous recipe. Instead of storing the skinning matrices, we initialize a vector of dual quaternions. Dual quaternions are a different representation of the skinning matrices.
```
   UpdateCombinedMatrices();
   bindPose.resize(skeleton.size()); 
   invBindPose.resize(skeleton.size());
   animatedXform.resize(skeleton.size());
   dualQuaternions.resize(skeleton.size());
   for(size_t i=0;i<skeleton.size();i++) {
      bindPose[i] = (skeleton[i].comb);
      invBindPose[i] = glm::inverse(bindPose[i]);
   }
```

Implement the idle callback function as in the previous recipe. Here, in addition to calculating the skinning matrix, also calculate the dual quaternion for the given skinning matrix. After all of the joints are done, pass the dual quaternion to the shader:

   glm::mat4 S = glm::scale(glm::mat4(1),skeleton[j].scale);
   glm::mat4 R = glm::toMat4(q);      
   glm::mat4 T = glm::translate(glm::mat4(1), 
   skeleton[j].position);
   skeleton[j].xform = T*R*S;
   Bone& b = skeleton[j];
   if(b.parent==-1)
      b.comb = b.xform;
   else
      b.comb = skeleton[b.parent].comb * b.xform;
   animatedXform[j] = b.comb * invBindPose[j];
   glm::vec3 t = glm::vec3( animatedXform[j][3][0],   
   animatedXform[j][3][1], animatedXform[j][3][2]);
   dualQuaternions[j].QuatTrans2UDQ( 
   glm::toQuat(animatedXform[j]),  t);
   …
   shader.Use(); 
   glUniform4fv(shader("Bones"), skeleton.size()*2, 
   &(dualQuaternions[0].ordinary.x));  
   shader.UnUse();

In the vertex shader (Chapter8/DualQuaternionSkinning/shaders/shader.vert), calculate the skinning matrix from the passed dual quaternion and blend weights of the given vertices. Then proceed with the skinning matrix as we did in the previous recipe:

#version 330 core
layout(location = 0) in vec3 vVertex;
layout(location = 1) in vec3 vNormal;
layout(location = 2) in vec2 vUV;
layout(location = 3) in vec4 vBlendWeights;
layout(location = 4) in ivec4 viBlendIndices;
smooth out vec2 vUVout;
uniform mat4 P; 
uniform mat4 MV;
uniform mat3 N;
smooth out vec3 vEyeSpaceNormal;    
smooth out vec3 vEyeSpacePosition;

 void main() {
   vec4 blendVertex=vec4(0);
   vec3 blendNormal=vec3(0); 
   vec4 blendDQ[2];
   float yc = 1.0, zc = 1.0, wc = 1.0;
   if (dot(Bones[viBlendIndices.x * 2], 
           Bones[viBlendIndices.y * 2]) < 0.0)
           yc = -1.0;
   if (dot(Bones[viBlendIndices.x * 2], 
           Bones[viBlendIndices.z * 2]) < 0.0)
           zc = -1.0;
   if (dot(Bones[viBlendIndices.x * 2], 
           Bones[viBlendIndices.w * 2]) < 0.0)
           wc = -1.0;
   blendDQ[0] = Bones[viBlendIndices.x * 2] * vBlendWeights.x;
   blendDQ[1] = Bones[viBlendIndices.x * 2 + 1] * 
                vBlendWeights.x;
   blendDQ[0] += yc*Bones[viBlendIndices.y * 2] * 
                 vBlendWeights.y;
   blendDQ[1] += yc*Bones[viBlendIndices.y * 2 + 1] *  
                 vBlendWeights.y;
   blendDQ[0] += zc*Bones[viBlendIndices.z * 2] *  
                 vBlendWeights.z; 
   blendDQ[1] += zc*Bones[viBlendIndices.z * 2 + 1] *   
                 vBlendWeights.z;
   blendDQ[0] += wc*Bones[viBlendIndices.w * 2] *    
                 vBlendWeights.w;
   blendDQ[1] += wc*Bones[viBlendIndices.w * 2 + 1] *  
                 vBlendWeights.w;
   mat4 skinTransform = dualQuatToMatrix(blendDQ[0], 
                        blendDQ[1]);
   blendVertex = skinTransform*vec4(vVertex,1);
   blendNormal = (skinTransform*vec4(vNormal,0)).xyz;
   vEyeSpacePosition = (MV*blendVertex).xyz; 
   vEyeSpaceNormal   = N*blendNormal;  
   vUVout=vUV; 
   gl_Position = P*vec4(vEyeSpacePosition,1);
}

To convert the given dual quaternion to a matrix, we define a function dualQuatToMatrix. This gives us a matrix, which we can then multiply with the vertex to obtain the transformed result.

How it works…

The only difference in this recipe and the previous recipe is the creation of a dual quaternion from the skinning matrix on the CPU, and its conversion back to a matrix in the vertex shader. After we have obtained the skinning matrices, we convert them into a dual quaternion array by using the dual_quat::QuatTrans2UDQ function that gets a dual quaternion from a rotation quaternion and a translation vector. This function is defined as follows in the dual_quat class (in Chapter8/DualQuaternionSkinning/main.cpp):

void QuatTrans2UDQ(const glm::quat& q0, const glm::vec3& t) {
  ordinary = q0;
  dual.w = -0.5f * ( t.x * q0.x + t.y * q0.y + t.z * q0.z);
  dual.x =  0.5f * ( t.x * q0.w + t.y * q0.z - t.z * q0.y);
  dual.y =  0.5f * (-t.x * q0.z + t.y * q0.w + t.z * q0.x);
  dual.z =  0.5f * ( t.x * q0.y - t.y * q0.x + t.z * q0.w);
}

The dual quaternion array is then passed to the shader instead of the bone matrices. In the vertex shader, we first do a dot product of the ordinary quaternion with the dual quaternion. If the dot product of the two quaternions is less than zero, it means they are both facing in opposite direction. We thus subtract the quaternion from the blended dual quaternion, otherwise we add it to the blended dual quaternion:

float yc = 1.0, zc = 1.0, wc = 1.0;

if (dot(Bones[viBlendIndices.x * 2], 
        Bones[viBlendIndices.y * 2]) < 0.0)
   yc = -1.0;

if (dot(Bones[viBlendIndices.x * 2], 
        Bones[viBlendIndices.z * 2]) < 0.0)
   zc = -1.0;

if (dot(Bones[viBlendIndices.x * 2], 
        Bones[viBlendIndices.w * 2]) < 0.0)
   wc = -1.0;

blendDQ[0] = Bones[viBlendIndices.x * 2] * vBlendWeights.x;
blendDQ[1] = Bones[viBlendIndices.x * 2 + 1] * vBlendWeights.x;

blendDQ[0] += yc*Bones[viBlendIndices.y * 2] * vBlendWeights.y;
blendDQ[1] += yc*Bones[viBlendIndices.y * 2 +1] * vBlendWeights.y;

blendDQ[0] += zc*Bones[viBlendIndices.z * 2] * vBlendWeights.z;
blendDQ[1] += zc*Bones[viBlendIndices.z * 2 +1] * vBlendWeights.z;

blendDQ[0] += wc*Bones[viBlendIndices.w * 2] * vBlendWeights.w;
blendDQ[1] += wc*Bones[viBlendIndices.w * 2 +1] * vBlendWeights.w;

The blended dual quaternion (blendDQ) is then converted to a matrix by the dualQuatToMatrix function, which is defined as follows:

mat4 dualQuatToMatrix(vec4 Qn, vec4 Qd) {  
  mat4 M;
  float len2 = dot(Qn, Qn);
  float w = Qn.w, x = Qn.x, y = Qn.y, z = Qn.z;
  float t0 = Qd.w, t1 = Qd.x, t2 = Qd.y, t3 = Qd.z;

  M[0][0] = w*w + x*x - y*y - z*z;
  M[0][1] = 2 * x * y + 2 * w * z;
  M[0][2] = 2 * x * z - 2 * w * y;
  M[0][3] = 0;

  M[1][0] = 2 * x * y - 2 * w * z;
  M[1][1] = w * w + y * y - x * x - z * z;
  M[1][2] = 2 * y * z + 2 * w * x;
  M[1][3] = 0;

  M[2][0] = 2 * x * z + 2 * w * y;
  M[2][1] = 2 * y * z - 2 * w * x;
  M[2][2] = w * w + z * z - x * x - y * y;
  M[2][3] = 0;

  M[3][0] = -2 * t0 * x + 2 * w * t1 - 2 * t2 * z + 2 * y * t3;
  M[3][1] = -2 * t0 * y + 2 * t1 * z - 2 * x * t3 + 2 * w * t2;
  M[3][2] = -2 * t0 * z + 2 * x * t2 + 2 * w * t3 - 2 * t1 * y;
  M[3][3] = len2;

  M /= len2;

  return M;  
}

The returned matrix is then multiplied with the given vertex/normal, and then the eye space position/normal and texture coordinates are obtained. Finally, the clip space position is calculated:

mat4 skinTransform = dualQuatToMatrix(blendDQ[0], blendDQ[1]);
blendVertex = skinTransform*vec4(vVertex,1);
blendNormal = (skinTransform*vec4(vNormal,0)).xyz;
vEyeSpacePosition = (MV*blendVertex).xyz; 
vEyeSpaceNormal   = N*blendNormal;  
vUVout=vUV; 
gl_Position = P*vec4(vEyeSpacePosition,1);

The fragment shader follows the similar steps as it did in the previous recipe to output the lit textured fragments.

There's more…

The demo application for this recipe renders the dwarf_anim.ezm skeletal model. Even with extreme rotation at the shoulder joint, the output does not suffer from candy wrapper artefacts as shown in the following figure:

On the other hand, if we use the matrix palette skinning, we get the following output, which clearly shows the candy wrapper artefacts:

How it works…

The only difference in this recipe and the previous recipe is the creation of a dual quaternion from the skinning matrix on the CPU, and its conversion back to a matrix in the vertex shader. After we have obtained the skinning matrices, we convert them into a dual quaternion array by using the dual_quat::QuatTrans2UDQ function that gets a dual quaternion from a rotation quaternion and a translation vector. This function is defined as follows in the dual_quat class (in Chapter8/DualQuaternionSkinning/main.cpp):

void QuatTrans2UDQ(const glm::quat& q0, const glm::vec3& t) {
  ordinary = q0;
  dual.w = -0.5f * ( t.x * q0.x + t.y * q0.y + t.z * q0.z);
  dual.x =  0.5f * ( t.x * q0.w + t.y * q0.z - t.z * q0.y);
  dual.y =  0.5f * (-t.x * q0.z + t.y * q0.w + t.z * q0.x);
  dual.z =  0.5f * ( t.x * q0.y - t.y * q0.x + t.z * q0.w);
}

The dual quaternion array is then passed to the shader instead of the bone matrices. In the vertex shader, we first do a dot product of the ordinary quaternion with the dual quaternion. If the dot product of the two quaternions is less than zero, it means they are both facing in opposite direction. We thus subtract the quaternion from the blended dual quaternion, otherwise we add it to the blended dual quaternion:

float yc = 1.0, zc = 1.0, wc = 1.0;

if (dot(Bones[viBlendIndices.x * 2], 
        Bones[viBlendIndices.y * 2]) < 0.0)
   yc = -1.0;

if (dot(Bones[viBlendIndices.x * 2], 
        Bones[viBlendIndices.z * 2]) < 0.0)
   zc = -1.0;

if (dot(Bones[viBlendIndices.x * 2], 
        Bones[viBlendIndices.w * 2]) < 0.0)
   wc = -1.0;

blendDQ[0] = Bones[viBlendIndices.x * 2] * vBlendWeights.x;
blendDQ[1] = Bones[viBlendIndices.x * 2 + 1] * vBlendWeights.x;

blendDQ[0] += yc*Bones[viBlendIndices.y * 2] * vBlendWeights.y;
blendDQ[1] += yc*Bones[viBlendIndices.y * 2 +1] * vBlendWeights.y;

blendDQ[0] += zc*Bones[viBlendIndices.z * 2] * vBlendWeights.z;
blendDQ[1] += zc*Bones[viBlendIndices.z * 2 +1] * vBlendWeights.z;

blendDQ[0] += wc*Bones[viBlendIndices.w * 2] * vBlendWeights.w;
blendDQ[1] += wc*Bones[viBlendIndices.w * 2 +1] * vBlendWeights.w;

The blended dual quaternion (blendDQ) is then converted to a matrix by the dualQuatToMatrix function, which is defined as follows:

mat4 dualQuatToMatrix(vec4 Qn, vec4 Qd) {  
  mat4 M;
  float len2 = dot(Qn, Qn);
  float w = Qn.w, x = Qn.x, y = Qn.y, z = Qn.z;
  float t0 = Qd.w, t1 = Qd.x, t2 = Qd.y, t3 = Qd.z;

  M[0][0] = w*w + x*x - y*y - z*z;
  M[0][1] = 2 * x * y + 2 * w * z;
  M[0][2] = 2 * x * z - 2 * w * y;
  M[0][3] = 0;

  M[1][0] = 2 * x * y - 2 * w * z;
  M[1][1] = w * w + y * y - x * x - z * z;
  M[1][2] = 2 * y * z + 2 * w * x;
  M[1][3] = 0;

  M[2][0] = 2 * x * z + 2 * w * y;
  M[2][1] = 2 * y * z - 2 * w * x;
  M[2][2] = w * w + z * z - x * x - y * y;
  M[2][3] = 0;

  M[3][0] = -2 * t0 * x + 2 * w * t1 - 2 * t2 * z + 2 * y * t3;
  M[3][1] = -2 * t0 * y + 2 * t1 * z - 2 * x * t3 + 2 * w * t2;
  M[3][2] = -2 * t0 * z + 2 * x * t2 + 2 * w * t3 - 2 * t1 * y;
  M[3][3] = len2;

  M /= len2;

  return M;  
}

The returned matrix is then multiplied with the given vertex/normal, and then the eye space position/normal and texture coordinates are obtained. Finally, the clip space position is calculated:

mat4 skinTransform = dualQuatToMatrix(blendDQ[0], blendDQ[1]);
blendVertex = skinTransform*vec4(vVertex,1);
blendNormal = (skinTransform*vec4(vNormal,0)).xyz;
vEyeSpacePosition = (MV*blendVertex).xyz; 
vEyeSpaceNormal   = N*blendNormal;  
vUVout=vUV; 
gl_Position = P*vec4(vEyeSpacePosition,1);

The fragment shader follows the similar steps as it did in the previous recipe to output the lit textured fragments.

There's more…

The demo application for this recipe renders the dwarf_anim.ezm skeletal model. Even with extreme rotation at the shoulder joint, the output does not suffer from candy wrapper artefacts as shown in the following figure:

On the other hand, if we use the matrix palette skinning, we get the following output, which clearly shows the candy wrapper artefacts:

There's more…

The demo application for this recipe renders the dwarf_anim.ezm skeletal model. Even with extreme rotation at the shoulder joint, the output does not suffer from candy wrapper artefacts as shown in the following figure:

On the other hand, if we use the matrix palette skinning, we get the following output, which clearly shows the candy wrapper artefacts:

Modeling cloth using transform feedback

In this recipe we will use the transform feedback mechanism of the modern GPU to model cloth. Transform feedback is a special mode of modern GPU in which the vertex shader can directly output to a buffer object. This allows developers to do complex computations without affecting the rest of the rendering pipeline. We will elaborate how to use this mechanism to simulate cloth entirely on the GPU.

From the implementation point of view in modern OpenGL, transform feedback exists as an OpenGL object similar to textures. Working with transform feedback object requires two steps: first, generation of transform feedback with specification of shader outputs, and second, usage of the transform feedback for simulation and rendering. We generate it by calling the glGetTransformFeedbacks function and passing it the number of objects and the variable to store the returned IDs. After the object is created, it is bound to the current OpenGL context by calling glBindTransformFeedback, and its only parameter is the ID of the transform feedback object we are interested to bind.

Next, we need to register the vertex attributes that we want to record in a transform feedback buffer. This is done through the glTransformFeedbackVaryings function. The parameters this function requires are in the following order: the shader program object, the number of outputs from the shader, the names of the attributes, and the recording mode. Recording mode can be either GL_INTERLEAVED_ATTRIBS (which means that the attributes will all be stored in a single interleaved buffer object) or GL_SEPARATE_ATTRIBS (which means each attribute will be stored in its own buffer object). Note that the shader program has to be relinked after the shader output varyings are specified.

We also have to set up our buffer objects that are going to store the attributes' output through transform feedback. At the rendering stage, we first set up our shader and the required uniforms. Then, we bind out vertex array objects storing out buffer object binding. Next, we bind the buffer object for transform feedback by calling the glBindBufferBase function. The first parameter is the index and the second parameter is the buffer object ID, which will store the shader output attribute. We can bind as many objects as we need, but the total calls to this function must be at least equal to the total output attributes from the vertex shader. Once the buffers are bound, we can initiate transform feedback by issuing a call to glBeginTransformFeedback and the parameter to this function is the output primitive type. We then issue our glDraw* call and then call glEndTransformFeedback.

Tip

OpenGL 4.0 and above provide a very convenient function, glDrawTransformFeedback. We just give it out primitive type and it automatically renders our primitives based on the total number of outputs from the vertex shader. In addition, OpenGL 4.0 provides the ability to pause/resume the transform feedback object as well as outputting to multiple transform feedback streams.

For the cloth simulation implementation using transform feedback, this is how we proceed. We store the current and previous position of the cloth vertices into a pair of buffer objects. To have convenient access to the buffer objects, we store these into a pair of vertex array objects. Then in order to deform the cloth, we run a vertex shader that inputs the current and previous positions from the buffer objects. In the vertex shader, the internal and external forces are calculated for each pair of cloth vertices and then acceleration is calculated. Using Verlet integration, the new vertex position is obtained. The new and previous positions are output from the vertex shader, so they are written out to the attached transform feedback buffers. Since we have a pair of vertex array objects, we ping pong between the two. This process is continued and the simulation proceeds forward.

The whole process is well summarized by the following figure:

More details of the inner workings of this method are detailed in the reference in the See also section.

Getting ready

The code for this recipe is contained in the Chapter8/TransformfeedbackCloth folder.

How to do it…

Let us start the recipe by following these simple steps:

Generate the geometry and topology for a piece of cloth by creating a set of points and their connectivity. Bind this data to a buffer object. The vectors X and X_last store the current and last position respectively, and the vector F stores the force for each vertex:

   
vector<GLushort> indices; 
vector<glm::vec4> X;  
vector<glm::vec4> X_last; 
vector<glm::vec3> F;

indices.resize( numX*numY*2*3);  
   X.resize(total_points);
   X_last.resize(total_points); 
   F.resize(total_points);
   for(int j=0;j<=numY;j++) {     
      for(int i=0;i<=numX;i++) {   
         X[count] = glm::vec4( ((float(i)/(u-1)) *2-1)* hsize, 
         sizeX+1, ((float(j)/(v-1) )* sizeY),1);
         X_last[count] = X[count];
      count++;
      }
   } 
   GLushort* id=&indices[0];
   for (int i = 0; i < numY; i++) {        
      for (int j = 0; j < numX; j++) {            
         int i0 = i * (numX+1) + j;            
         int i1 = i0 + 1;            
         int i2 = i0 + (numX+1);            
         int i3 = i2 + 1;            
         if ((j+i)%2) {                
            *id++ = i0; *id++ = i2; *id++ = i1;                
            *id++ = i1; *id++ = i2; *id++ = i3;            
         } else {                
            *id++ = i0; *id++ = i2; *id++ = i3;                
            *id++ = i0; *id++ = i3; *id++ = i1;            
         }
      }
    }
    glGenVertexArrays(1, &clothVAOID);
   glGenBuffers (1, &clothVBOVerticesID);
   glGenBuffers (1, &clothVBOIndicesID);
   glBindVertexArray(clothVAOID);
   glBindBuffer (GL_ARRAY_BUFFER, clothVBOVerticesID);
   glBufferData (GL_ARRAY_BUFFER, sizeof(float)*4*X.size(),  
    &X[0].x, GL_STATIC_DRAW);
   glEnableVertexAttribArray(0);
   glVertexAttribPointer (0, 4, GL_FLOAT, GL_FALSE,0,0);
   glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, clothVBOIndicesID);
   glBufferData(GL_ELEMENT_ARRAY_BUFFER, 
    sizeof(GLushort)*indices.size(), &indices[0], GL_STATIC_DRAW);
   glBindVertexArray(0);

Create two pairs of vertex array objects (VAO), one pair for rendering and another pair for update of points. Bind two buffer objects (containing current positions and previous positions) to the update VAO, and one buffer object (containing current positions) to the render VAO. Also attach an element array buffer for geometry indices. Set the buffer object usage parameter as GL_DYNAMIC_COPY). This usage parameter hints to the GPU that the contents of the buffer object will be frequently changed, and it will be read in OpenGL or used as a source for GL commands:

   glGenVertexArrays(2, vaoUpdateID);
   glGenVertexArrays(2, vaoRenderID);
   glGenBuffers( 2, vboID_Pos);
   glGenBuffers( 2, vboID_PrePos);
  for(int i=0;i<2;i++) {
    glBindVertexArray(vaoUpdateID[i]);
    glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
    glBufferData( GL_ARRAY_BUFFER, X.size()* sizeof(glm::vec4), 
       &(X[0].x), GL_DYNAMIC_COPY);    
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
     glBindBuffer( GL_ARRAY_BUFFER, vboID_PrePos[i]);
    glBufferData( GL_ARRAY_BUFFER,  
      X_last.size()*sizeof(glm::vec4), &(X_last[0].x),  
      GL_DYNAMIC_COPY);  
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1,  4, GL_FLOAT, GL_FALSE, 0,0);   
  }
   //set render vao
  for(int i=0;i<2;i++) {
    glBindVertexArray(vaoRenderID[i]);
    glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vboIndices);
    if(i==0)
      glBufferData(GL_ELEMENT_ARRAY_BUFFER,    
      indices.size()*sizeof(GLushort), &indices[0], 
      GL_STATIC_DRAW);
  }

For ease of access in the vertex shader, bind the current and previous position buffer objects to a set of buffer textures. The buffer textures are one dimensional textures that are created like normal OpenGL textures using the glGenTextures call, but they are bound to the GL_TEXTURE_BUFFER target. They provide read access to the entire buffer object memory in the vertex shader. The data is accessed in the vertex shader using the
texelFetchBuffer function:
```
   for(int i=0;i<2;i++) {
      glBindTexture( GL_TEXTURE_BUFFER, texPosID[i]);
      glTexBuffer( GL_TEXTURE_BUFFER, GL_RGBA32F, vboID_Pos[i]);
      glBindTexture( GL_TEXTURE_BUFFER, texPrePosID[i]);
      glTexBuffer(GL_TEXTURE_BUFFER, GL_RGBA32F, vboID_PrePos[i]);
   }
```

Generate a transform feedback object and pass the attribute names that will be output from our deformation vertex shader. Make sure to relink the program.

  glGenTransformFeedbacks(1, &tfID);
  glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, tfID); 
  const char* varying_names[]={"out_position_mass", 
   "out_prev_position"};  
  glTransformFeedbackVaryings(massSpringShader.GetProgram(), 2, 
   varying_names, GL_SEPARATE_ATTRIBS);    
  glLinkProgram(massSpringShader.GetProgram());

In the rendering function, bind the cloth deformation shader (Chapter8/TransformFeedbackCloth/shaders/Spring.vert) and then run a loop. In each loop iteration, bind the texture buffers, and then bind the update vertex array object. At the same time, bind the previous buffer objects as the transform feedback buffers. These will store the output from the vertex shader. Disable the rasterizer, begin the transform feedback mode, and then draw the entire set of cloth vertices. Use the ping pong approach to swap the read/write pathways:

massSpringShader.Use(); 
glUniformMatrix4fv(massSpringShader("MVP"), 1, GL_FALSE, 
glm::value_ptr(mMVP));    
for(int i=0;i<NUM_ITER;i++) {
    glActiveTexture( GL_TEXTURE0);
    glBindTexture( GL_TEXTURE_BUFFER, texPosID[writeID]);
    glActiveTexture( GL_TEXTURE1);
    glBindTexture( GL_TEXTURE_BUFFER, texPrePosID[writeID]);
    glBindVertexArray( vaoUpdateID[writeID]);
    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0,
    vboID_Pos[readID]);
    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1,
    vboID_PrePos[readID]);
    glEnable(GL_RASTERIZER_DISCARD);    // disable rasrization
    glBeginQuery(GL_TIME_ELAPSED,t_query);
    glBeginTransformFeedback(GL_POINTS);
    glDrawArrays(GL_POINTS, 0, total_points);
    glEndTransformFeedback();
    glEndQuery(GL_TIME_ELAPSED);
    glFlush();
    glDisable(GL_RASTERIZER_DISCARD);
    int tmp = readID;
    readID=writeID;
    writeID = tmp;
}
glGetQueryObjectui64v(t_query, GL_QUERY_RESULT, &elapsed_time);     
delta_time = elapsed_time / 1000000.0f;  
massSpringShader.UnUse();

After the loop is terminated, bind the render VAO that renders the cloth geometry and vertices:

   glBindVertexArray(vaoRenderID[writeID]);
   glDisable(GL_DEPTH_TEST);
   renderShader.Use();
   glUniformMatrix4fv(renderShader("MVP"), 1, GL_FALSE,    
   glm::value_ptr(mMVP));
   glDrawElements(GL_TRIANGLES, indices.size(), 
   GL_UNSIGNED_SHORT,0);
   renderShader.UnUse();
   glEnable(GL_DEPTH_TEST);
   if(bDisplayMasses) {
      particleShader.Use();
      glUniform1i(particleShader("selected_index"),   
      selected_index);
      glUniformMatrix4fv(particleShader("MV"), 1, GL_FALSE,    
      glm::value_ptr(mMV));  
      glUniformMatrix4fv(particleShader("MVP"), 1, GL_FALSE, 
      glm::value_ptr(mMVP));
      glDrawArrays(GL_POINTS, 0, total_points);    
      particleShader.UnUse();
   }
   glBindVertexArray( 0);

In the vertex shader, obtain the current and previous position of the cloth vertex. If the vertex is a pinned vertex, set its mass to 0 so it would not be simulated; otherwise, add an external force based on gravity. Next loop through all neighbors of the current vertex by looking up the texture buffer and estimate the internal force:

   float m = position_mass.w;
   vec3 pos = position_mass.xyz;
   vec3 pos_old = prev_position.xyz;  
   vec3 vel = (pos - pos_old) / dt;
   float ks=0, kd=0;
   int index = gl_VertexID;
   int ix = index % texsize_x;
   int iy = index / texsize_x;
   if(index ==0 || index == (texsize_x-1))   
      m = 0;
   vec3 F = gravity*m + (DEFAULT_DAMPING*vel);
   for(int k=0;k<12;k++) {
      ivec2 coord = getNextNeighbor(k, ks, kd);
      int j = coord.x;
      int i = coord.y;    
      if (((iy + i) < 0) || ((iy + i) > (texsize_y-1)))
         continue;
      if (((ix + j) < 0) || ((ix + j) > (texsize_x-1)))
         continue;
      int index_neigh = (iy + i) * texsize_x + ix + j;
      vec3 p2 = texelFetchBuffer(tex_position_mass,   
      index_neigh).xyz;
      vec3 p2_last = texelFetchBuffer(tex_prev_position_mass,  
      index_neigh).xyz;
      vec2 coord_neigh = vec2(ix + j, iy + i)*step;
      float rest_length = length(coord*inv_cloth_size);
      vec3 v2 = (p2- p2_last)/dt;
      vec3 deltaP = pos - p2;  
      vec3 deltaV = vel - v2;   
      float dist = length(deltaP);
      float   leftTerm = -ks * (dist-rest_length);
      float  rightTerm = kd * (dot(deltaV, deltaP)/dist);    
      vec3 springForce = (leftTerm + rightTerm)* 
      normalize(deltaP);
      F +=  springForce;  
   }

Using the combined force, calculate the acceleration and then estimate the new position using Verlet integration. Output the appropriate attribute from the shader:

  vec3 acc = vec3(0);
  if(m!=0)
     acc = F/m; 
  vec3 tmp = pos; 
  pos = pos * 2.0 - pos_old + acc* dt * dt;
  pos_old = tmp; 
  pos.y=max(0, pos.y); 
  out_position_mass = vec4(pos, m);  
  out_prev_position = vec4(pos_old,m);        
  gl_Position = MVP*vec4(pos, 1);

How it works…

There are two parts of this recipe, the generation of geometry and identifying output attributes for transform feedback buffers. We first generate the cloth geometry and then associate our buffer objects. To enable easier access of current and previous positions, we bind the position buffer objects as texture buffers.

To enable deformation, we first bind our deformation shader and the update VAO. Next, we specify the transform feedback buffers that receive the output from the vertex shader. We disable the rasterizer to prevent the execution of the rest of the pipeline. Next, we begin the transform feedback mode, render our vertices, and then end the transform feedback mode. This invokes one step of the integration. To enable more steps, we use a ping pong strategy by binding the currently written buffer object as the read point for the next iteration.

The actual deformation is carried out in the vertex shader (Chapter8/TransformFeedbackCloth/shaders/Spring.vert). We first determine the current and previous positions. The velocity is then determined. The current vertex ID (gl_VertexID) is used to determine the linear index of the current vertex. This is a unique index of each vertex and can be used by a vertex shader. We use it here to determine if the current vertex is a pinned vertex. If so, the mass of 0 is assigned to it which makes this vertex immovable:

float m = position_mass.w;
vec3 pos = position_mass.xyz;
vec3 pos_old = prev_position.xyz;  
vec3 vel = (pos - pos_old) / dt;
float ks=0, kd=0;
int index = gl_VertexID;
int ix = index % texsize_x;
int iy = index / texsize_x;
if(index ==0 || index == (texsize_x-1))   
   m = 0;

Next, the acceleration due to gravity and velocity damping force is applied. After this, a loop is run which basically loops through all of the neighbors of the current vertex and estimates the net internal (spring) force. This force is then added to the combined force for the current vertex:

vec3 F = gravity*m + (DEFAULT_DAMPING*vel);
  
for(int k=0;k<12;k++) {
  ivec2 coord = getNextNeighbor(k, ks, kd);
  int j = coord.x;
  int i = coord.y;    
  if (((iy + i) < 0) || ((iy + i) > (texsize_y-1)))
    continue;
  if (((ix + j) < 0) || ((ix + j) > (texsize_x-1)))
    continue;
  int index_neigh = (iy + i) * texsize_x + ix + j;
  vec3 p2 = texelFetchBuffer(tex_position_mass, 
  index_neigh).xyz;
  vec3 p2_last = texelFetchBuffer(tex_prev_position_mass,   
  index_neigh).xyz;
  vec2 coord_neigh = vec2(ix + j, iy + i)*step;
  float rest_length = length(coord*inv_cloth_size);
  vec3 v2 = (p2- p2_last)/dt;
  vec3 deltaP = pos - p2;  
  vec3 deltaV = vel - v2;   
  float dist = length(deltaP);
  float   leftTerm = -ks * (dist-rest_length);
  float  rightTerm = kd * (dot(deltaV, deltaP)/dist);    
  vec3 springForce = (leftTerm + rightTerm)* normalize(deltaP);
  F +=  springForce;  
}

From the net force, the acceleration is first obtained and then the new position is obtained using Verlet integration. Finally, the collision with the ground plane is determined by looking at the Y value. We end the shader by outputting the output attributes (out_position and out_prev_position), which are then stored into the buffer objects bound as the transform feedback buffers:

  vec3 acc = vec3(0);
  if(m!=0)
     acc = F/m;  
  vec3 tmp = pos; 
  pos = pos * 2.0 - pos_old + acc* dt * dt;
  pos_old = tmp; 
  pos.y=max(0, pos.y);  
  out_position_mass = vec4(pos, m);  
  out_prev_position = vec4(pos_old,m);        
  gl_Position = MVP*vec4(pos, 1);

The shader, along with the transform feedback mechanism, proceeds to deform all of the cloth vertices and in the end, we get the cloth vertices deformed.

There's more…

The demo application implementing this recipe shows the piece of cloth falling under gravity. Several frames from the deformation are shown in the following figure. Using the left mouse button, we can pick the cloth vertices and move them around.

In this recipe we only output to a single stream. We can attach more than one stream and store results in separate buffer objects. In addition, we can have several transform feedback objects and we can pause/resume them as required.

How to do it…

Let us start the recipe by following these simple steps:

Generate the geometry and topology for a piece of cloth by creating a set of points and their connectivity. Bind this data to a buffer object. The vectors X and X_last store the current and last position respectively, and the vector F stores the force for each vertex:

   
vector<GLushort> indices; 
vector<glm::vec4> X;  
vector<glm::vec4> X_last; 
vector<glm::vec3> F;

indices.resize( numX*numY*2*3);  
   X.resize(total_points);
   X_last.resize(total_points); 
   F.resize(total_points);
   for(int j=0;j<=numY;j++) {     
      for(int i=0;i<=numX;i++) {   
         X[count] = glm::vec4( ((float(i)/(u-1)) *2-1)* hsize, 
         sizeX+1, ((float(j)/(v-1) )* sizeY),1);
         X_last[count] = X[count];
      count++;
      }
   } 
   GLushort* id=&indices[0];
   for (int i = 0; i < numY; i++) {        
      for (int j = 0; j < numX; j++) {            
         int i0 = i * (numX+1) + j;            
         int i1 = i0 + 1;            
         int i2 = i0 + (numX+1);            
         int i3 = i2 + 1;            
         if ((j+i)%2) {                
            *id++ = i0; *id++ = i2; *id++ = i1;                
            *id++ = i1; *id++ = i2; *id++ = i3;            
         } else {                
            *id++ = i0; *id++ = i2; *id++ = i3;                
            *id++ = i0; *id++ = i3; *id++ = i1;            
         }
      }
    }
    glGenVertexArrays(1, &clothVAOID);
   glGenBuffers (1, &clothVBOVerticesID);
   glGenBuffers (1, &clothVBOIndicesID);
   glBindVertexArray(clothVAOID);
   glBindBuffer (GL_ARRAY_BUFFER, clothVBOVerticesID);
   glBufferData (GL_ARRAY_BUFFER, sizeof(float)*4*X.size(),  
    &X[0].x, GL_STATIC_DRAW);
   glEnableVertexAttribArray(0);
   glVertexAttribPointer (0, 4, GL_FLOAT, GL_FALSE,0,0);
   glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, clothVBOIndicesID);
   glBufferData(GL_ELEMENT_ARRAY_BUFFER, 
    sizeof(GLushort)*indices.size(), &indices[0], GL_STATIC_DRAW);
   glBindVertexArray(0);

Create two pairs of vertex array objects (VAO), one pair for rendering and another pair for update of points. Bind two buffer objects (containing current positions and previous positions) to the update VAO, and one buffer object (containing current positions) to the render VAO. Also attach an element array buffer for geometry indices. Set the buffer object usage parameter as GL_DYNAMIC_COPY). This usage parameter hints to the GPU that the contents of the buffer object will be frequently changed, and it will be read in OpenGL or used as a source for GL commands:

   glGenVertexArrays(2, vaoUpdateID);
   glGenVertexArrays(2, vaoRenderID);
   glGenBuffers( 2, vboID_Pos);
   glGenBuffers( 2, vboID_PrePos);
  for(int i=0;i<2;i++) {
    glBindVertexArray(vaoUpdateID[i]);
    glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
    glBufferData( GL_ARRAY_BUFFER, X.size()* sizeof(glm::vec4), 
       &(X[0].x), GL_DYNAMIC_COPY);    
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
     glBindBuffer( GL_ARRAY_BUFFER, vboID_PrePos[i]);
    glBufferData( GL_ARRAY_BUFFER,  
      X_last.size()*sizeof(glm::vec4), &(X_last[0].x),  
      GL_DYNAMIC_COPY);  
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1,  4, GL_FLOAT, GL_FALSE, 0,0);   
  }
   //set render vao
  for(int i=0;i<2;i++) {
    glBindVertexArray(vaoRenderID[i]);
    glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vboIndices);
    if(i==0)
      glBufferData(GL_ELEMENT_ARRAY_BUFFER,    
      indices.size()*sizeof(GLushort), &indices[0], 
      GL_STATIC_DRAW);
  }

For ease of access in the vertex shader, bind the current and previous position buffer objects to a set of buffer textures. The buffer textures are one dimensional textures that are created like normal OpenGL textures using the glGenTextures call, but they are bound to the GL_TEXTURE_BUFFER target. They provide read access to the entire buffer object memory in the vertex shader. The data is accessed in the vertex shader using the
texelFetchBuffer function:
```
   for(int i=0;i<2;i++) {
      glBindTexture( GL_TEXTURE_BUFFER, texPosID[i]);
      glTexBuffer( GL_TEXTURE_BUFFER, GL_RGBA32F, vboID_Pos[i]);
      glBindTexture( GL_TEXTURE_BUFFER, texPrePosID[i]);
      glTexBuffer(GL_TEXTURE_BUFFER, GL_RGBA32F, vboID_PrePos[i]);
   }
```

Generate a transform feedback object and pass the attribute names that will be output from our deformation vertex shader. Make sure to relink the program.

  glGenTransformFeedbacks(1, &tfID);
  glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, tfID); 
  const char* varying_names[]={"out_position_mass", 
   "out_prev_position"};  
  glTransformFeedbackVaryings(massSpringShader.GetProgram(), 2, 
   varying_names, GL_SEPARATE_ATTRIBS);    
  glLinkProgram(massSpringShader.GetProgram());

In the rendering function, bind the cloth deformation shader (Chapter8/TransformFeedbackCloth/shaders/Spring.vert) and then run a loop. In each loop iteration, bind the texture buffers, and then bind the update vertex array object. At the same time, bind the previous buffer objects as the transform feedback buffers. These will store the output from the vertex shader. Disable the rasterizer, begin the transform feedback mode, and then draw the entire set of cloth vertices. Use the ping pong approach to swap the read/write pathways:

massSpringShader.Use(); 
glUniformMatrix4fv(massSpringShader("MVP"), 1, GL_FALSE, 
glm::value_ptr(mMVP));    
for(int i=0;i<NUM_ITER;i++) {
    glActiveTexture( GL_TEXTURE0);
    glBindTexture( GL_TEXTURE_BUFFER, texPosID[writeID]);
    glActiveTexture( GL_TEXTURE1);
    glBindTexture( GL_TEXTURE_BUFFER, texPrePosID[writeID]);
    glBindVertexArray( vaoUpdateID[writeID]);
    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0,
    vboID_Pos[readID]);
    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1,
    vboID_PrePos[readID]);
    glEnable(GL_RASTERIZER_DISCARD);    // disable rasrization
    glBeginQuery(GL_TIME_ELAPSED,t_query);
    glBeginTransformFeedback(GL_POINTS);
    glDrawArrays(GL_POINTS, 0, total_points);
    glEndTransformFeedback();
    glEndQuery(GL_TIME_ELAPSED);
    glFlush();
    glDisable(GL_RASTERIZER_DISCARD);
    int tmp = readID;
    readID=writeID;
    writeID = tmp;
}
glGetQueryObjectui64v(t_query, GL_QUERY_RESULT, &elapsed_time);     
delta_time = elapsed_time / 1000000.0f;  
massSpringShader.UnUse();

After the loop is terminated, bind the render VAO that renders the cloth geometry and vertices:

   glBindVertexArray(vaoRenderID[writeID]);
   glDisable(GL_DEPTH_TEST);
   renderShader.Use();
   glUniformMatrix4fv(renderShader("MVP"), 1, GL_FALSE,    
   glm::value_ptr(mMVP));
   glDrawElements(GL_TRIANGLES, indices.size(), 
   GL_UNSIGNED_SHORT,0);
   renderShader.UnUse();
   glEnable(GL_DEPTH_TEST);
   if(bDisplayMasses) {
      particleShader.Use();
      glUniform1i(particleShader("selected_index"),   
      selected_index);
      glUniformMatrix4fv(particleShader("MV"), 1, GL_FALSE,    
      glm::value_ptr(mMV));  
      glUniformMatrix4fv(particleShader("MVP"), 1, GL_FALSE, 
      glm::value_ptr(mMVP));
      glDrawArrays(GL_POINTS, 0, total_points);    
      particleShader.UnUse();
   }
   glBindVertexArray( 0);

In the vertex shader, obtain the current and previous position of the cloth vertex. If the vertex is a pinned vertex, set its mass to 0 so it would not be simulated; otherwise, add an external force based on gravity. Next loop through all neighbors of the current vertex by looking up the texture buffer and estimate the internal force:

   float m = position_mass.w;
   vec3 pos = position_mass.xyz;
   vec3 pos_old = prev_position.xyz;  
   vec3 vel = (pos - pos_old) / dt;
   float ks=0, kd=0;
   int index = gl_VertexID;
   int ix = index % texsize_x;
   int iy = index / texsize_x;
   if(index ==0 || index == (texsize_x-1))   
      m = 0;
   vec3 F = gravity*m + (DEFAULT_DAMPING*vel);
   for(int k=0;k<12;k++) {
      ivec2 coord = getNextNeighbor(k, ks, kd);
      int j = coord.x;
      int i = coord.y;    
      if (((iy + i) < 0) || ((iy + i) > (texsize_y-1)))
         continue;
      if (((ix + j) < 0) || ((ix + j) > (texsize_x-1)))
         continue;
      int index_neigh = (iy + i) * texsize_x + ix + j;
      vec3 p2 = texelFetchBuffer(tex_position_mass,   
      index_neigh).xyz;
      vec3 p2_last = texelFetchBuffer(tex_prev_position_mass,  
      index_neigh).xyz;
      vec2 coord_neigh = vec2(ix + j, iy + i)*step;
      float rest_length = length(coord*inv_cloth_size);
      vec3 v2 = (p2- p2_last)/dt;
      vec3 deltaP = pos - p2;  
      vec3 deltaV = vel - v2;   
      float dist = length(deltaP);
      float   leftTerm = -ks * (dist-rest_length);
      float  rightTerm = kd * (dot(deltaV, deltaP)/dist);    
      vec3 springForce = (leftTerm + rightTerm)* 
      normalize(deltaP);
      F +=  springForce;  
   }

Using the combined force, calculate the acceleration and then estimate the new position using Verlet integration. Output the appropriate attribute from the shader:

  vec3 acc = vec3(0);
  if(m!=0)
     acc = F/m; 
  vec3 tmp = pos; 
  pos = pos * 2.0 - pos_old + acc* dt * dt;
  pos_old = tmp; 
  pos.y=max(0, pos.y); 
  out_position_mass = vec4(pos, m);  
  out_prev_position = vec4(pos_old,m);        
  gl_Position = MVP*vec4(pos, 1);

How it works…

There are two parts of this recipe, the generation of geometry and identifying output attributes for transform feedback buffers. We first generate the cloth geometry and then associate our buffer objects. To enable easier access of current and previous positions, we bind the position buffer objects as texture buffers.

To enable deformation, we first bind our deformation shader and the update VAO. Next, we specify the transform feedback buffers that receive the output from the vertex shader. We disable the rasterizer to prevent the execution of the rest of the pipeline. Next, we begin the transform feedback mode, render our vertices, and then end the transform feedback mode. This invokes one step of the integration. To enable more steps, we use a ping pong strategy by binding the currently written buffer object as the read point for the next iteration.

The actual deformation is carried out in the vertex shader (Chapter8/TransformFeedbackCloth/shaders/Spring.vert). We first determine the current and previous positions. The velocity is then determined. The current vertex ID (gl_VertexID) is used to determine the linear index of the current vertex. This is a unique index of each vertex and can be used by a vertex shader. We use it here to determine if the current vertex is a pinned vertex. If so, the mass of 0 is assigned to it which makes this vertex immovable:

float m = position_mass.w;
vec3 pos = position_mass.xyz;
vec3 pos_old = prev_position.xyz;  
vec3 vel = (pos - pos_old) / dt;
float ks=0, kd=0;
int index = gl_VertexID;
int ix = index % texsize_x;
int iy = index / texsize_x;
if(index ==0 || index == (texsize_x-1))   
   m = 0;

Next, the acceleration due to gravity and velocity damping force is applied. After this, a loop is run which basically loops through all of the neighbors of the current vertex and estimates the net internal (spring) force. This force is then added to the combined force for the current vertex:

vec3 F = gravity*m + (DEFAULT_DAMPING*vel);
  
for(int k=0;k<12;k++) {
  ivec2 coord = getNextNeighbor(k, ks, kd);
  int j = coord.x;
  int i = coord.y;    
  if (((iy + i) < 0) || ((iy + i) > (texsize_y-1)))
    continue;
  if (((ix + j) < 0) || ((ix + j) > (texsize_x-1)))
    continue;
  int index_neigh = (iy + i) * texsize_x + ix + j;
  vec3 p2 = texelFetchBuffer(tex_position_mass, 
  index_neigh).xyz;
  vec3 p2_last = texelFetchBuffer(tex_prev_position_mass,   
  index_neigh).xyz;
  vec2 coord_neigh = vec2(ix + j, iy + i)*step;
  float rest_length = length(coord*inv_cloth_size);
  vec3 v2 = (p2- p2_last)/dt;
  vec3 deltaP = pos - p2;  
  vec3 deltaV = vel - v2;   
  float dist = length(deltaP);
  float   leftTerm = -ks * (dist-rest_length);
  float  rightTerm = kd * (dot(deltaV, deltaP)/dist);    
  vec3 springForce = (leftTerm + rightTerm)* normalize(deltaP);
  F +=  springForce;  
}

From the net force, the acceleration is first obtained and then the new position is obtained using Verlet integration. Finally, the collision with the ground plane is determined by looking at the Y value. We end the shader by outputting the output attributes (out_position and out_prev_position), which are then stored into the buffer objects bound as the transform feedback buffers:

  vec3 acc = vec3(0);
  if(m!=0)
     acc = F/m;  
  vec3 tmp = pos; 
  pos = pos * 2.0 - pos_old + acc* dt * dt;
  pos_old = tmp; 
  pos.y=max(0, pos.y);  
  out_position_mass = vec4(pos, m);  
  out_prev_position = vec4(pos_old,m);        
  gl_Position = MVP*vec4(pos, 1);

The shader, along with the transform feedback mechanism, proceeds to deform all of the cloth vertices and in the end, we get the cloth vertices deformed.

There's more…

The demo application implementing this recipe shows the piece of cloth falling under gravity. Several frames from the deformation are shown in the following figure. Using the left mouse button, we can pick the cloth vertices and move them around.

In this recipe we only output to a single stream. We can attach more than one stream and store results in separate buffer objects. In addition, we can have several transform feedback objects and we can pause/resume them as required.

How it works…

There are two parts of this recipe, the generation of geometry and identifying output attributes for transform feedback buffers. We first generate the cloth geometry and then associate our buffer objects. To enable easier access of current and previous positions, we bind the position buffer objects as texture buffers.

To enable deformation, we first bind our deformation shader and the update VAO. Next, we specify the transform feedback buffers that receive the output from the vertex shader. We disable the rasterizer to prevent the execution of the rest of the pipeline. Next, we begin the transform feedback mode, render our vertices, and then end the transform feedback mode. This invokes one step of the integration. To enable more steps, we use a ping pong strategy by binding the currently written buffer object as the read point for the next iteration.

The actual deformation is carried out in the vertex shader (Chapter8/TransformFeedbackCloth/shaders/Spring.vert). We first determine the current and previous positions. The velocity is then determined. The current vertex ID (gl_VertexID) is used to determine the linear index of the current vertex. This is a unique index of each vertex and can be used by a vertex shader. We use it here to determine if the current vertex is a pinned vertex. If so, the mass of 0 is assigned to it which makes this vertex immovable:

float m = position_mass.w;
vec3 pos = position_mass.xyz;
vec3 pos_old = prev_position.xyz;  
vec3 vel = (pos - pos_old) / dt;
float ks=0, kd=0;
int index = gl_VertexID;
int ix = index % texsize_x;
int iy = index / texsize_x;
if(index ==0 || index == (texsize_x-1))   
   m = 0;

Next, the acceleration due to gravity and velocity damping force is applied. After this, a loop is run which basically loops through all of the neighbors of the current vertex and estimates the net internal (spring) force. This force is then added to the combined force for the current vertex:

vec3 F = gravity*m + (DEFAULT_DAMPING*vel);
  
for(int k=0;k<12;k++) {
  ivec2 coord = getNextNeighbor(k, ks, kd);
  int j = coord.x;
  int i = coord.y;    
  if (((iy + i) < 0) || ((iy + i) > (texsize_y-1)))
    continue;
  if (((ix + j) < 0) || ((ix + j) > (texsize_x-1)))
    continue;
  int index_neigh = (iy + i) * texsize_x + ix + j;
  vec3 p2 = texelFetchBuffer(tex_position_mass, 
  index_neigh).xyz;
  vec3 p2_last = texelFetchBuffer(tex_prev_position_mass,   
  index_neigh).xyz;
  vec2 coord_neigh = vec2(ix + j, iy + i)*step;
  float rest_length = length(coord*inv_cloth_size);
  vec3 v2 = (p2- p2_last)/dt;
  vec3 deltaP = pos - p2;  
  vec3 deltaV = vel - v2;   
  float dist = length(deltaP);
  float   leftTerm = -ks * (dist-rest_length);
  float  rightTerm = kd * (dot(deltaV, deltaP)/dist);    
  vec3 springForce = (leftTerm + rightTerm)* normalize(deltaP);
  F +=  springForce;  
}

From the net force, the acceleration is first obtained and then the new position is obtained using Verlet integration. Finally, the collision with the ground plane is determined by looking at the Y value. We end the shader by outputting the output attributes (out_position and out_prev_position), which are then stored into the buffer objects bound as the transform feedback buffers:

  vec3 acc = vec3(0);
  if(m!=0)
     acc = F/m;  
  vec3 tmp = pos; 
  pos = pos * 2.0 - pos_old + acc* dt * dt;
  pos_old = tmp; 
  pos.y=max(0, pos.y);  
  out_position_mass = vec4(pos, m);  
  out_prev_position = vec4(pos_old,m);        
  gl_Position = MVP*vec4(pos, 1);

The shader, along with the transform feedback mechanism, proceeds to deform all of the cloth vertices and in the end, we get the cloth vertices deformed.

There's more…

The demo application implementing this recipe shows the piece of cloth falling under gravity. Several frames from the deformation are shown in the following figure. Using the left mouse button, we can pick the cloth vertices and move them around.

In this recipe we only output to a single stream. We can attach more than one stream and store results in separate buffer objects. In addition, we can have several transform feedback objects and we can pause/resume them as required.

There's more…

The demo application implementing this recipe shows the piece of cloth falling under gravity. Several frames from the deformation are shown in the following figure. Using the left mouse button, we can pick the cloth vertices and move them around.

In this recipe we only output to a single stream. We can attach more than one stream and store results in separate buffer objects. In addition, we can have several transform feedback objects and we can pause/resume them as required.

Implementing collision detection and response on a transform feedback-based cloth model

In this recipe, we will build on top of the previous recipe and add collision detection and response to the cloth model.

Getting ready

The code for this recipe is contained in the Chapter8/TransformFeedbackClothCollision directory. For this recipe, the setup code and rendering code remains the same as in the previous recipe. The only change is the addition of the ellipsoid/sphere collision code.

How to do it…

Let us start this recipe by following these simple steps:

Generate the geometry and topology for a piece of cloth by creating a set of points and their connectivity. Bind this data to a buffer object as in the previous recipe.
Set up a pair of vertex array objects and buffer objects as in the previous recipe. Also attach buffer textures for easier access to the buffer object memory in the vertex shader.

Generate a transform feedback object and pass the attribute names that will be output from our deformation vertex shader. Make sure to relink the program again:

  glGenTransformFeedbacks(1, &tfID);
  glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, tfID); 
  const char* varying_names[]={"out_position_mass", "out_prev_position"};  
  glTransformFeedbackVaryings(massSpringShader.GetProgram(), 2, varying_names, GL_SEPARATE_ATTRIBS);    
  glLinkProgram(massSpringShader.GetProgram());

Generate an ellipsoid object by using a simple 4×4 matrix. Also store the inverse of the ellipsoid's transform. The location of the ellipsoid is stored by the translate matrix, the orientation by the rotate matrix, and the non-uniform scaling by the scale matrix as follows. When applied, the matrices work in the opposite order. The non-uniform scaling causes the sphere to compress in the Z direction first. Then, the rotation orients the ellipsoid such that it is rotated by 45 degrees on the X axis. Finally, the ellipsoid is shifted by 2 units on the Y axis:
```
  ellipsoid = glm::translate(glm::mat4(1),glm::vec3(0,2,0));
  ellipsoid = glm::rotate(ellipsoid, 45.0f ,glm::vec3(1,0,0));
  ellipsoid = glm::scale(ellipsoid,   
   glm::vec3(fRadius,fRadius,fRadius/2));
  inverse_ellipsoid = glm::inverse(ellipsoid);
```
In the rendering function, bind the cloth deformation shader (Chapter8/ TransformFeedbackClothCollision/shaders/Spring.vert) and then run a loop. In each iteration, bind the texture buffers, and then bind the update vertex array object. At the same time, bind the previous buffer objects as the transform feedback buffers. Do the ping pong strategy as in the previous recipe.

After the loop is terminated, bind the render VAO and render the cloth:

   glBindVertexArray(vaoRenderID[writeID]);
   glDisable(GL_DEPTH_TEST);
   renderShader.Use();
   glUniformMatrix4fv(renderShader("MVP"), 1, GL_FALSE,    
   glm::value_ptr(mMVP));
   glDrawElements(GL_TRIANGLES, indices.size(), 
   GL_UNSIGNED_SHORT,0);
   renderShader.UnUse();
   glEnable(GL_DEPTH_TEST);
   if(bDisplayMasses) {
      particleShader.Use();
      glUniform1i(particleShader("selected_index"),   
      selected_index);
      glUniformMatrix4fv(particleShader("MV"), 1, GL_FALSE,    
      glm::value_ptr(mMV));  
      glUniformMatrix4fv(particleShader("MVP"), 1, GL_FALSE, 
      glm::value_ptr(mMVP));
      glDrawArrays(GL_POINTS, 0, total_points);    
      particleShader.UnUse(); }
   glBindVertexArray( 0);

In the vertex shader, obtain the current and previous position of the cloth vertex. If the vertex is a pinned vertex, set its mass to 0 so it would not be simulated, otherwise, add an external force based on gravity. Next, loop through all neighbors of the current vertex by looking up the texture buffer and estimate the internal force:

   float m = position_mass.w;
   vec3 pos = position_mass.xyz;
   vec3 pos_old = prev_position.xyz;  
   vec3 vel = (pos - pos_old) / dt;
   float ks=0, kd=0;
   int index = gl_VertexID;
   int ix = index % texsize_x;
   int iy = index / texsize_x;
   if(index ==0 || index == (texsize_x-1))   
      m = 0;
   vec3 F = gravity*m + (DEFAULT_DAMPING*vel);
   for(int k=0;k<12;k++) {
      ivec2 coord = getNextNeighbor(k, ks, kd);
      int j = coord.x;
      int i = coord.y;    
      if (((iy + i) < 0) || ((iy + i) > (texsize_y-1)))
         continue;
      if (((ix + j) < 0) || ((ix + j) > (texsize_x-1)))
         continue;
      int index_neigh = (iy + i) * texsize_x + ix + j;
      vec3 p2 = texelFetchBuffer(tex_position_mass,   
      index_neigh).xyz;
      vec3 p2_last = texelFetchBuffer(tex_prev_position_mass,  
      index_neigh).xyz;
      vec2 coord_neigh = vec2(ix + j, iy + i)*step;
      float rest_length = length(coord*inv_cloth_size);
      vec3 v2 = (p2- p2_last)/dt;
      vec3 deltaP = pos - p2;  
      vec3 deltaV = vel - v2;   
      float dist = length(deltaP);
      float   leftTerm = -ks * (dist-rest_length);
      float  rightTerm = kd * (dot(deltaV, deltaP)/dist);    
      vec3 springForce = (leftTerm + rightTerm)* 
      normalize(deltaP);
      F +=  springForce;  
   }

Using the combined force, calculate the acceleration and then estimate the new position using Verlet integration. Output the appropriate attribute from the shader:

   vec3 acc = vec3(0);
  if(m!=0)
     acc = F/m; 
  vec3 tmp = pos; 
  pos = pos * 2.0 - pos_old + acc* dt * dt;
  pos_old = tmp; 
  pos.y=max(0, pos.y);

After applying the floor collision, check for collision with an ellipsoid. If there is a collision, modify the position such that the collision is resolved. Finally, output the appropriate attributes from the vertex shader.

  vec4 x0 = inv_ellipsoid*vec4(pos,1); 
  vec3 delta0 = x0.xyz-ellipsoid.xyz;
  float dist2 = dot(delta0, delta0);
  if(dist2<1) {  
    delta0 = (ellipsoid.w - dist2) * delta0 / dist2; 
    vec3 delta;
    vec3 transformInv = vec3(ellipsoid_xform[0].x, 
    ellipsoid_xform[1].x, 
    ellipsoid_xform[2].x);
    transformInv /= dot(transformInv, transformInv);
    delta.x = dot(delta0, transformInv);
    transformInv = vec3(ellipsoid_xform[0].y, 
    ellipsoid_xform[1].y, 
    ellipsoid_xform[2].y);
    transformInv /= dot(transformInv, transformInv);
    delta.y = dot(delta0, transformInv);
    transformInv = vec3(ellipsoid_xform[0].z,     
    ellipsoid_xform[1].z, ellipsoid_xform[2].z);
    transformInv /= dot(transformInv, transformInv);
    delta.z = dot(delta0, transformInv); 
    pos +=  delta ; 
    pos_old = pos;  
  }
  out_position_mass = vec4(pos, m);  
  out_prev_position = vec4(pos_old,m);        
  gl_Position = MVP*vec4(pos, 1);

How it works…

The cloth deformation vertex shader has some additional lines of code to enable collision detection and response. For detection of collision with a plane, we can simply put the current position in the plane equation to find the distance of the current vertex from the plane. If it is less than 0, we have passed through the plane, in which case, we can move the vertex back in the plane's normal direction.

void planeCollision(inout vec3 x,  vec4 plane) {
   float dist = dot(plane.xyz,x)+ plane.w;
   if(dist<0) {
     x += plane.xyz*-dist;
   }
}

Simple geometric primitive, like spheres and ellipsoids, are trivial to handle. In case of collision with the sphere, we check the distance of the current position from the center of the sphere. If this distance is less than the sphere's radius, we have a collision. Once we have a collision, we push the position in the normal direction based on the amount of penetration.

void sphereCollision(inout vec3 x, vec4 sphere)
{
   vec3 delta = x - sphere.xyz;
   float dist = length(delta);
   if (dist < sphere.w) {
       x = sphere.xyz + delta*(sphere.w / dist);
   }
}

Tip

Note that in the preceding calculation, we can avoid the square root altogether by comparing against the squared distance. This can provide significant performance gain when a large number of vertices are there.

For an arbitrarily oriented ellipsoid, we first move the point into the ellipsoid's object space by multiplying with the inverse of the ellipsoid's transform. In this space, the ellipsoid is a unit sphere, hence we can then determine collision by simply looking at the distance between the current vertex and the ellipsoid. If it is less than 1, we have a collision. In this case, we then transform the point to the ellipsoids world space to find the penetration depth. This is then used to displace the current position out in the normal direction.

vec4 x0 = inv_ellipsoid*vec4(pos,1); 
vec3 delta0 = x0.xyz-ellipsoid.xyz;
float dist2 = dot(delta0, delta0);
if(dist2<1) {  
delta0 = (ellipsoid.w - dist2) * delta0 / dist2;
vec3 delta;
vec3 transformInv = vec3(ellipsoid_xform[0].x, ellipsoid_xform[1].x, ellipsoid_xform[2].x);
transformInv /= dot(transformInv, transformInv);
delta.x = dot(delta0, transformInv);
transformInv = vec3(ellipsoid_xform[0].y, ellipsoid_xform[1].y, ellipsoid_xform[2].y);
transformInv /= dot(transformInv, transformInv);
delta.y = dot(delta0, transformInv);
transformInv = vec3(ellipsoid_xform[0].z, ellipsoid_xform[1].z, ellipsoid_xform[2].z);
transformInv /= dot(transformInv, transformInv);
delta.z = dot(delta0, transformInv); 
pos +=  delta ; 
pos_old = pos;  
}

There's more…

The demo application implementing this recipe renders a piece of cloth fixed at two points and is allowed to fall under gravity. In addition, there is an oriented ellipsoid with which the cloth collides as shown in the following figure:

Although we have touched upon basic collision primitives, like spheres, oriented ellipsoids, and plane, more complex primitives can be implemented with the combination of these basic primitives. In addition, polygonal primitives can also be implemented. We leave that as an exercise for the reader.

How to do it…

Let us start this recipe by following these simple steps:

Generate the geometry and topology for a piece of cloth by creating a set of points and their connectivity. Bind this data to a buffer object as in the previous recipe.
Set up a pair of vertex array objects and buffer objects as in the previous recipe. Also attach buffer textures for easier access to the buffer object memory in the vertex shader.

Generate a transform feedback object and pass the attribute names that will be output from our deformation vertex shader. Make sure to relink the program again:

  glGenTransformFeedbacks(1, &tfID);
  glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, tfID); 
  const char* varying_names[]={"out_position_mass", "out_prev_position"};  
  glTransformFeedbackVaryings(massSpringShader.GetProgram(), 2, varying_names, GL_SEPARATE_ATTRIBS);    
  glLinkProgram(massSpringShader.GetProgram());

Generate an ellipsoid object by using a simple 4×4 matrix. Also store the inverse of the ellipsoid's transform. The location of the ellipsoid is stored by the translate matrix, the orientation by the rotate matrix, and the non-uniform scaling by the scale matrix as follows. When applied, the matrices work in the opposite order. The non-uniform scaling causes the sphere to compress in the Z direction first. Then, the rotation orients the ellipsoid such that it is rotated by 45 degrees on the X axis. Finally, the ellipsoid is shifted by 2 units on the Y axis:
```
  ellipsoid = glm::translate(glm::mat4(1),glm::vec3(0,2,0));
  ellipsoid = glm::rotate(ellipsoid, 45.0f ,glm::vec3(1,0,0));
  ellipsoid = glm::scale(ellipsoid,   
   glm::vec3(fRadius,fRadius,fRadius/2));
  inverse_ellipsoid = glm::inverse(ellipsoid);
```
In the rendering function, bind the cloth deformation shader (Chapter8/ TransformFeedbackClothCollision/shaders/Spring.vert) and then run a loop. In each iteration, bind the texture buffers, and then bind the update vertex array object. At the same time, bind the previous buffer objects as the transform feedback buffers. Do the ping pong strategy as in the previous recipe.

After the loop is terminated, bind the render VAO and render the cloth:

   glBindVertexArray(vaoRenderID[writeID]);
   glDisable(GL_DEPTH_TEST);
   renderShader.Use();
   glUniformMatrix4fv(renderShader("MVP"), 1, GL_FALSE,    
   glm::value_ptr(mMVP));
   glDrawElements(GL_TRIANGLES, indices.size(), 
   GL_UNSIGNED_SHORT,0);
   renderShader.UnUse();
   glEnable(GL_DEPTH_TEST);
   if(bDisplayMasses) {
      particleShader.Use();
      glUniform1i(particleShader("selected_index"),   
      selected_index);
      glUniformMatrix4fv(particleShader("MV"), 1, GL_FALSE,    
      glm::value_ptr(mMV));  
      glUniformMatrix4fv(particleShader("MVP"), 1, GL_FALSE, 
      glm::value_ptr(mMVP));
      glDrawArrays(GL_POINTS, 0, total_points);    
      particleShader.UnUse(); }
   glBindVertexArray( 0);

In the vertex shader, obtain the current and previous position of the cloth vertex. If the vertex is a pinned vertex, set its mass to 0 so it would not be simulated, otherwise, add an external force based on gravity. Next, loop through all neighbors of the current vertex by looking up the texture buffer and estimate the internal force:

   float m = position_mass.w;
   vec3 pos = position_mass.xyz;
   vec3 pos_old = prev_position.xyz;  
   vec3 vel = (pos - pos_old) / dt;
   float ks=0, kd=0;
   int index = gl_VertexID;
   int ix = index % texsize_x;
   int iy = index / texsize_x;
   if(index ==0 || index == (texsize_x-1))   
      m = 0;
   vec3 F = gravity*m + (DEFAULT_DAMPING*vel);
   for(int k=0;k<12;k++) {
      ivec2 coord = getNextNeighbor(k, ks, kd);
      int j = coord.x;
      int i = coord.y;    
      if (((iy + i) < 0) || ((iy + i) > (texsize_y-1)))
         continue;
      if (((ix + j) < 0) || ((ix + j) > (texsize_x-1)))
         continue;
      int index_neigh = (iy + i) * texsize_x + ix + j;
      vec3 p2 = texelFetchBuffer(tex_position_mass,   
      index_neigh).xyz;
      vec3 p2_last = texelFetchBuffer(tex_prev_position_mass,  
      index_neigh).xyz;
      vec2 coord_neigh = vec2(ix + j, iy + i)*step;
      float rest_length = length(coord*inv_cloth_size);
      vec3 v2 = (p2- p2_last)/dt;
      vec3 deltaP = pos - p2;  
      vec3 deltaV = vel - v2;   
      float dist = length(deltaP);
      float   leftTerm = -ks * (dist-rest_length);
      float  rightTerm = kd * (dot(deltaV, deltaP)/dist);    
      vec3 springForce = (leftTerm + rightTerm)* 
      normalize(deltaP);
      F +=  springForce;  
   }

Using the combined force, calculate the acceleration and then estimate the new position using Verlet integration. Output the appropriate attribute from the shader:

   vec3 acc = vec3(0);
  if(m!=0)
     acc = F/m; 
  vec3 tmp = pos; 
  pos = pos * 2.0 - pos_old + acc* dt * dt;
  pos_old = tmp; 
  pos.y=max(0, pos.y);

After applying the floor collision, check for collision with an ellipsoid. If there is a collision, modify the position such that the collision is resolved. Finally, output the appropriate attributes from the vertex shader.

  vec4 x0 = inv_ellipsoid*vec4(pos,1); 
  vec3 delta0 = x0.xyz-ellipsoid.xyz;
  float dist2 = dot(delta0, delta0);
  if(dist2<1) {  
    delta0 = (ellipsoid.w - dist2) * delta0 / dist2; 
    vec3 delta;
    vec3 transformInv = vec3(ellipsoid_xform[0].x, 
    ellipsoid_xform[1].x, 
    ellipsoid_xform[2].x);
    transformInv /= dot(transformInv, transformInv);
    delta.x = dot(delta0, transformInv);
    transformInv = vec3(ellipsoid_xform[0].y, 
    ellipsoid_xform[1].y, 
    ellipsoid_xform[2].y);
    transformInv /= dot(transformInv, transformInv);
    delta.y = dot(delta0, transformInv);
    transformInv = vec3(ellipsoid_xform[0].z,     
    ellipsoid_xform[1].z, ellipsoid_xform[2].z);
    transformInv /= dot(transformInv, transformInv);
    delta.z = dot(delta0, transformInv); 
    pos +=  delta ; 
    pos_old = pos;  
  }
  out_position_mass = vec4(pos, m);  
  out_prev_position = vec4(pos_old,m);        
  gl_Position = MVP*vec4(pos, 1);

How it works…

The cloth deformation vertex shader has some additional lines of code to enable collision detection and response. For detection of collision with a plane, we can simply put the current position in the plane equation to find the distance of the current vertex from the plane. If it is less than 0, we have passed through the plane, in which case, we can move the vertex back in the plane's normal direction.

void planeCollision(inout vec3 x,  vec4 plane) {
   float dist = dot(plane.xyz,x)+ plane.w;
   if(dist<0) {
     x += plane.xyz*-dist;
   }
}

Simple geometric primitive, like spheres and ellipsoids, are trivial to handle. In case of collision with the sphere, we check the distance of the current position from the center of the sphere. If this distance is less than the sphere's radius, we have a collision. Once we have a collision, we push the position in the normal direction based on the amount of penetration.

void sphereCollision(inout vec3 x, vec4 sphere)
{
   vec3 delta = x - sphere.xyz;
   float dist = length(delta);
   if (dist < sphere.w) {
       x = sphere.xyz + delta*(sphere.w / dist);
   }
}

Tip

Note that in the preceding calculation, we can avoid the square root altogether by comparing against the squared distance. This can provide significant performance gain when a large number of vertices are there.

For an arbitrarily oriented ellipsoid, we first move the point into the ellipsoid's object space by multiplying with the inverse of the ellipsoid's transform. In this space, the ellipsoid is a unit sphere, hence we can then determine collision by simply looking at the distance between the current vertex and the ellipsoid. If it is less than 1, we have a collision. In this case, we then transform the point to the ellipsoids world space to find the penetration depth. This is then used to displace the current position out in the normal direction.

vec4 x0 = inv_ellipsoid*vec4(pos,1); 
vec3 delta0 = x0.xyz-ellipsoid.xyz;
float dist2 = dot(delta0, delta0);
if(dist2<1) {  
delta0 = (ellipsoid.w - dist2) * delta0 / dist2;
vec3 delta;
vec3 transformInv = vec3(ellipsoid_xform[0].x, ellipsoid_xform[1].x, ellipsoid_xform[2].x);
transformInv /= dot(transformInv, transformInv);
delta.x = dot(delta0, transformInv);
transformInv = vec3(ellipsoid_xform[0].y, ellipsoid_xform[1].y, ellipsoid_xform[2].y);
transformInv /= dot(transformInv, transformInv);
delta.y = dot(delta0, transformInv);
transformInv = vec3(ellipsoid_xform[0].z, ellipsoid_xform[1].z, ellipsoid_xform[2].z);
transformInv /= dot(transformInv, transformInv);
delta.z = dot(delta0, transformInv); 
pos +=  delta ; 
pos_old = pos;  
}

There's more…

The demo application implementing this recipe renders a piece of cloth fixed at two points and is allowed to fall under gravity. In addition, there is an oriented ellipsoid with which the cloth collides as shown in the following figure:

Although we have touched upon basic collision primitives, like spheres, oriented ellipsoids, and plane, more complex primitives can be implemented with the combination of these basic primitives. In addition, polygonal primitives can also be implemented. We leave that as an exercise for the reader.

How it works…

The cloth deformation vertex shader has some additional lines of code to enable collision detection and response. For detection of collision with a plane, we can simply put the current position in the plane equation to find the distance of the current vertex from the plane. If it is less than 0, we have passed through the plane, in which case, we can move the vertex back in the plane's normal direction.

void planeCollision(inout vec3 x,  vec4 plane) {
   float dist = dot(plane.xyz,x)+ plane.w;
   if(dist<0) {
     x += plane.xyz*-dist;
   }
}

Simple geometric primitive, like spheres and ellipsoids, are trivial to handle. In case of collision with the sphere, we check the distance of the current position from the center of the sphere. If this distance is less than the sphere's radius, we have a collision. Once we have a collision, we push the position in the normal direction based on the amount of penetration.

void sphereCollision(inout vec3 x, vec4 sphere)
{
   vec3 delta = x - sphere.xyz;
   float dist = length(delta);
   if (dist < sphere.w) {
       x = sphere.xyz + delta*(sphere.w / dist);
   }
}

Tip

Note that in the preceding calculation, we can avoid the square root altogether by comparing against the squared distance. This can provide significant performance gain when a large number of vertices are there.

For an arbitrarily oriented ellipsoid, we first move the point into the ellipsoid's object space by multiplying with the inverse of the ellipsoid's transform. In this space, the ellipsoid is a unit sphere, hence we can then determine collision by simply looking at the distance between the current vertex and the ellipsoid. If it is less than 1, we have a collision. In this case, we then transform the point to the ellipsoids world space to find the penetration depth. This is then used to displace the current position out in the normal direction.

vec4 x0 = inv_ellipsoid*vec4(pos,1); 
vec3 delta0 = x0.xyz-ellipsoid.xyz;
float dist2 = dot(delta0, delta0);
if(dist2<1) {  
delta0 = (ellipsoid.w - dist2) * delta0 / dist2;
vec3 delta;
vec3 transformInv = vec3(ellipsoid_xform[0].x, ellipsoid_xform[1].x, ellipsoid_xform[2].x);
transformInv /= dot(transformInv, transformInv);
delta.x = dot(delta0, transformInv);
transformInv = vec3(ellipsoid_xform[0].y, ellipsoid_xform[1].y, ellipsoid_xform[2].y);
transformInv /= dot(transformInv, transformInv);
delta.y = dot(delta0, transformInv);
transformInv = vec3(ellipsoid_xform[0].z, ellipsoid_xform[1].z, ellipsoid_xform[2].z);
transformInv /= dot(transformInv, transformInv);
delta.z = dot(delta0, transformInv); 
pos +=  delta ; 
pos_old = pos;  
}

There's more…

The demo application implementing this recipe renders a piece of cloth fixed at two points and is allowed to fall under gravity. In addition, there is an oriented ellipsoid with which the cloth collides as shown in the following figure:

Although we have touched upon basic collision primitives, like spheres, oriented ellipsoids, and plane, more complex primitives can be implemented with the combination of these basic primitives. In addition, polygonal primitives can also be implemented. We leave that as an exercise for the reader.

There's more…

The demo application implementing this recipe renders a piece of cloth fixed at two points and is allowed to fall under gravity. In addition, there is an oriented ellipsoid with which the cloth collides as shown in the following figure:

Although we have touched upon basic collision primitives, like spheres, oriented ellipsoids, and plane, more complex primitives can be implemented with the combination of these basic primitives. In addition, polygonal primitives can also be implemented. We leave that as an exercise for the reader.

Implementing a particle system using transform feedback

In this recipe, we will implement a simple particle system using the transform feedback mechanism. In this mode, the GPU bypasses the rasterizer and, later, the programmable graphics pipeline stages to feedback result to the vertex shader. The benefit from this mode is that using this feature, we can implement a physically-based simulation entirely on the GPU.

Getting ready

The code for this recipe is contained in the Chapter8/TransformFeedbackParticles directory.

How to do it…

Let us start this recipe by following these simple steps:

Set up two vertex array pairs: one for update and another for rendering. Bind two vertex buffer objects to each of the pairs, as was done in the previous two recipes. Here, the buffer objects will store the per-particle properties. Also, enable the corresponding vertex attributes:

  glGenVertexArrays(2, vaoUpdateID);
  glGenVertexArrays(2, vaoRenderID);
  glGenBuffers( 2, vboID_Pos);
  glGenBuffers( 2, vboID_PrePos);  
  glGenBuffers( 2, vboID_Direction);
  for(int i=0;i<2;i++) {
    glBindVertexArray(vaoUpdateID[i]);
    glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
    glBufferData( GL_ARRAY_BUFFER, TOTAL_PARTICLES* 
      sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);  
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
    glBindBuffer( GL_ARRAY_BUFFER, vboID_PrePos[i]);
    glBufferData( GL_ARRAY_BUFFER, TOTAL_PARTICLES* 
    sizeof(glm::vec4), 0, GL_DYNAMIC_COPY); 
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1,  4, GL_FLOAT, GL_FALSE, 0,0);    
    glBindBuffer( GL_ARRAY_BUFFER, vboID_Direction[i]);
    glBufferData( GL_ARRAY_BUFFER, TOTAL_PARTICLES* 
    sizeof(glm::vec4), 0, GL_DYNAMIC_COPY); 
    glEnableVertexAttribArray(2);
   glVertexAttribPointer(2,  4, GL_FLOAT, GL_FALSE, 0,0);     
   }
   for(int i=0;i<2;i++) {
      glBindVertexArray(vaoRenderID[i]);
      glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
      glEnableVertexAttribArray(0);
      glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
   }

Generate a transform feedback object and bind it. Next, specify the output attributes from the shader that would be stored in the transform feedback buffer. After this step, relink the shader program:

glGenTransformFeedbacks(1, &tfID);
glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, tfID); 
const char* varying_names[]={"out_position", 
"out_prev_position", "out_direction"};  
glTransformFeedbackVaryings(particleShader.GetProgram(), 3, 
varying_names, GL_SEPARATE_ATTRIBS);    
glLinkProgram(particleShader.GetProgram());

In the update function,
bind the particle vertex shader that will output to the transform feedback buffer and set the appropriate uniforms and update vertex array object. Note that to enable read/write access, we use a pair of vertex array objects such that we can read from one and write to another:
```
   particleShader.Use(); 
   glUniformMatrix4fv(particleShader("MVP"), 1, GL_FALSE,    
   glm::value_ptr(mMVP));    
   glUniform1f(particleShader("time"), t); 
   for(int i=0;i<NUM_ITER;i++) { 
      glBindVertexArray( vaoUpdateID[readID]);
```

Bind the vertex buffer objects that will store the outputs from the transform feedback step using the output attributes from the vertex shader:

glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, vboID_Pos[writeID]);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1, vboID_PrePos[writeID]);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 2, vboID_Direction[writeID]);

Disable the rasterizer to prevent the execution of the later stages of the pipeline and then begin the transform feedback. Next, issue a call to the glDrawArrays function to allow the vertices to be passed to the graphics pipeline. After this step, end the transform feedback and then enable the rasterizer. Note that to correctly determine the amount of execution time needed, we issue a hardware query. Next, we alternate the read/write paths by swapping the read and write IDs:
```
   glEnable(GL_RASTERIZER_DISCARD);   
   glBeginQuery(GL_TIME_ELAPSED,t_query); 
   glBeginTransformFeedback(GL_POINTS);               
   glDrawArrays(GL_POINTS, 0, TOTAL_PARTICLES);              
   glEndTransformFeedback();
   glEndQuery(GL_TIME_ELAPSED); 
   glFlush();
   glDisable(GL_RASTERIZER_DISCARD);
   int tmp = readID;
   readID=writeID;
   writeID = tmp;
```

Render the particles using the render shader. First, bind the render vertex array object and then draw the points using the glDrawArrays function:

   glBindVertexArray(vaoRenderID[readID]);
   renderShader.Use();   
   glUniformMatrix4fv(renderShader("MVP"), 1, GL_FALSE,  
   glm::value_ptr(mMVP));             
   glDrawArrays(GL_POINTS, 0, TOTAL_PARTICLES);            
   renderShader.UnUse();  
   glBindVertexArray(0);

In the particle vertex shader, check if the particle's life is greater than 0. If so, move the particle and reduce the life. Otherwise, calculate a new random direction and reset the life to spawn the particle from the origin. Refer to Chapter8/TransformFeedbackParticles/shaders/Particle.vert for details. After this step, output the appropriate values to the output attributes. The particle vertex shader is defined as follows:

#version 330 core
precision highp float;
#extension EXT_gpu_shader4 : require
layout( location = 0 )  in vec4 position;            
layout( location = 1 )  in vec4 prev_position;       
layout( location = 2 )  in vec4 direction; 
uniform mat4 MVP;  
uniform float time; 
const float PI = 3.14159;
const float TWO_PI = 2*PI;
const float PI_BY_2 = PI*0.5;
const float PI_BY_4 = PI_BY_2*0.5;
 
//shader outputs
out vec4 out_position;
out  vec4 out_prev_position;
out vec4 out_direction;

const float DAMPING_COEFFICIENT =  0.9995;
const vec3 emitterForce = vec3(0.0f,-0.001f, 0.0f);
const vec4 collidor = vec4(0,1,0,0);
const vec3 emitterPos = vec3(0);

float emitterYaw = (0.0f);
float emitterYawVar  = TWO_PI;
float emitterPitch  = PI_BY_2;
float emitterPitchVar = PI_BY_4;
float emitterSpeed = 0.05f;
float emitterSpeedVar = 0.01f;
 
int    emitterLife = 60;
int   emitterLifeVar = 15;
 
const float UINT_MAX = 4294967295.0;

void main() {
      vec3 prevPos = prev_position.xyz;
      int life = int(prev_position.w);
      vec3 pos  = position.xyz;
      float speed = position.w;
      vec3 dir = direction.xyz; 
      if(life > 0) {
         prevPos = pos;
         pos += dir*speed;
         if(dot(pos+emitterPos, collidor.xyz)+ collidor.w <0) {
         dir = reflect(dir, collidor.xyz);
          speed *= DAMPING_COEFFICIENT;
        }  
      dir += emitterForce;
      life--;
}  else {
uint seed =   uint(time + gl_VertexID); 
life = emitterLife + int(randhashf(seed++, 
emitterLifeVar));    
float yaw = emitterYaw + (randhashf(seed++,  
emitterYawVar ));
float pitch = emitterPitch + randhashf(seed++,    
emitterPitchVar);
RotationToDirection(pitch, yaw, dir);     
float nspeed = emitterSpeed + (randhashf(seed++,    
emitterSpeedVar ));
dir *= nspeed;  
pos = emitterPos;
prevPos = emitterPos; 
speed = 1; 
} 
out_position = vec4(pos, speed);  
out_prev_position = vec4(prevPos, life);    
out_direction = vec4(dir, 0);
gl_Position = MVP*vec4(pos, 1);
}

The three helper functions randhash, randhashf, and RotationToDirection are defined as follows:

uint randhash(uint seed) {
    uint i=(seed^12345391u)*2654435769u;
    i^=(i<<6u)^(i>>26u);
    i*=2654435769u;
    i+=(i<<5u)^(i>>12u);
    return i;
}

float randhashf(uint seed, float b) {
    return float(b * randhash(seed)) / UINT_MAX;
}

void RotationToDirection(float pitch, float yaw, 
                         out vec3 direction) {
  direction.x = -sin(yaw) * cos(pitch);
  direction.y = sin(pitch);
  direction.z = cos(pitch) * cos(yaw);
}

How it works…

The transform feedback mechanism allows us to feedback one or more attributes from the vertex shader or geometry shader back to a buffer object. This feedback path could be used for implementing a physically-based simulation. This recipe uses this mechanism to output the particle position after each iteration. After each step, the buffers are swapped and, therefore, it can simulate the particle motion.

To make the particle system, we first set up three pairs of vertex buffer objects that store the per-particle attributes that we input to the vertex shader. These include the particle's position, previous position, life, direction, and speed. These are stored into separate buffer objects for convenience. We could have stored all of these attributes into a single interleaved buffer object. Since we output to the buffer object from our shader, we specify the buffer object usage as GL_DYNAMIC_COPY. Similarly, we set up a separate vertex array object for rendering the particles:

for(int i=0;i<2;i++) {
   glBindVertexArray(vaoUpdateID[i]);
   glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
   glBufferData( GL_ARRAY_BUFFER, TOTAL_PARTICLES*  
   sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);    
   glEnableVertexAttribArray(0);
   glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
   glBindBuffer( GL_ARRAY_BUFFER, vboID_PrePos[i]);
   glBufferData( GL_ARRAY_BUFFER,   
   TOTAL_PARTICLES*sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);  
   glEnableVertexAttribArray(1);
   glVertexAttribPointer(1,  4, GL_FLOAT, GL_FALSE, 0,0);  
   glBindBuffer( GL_ARRAY_BUFFER, vboID_Direction[i]);
   glBufferData( GL_ARRAY_BUFFER, 
   TOTAL_PARTICLES*sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);  
   glEnableVertexAttribArray(2);
   glVertexAttribPointer(2,  4, GL_FLOAT, GL_FALSE, 0,0);
}
for(int i=0;i<2;i++) {
   glBindVertexArray(vaoRenderID[i]);
   glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
   glEnableVertexAttribArray(0);
   glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
}

Next, we specify the shader output attributes that we would like to connect to the transform feedback buffers. We use three outputs, namely out_position, out_prev_position, and out_direction, which output the particle's current position, particle's previous position, and the particle's direction along with the particle's speed, current, and initial life, respectively. We specify that we would connect these to separate buffer objects:

glGenTransformFeedbacks(1, &tfID);
glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, tfID); 
const char* varying_names[]={"out_position", "out_prev_position", "out_direction"};  
glTransformFeedbackVaryings(particleShader.GetProgram(), 3, varying_names, GL_SEPARATE_ATTRIBS);    
glLinkProgram(particleShader.GetProgram());

One last step is the actual initialization of the transform feedback. We do so by first binding the particle vertex shader. Then, we pass the appropriate uniforms to the shader, which includes the combined modelview projection (MVP) matrix and the time (t):

particleShader.Use();
   glUniformMatrix4fv(particleShader("MVP"),1,GL_FALSE,
                      glm::value_ptr(mMVP));
   glUniform1f(particleShader("time"), t);

We then run a loop for the number of iterations desired. In the loop, we first bind the update vertex array object and assign the appropriate transform feedback buffer base indices:

for(int i=0;i<NUM_ITER;i++) {
   glBindVertexArray( vaoUpdateID[readID]);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0,     
   vboID_Pos[writeID]);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1, 
   vboID_PrePos[writeID]);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 2,  
   vboID_Direction[writeID]);

We then disable the rasterizer and begin the transform feedback. During this, we issue a glDrawArrays call to pass the vertices to the vertex shader. Next, we end the transform feedback and then restore the rasterizer. Finally, we swap the read/write pathways. In between, to estimate the amount of time needed for this process, we issue an OpenGL hardware timer query (GL_TIME_ELAPSED). This returns the total time in nanoseconds:

glEnable(GL_RASTERIZER_DISCARD);    // disable rasterization
  glBeginQuery(GL_TIME_ELAPSED,t_query);
    glBeginTransformFeedback(GL_POINTS);
      glDrawArrays(GL_POINTS, 0, TOTAL_PARTICLES);
    glEndTransformFeedback();
  glEndQuery(GL_TIME_ELAPSED);
  glFlush();
  glDisable(GL_RASTERIZER_DISCARD);
  int tmp = readID;
  readID=writeID;
  writeID = tmp;
}
// get the query result
glGetQueryObjectui64v(t_query, GL_QUERY_RESULT, &elapsed_time);
delta_time = elapsed_time / 1000000.0f; 
particleShader.UnUse();

The main work of particle simulation takes place in the vertex shader (Chapter8/TransformFeedbackParticles/shaders/Particle.vert). After storing the initial attributes, the particle's life is checked. If the value is greater than 0, we update the position of the particle using the current direction of the particle and its speed. Next, we then check the particle for collision with the colliding plane. If there is a collision, we deflect the particle using the reflect GLSL function, passing it the particle's current direction of motion and the normal of the collidor. We also reduce the speed on collision. We then increase the direction of the particle using the emitter's force. We then reduce the life of the particle:

if(life > 0) {
   prevPos = pos;
   pos += dir*speed; 
   if(dot(pos+emitterPos, collidor.xyz)+ collidor.w <0) {       
      dir = reflect(dir, collidor.xyz);
      speed *= DAMPING_COEFFICIENT;
   }  
   dir += emitterForce;
   life--;
}

If the life is less than 0, we reset the particle's direction of motion to a new random direction. We reset the life to a random value based on the maximum allowed value. The current and previous positions of the particle are reset to the emitter origin and finally, the speed is reset to the default value. We then output the output attributes:

else {
   uint seed =   uint(time + gl_VertexID); 
   life = emitterLife + int(randhashf(seed++, emitterLifeVar));    
   float yaw = emitterYaw + (randhashf(seed++, emitterYawVar ));
   float pitch=emitterPitch+randhashf(seed++,   emitterPitchVar);
   RotationToDirection(pitch, yaw, dir);     
   float nspeed = emitterSpeed + (randhashf(seed++, 
   emitterSpeedVar ));
   dir *= nspeed;  
   pos = emitterPos;
   prevPos = emitterPos; 
   speed = 1;
}  
out_position = vec4(pos, speed);  
out_prev_position = vec4(prevPos, life);    
out_direction = vec4(dir, 0);
gl_Position = MVP*vec4(pos, 1);

There's more…

The demo application for this recipe generates a simple particle system running entirely on the GPU using the transform feedback mechanism coupled with a vertex shader that writes to output attributes bound as transform feedback buffers. Running the demo application gives us the output as shown in the following figure:

Note that for this demo, we render the particles as points of size 10 units. We could easily change the rendering mode to point sprites with size modified in the vertex shader to give the particles a different look. Also, we can also change the colors and blending modes to achieve various effects. In addition, we could achieve the same result by using one vertex buffer pair with interleaved attributes or two separate transform feedback objects. All of these variants should be straightforward to implement by following the guidelines laid out in this recipe.

We had already looked at a simple approach of simulating GPU-based particle systems using vertex shader in Chapter 5, Mesh Model Formats and Particle Systems, we will now detail pros and cons of each. In Chapter 5 we presented a stateless particle system since all of the attributes (that is, position and velocity) were generated on the fly using the vertex ID, time, and basic kinematic equations on each particle vertex.

As the state of the particle is not stored, we cannot reproduce the same simulation every frame. Hence, collision detection and response are problematic, as we do not have any information of the previous state of the particle, which is often required for collision response. On the contrary, the particle simulation technique presented in this recipe uses a state-preserving particle system. We stored current and previous positions of each particle in buffer objects. In addition, we used transform feedback and a vertex shader for particle simulation on the GPU. As the state of the particle is stored, we can carry out collision detection and response easily.

How to do it…

Let us start this recipe by following these simple steps:

Set up two vertex array pairs: one for update and another for rendering. Bind two vertex buffer objects to each of the pairs, as was done in the previous two recipes. Here, the buffer objects will store the per-particle properties. Also, enable the corresponding vertex attributes:

  glGenVertexArrays(2, vaoUpdateID);
  glGenVertexArrays(2, vaoRenderID);
  glGenBuffers( 2, vboID_Pos);
  glGenBuffers( 2, vboID_PrePos);  
  glGenBuffers( 2, vboID_Direction);
  for(int i=0;i<2;i++) {
    glBindVertexArray(vaoUpdateID[i]);
    glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
    glBufferData( GL_ARRAY_BUFFER, TOTAL_PARTICLES* 
      sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);  
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
    glBindBuffer( GL_ARRAY_BUFFER, vboID_PrePos[i]);
    glBufferData( GL_ARRAY_BUFFER, TOTAL_PARTICLES* 
    sizeof(glm::vec4), 0, GL_DYNAMIC_COPY); 
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1,  4, GL_FLOAT, GL_FALSE, 0,0);    
    glBindBuffer( GL_ARRAY_BUFFER, vboID_Direction[i]);
    glBufferData( GL_ARRAY_BUFFER, TOTAL_PARTICLES* 
    sizeof(glm::vec4), 0, GL_DYNAMIC_COPY); 
    glEnableVertexAttribArray(2);
   glVertexAttribPointer(2,  4, GL_FLOAT, GL_FALSE, 0,0);     
   }
   for(int i=0;i<2;i++) {
      glBindVertexArray(vaoRenderID[i]);
      glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
      glEnableVertexAttribArray(0);
      glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
   }

Generate a transform feedback object and bind it. Next, specify the output attributes from the shader that would be stored in the transform feedback buffer. After this step, relink the shader program:

glGenTransformFeedbacks(1, &tfID);
glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, tfID); 
const char* varying_names[]={"out_position", 
"out_prev_position", "out_direction"};  
glTransformFeedbackVaryings(particleShader.GetProgram(), 3, 
varying_names, GL_SEPARATE_ATTRIBS);    
glLinkProgram(particleShader.GetProgram());

In the update function,
bind the particle vertex shader that will output to the transform feedback buffer and set the appropriate uniforms and update vertex array object. Note that to enable read/write access, we use a pair of vertex array objects such that we can read from one and write to another:
```
   particleShader.Use(); 
   glUniformMatrix4fv(particleShader("MVP"), 1, GL_FALSE,    
   glm::value_ptr(mMVP));    
   glUniform1f(particleShader("time"), t); 
   for(int i=0;i<NUM_ITER;i++) { 
      glBindVertexArray( vaoUpdateID[readID]);
```

Bind the vertex buffer objects that will store the outputs from the transform feedback step using the output attributes from the vertex shader:

glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, vboID_Pos[writeID]);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1, vboID_PrePos[writeID]);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 2, vboID_Direction[writeID]);

Disable the rasterizer to prevent the execution of the later stages of the pipeline and then begin the transform feedback. Next, issue a call to the glDrawArrays function to allow the vertices to be passed to the graphics pipeline. After this step, end the transform feedback and then enable the rasterizer. Note that to correctly determine the amount of execution time needed, we issue a hardware query. Next, we alternate the read/write paths by swapping the read and write IDs:
```
   glEnable(GL_RASTERIZER_DISCARD);   
   glBeginQuery(GL_TIME_ELAPSED,t_query); 
   glBeginTransformFeedback(GL_POINTS);               
   glDrawArrays(GL_POINTS, 0, TOTAL_PARTICLES);              
   glEndTransformFeedback();
   glEndQuery(GL_TIME_ELAPSED); 
   glFlush();
   glDisable(GL_RASTERIZER_DISCARD);
   int tmp = readID;
   readID=writeID;
   writeID = tmp;
```

Render the particles using the render shader. First, bind the render vertex array object and then draw the points using the glDrawArrays function:

   glBindVertexArray(vaoRenderID[readID]);
   renderShader.Use();   
   glUniformMatrix4fv(renderShader("MVP"), 1, GL_FALSE,  
   glm::value_ptr(mMVP));             
   glDrawArrays(GL_POINTS, 0, TOTAL_PARTICLES);            
   renderShader.UnUse();  
   glBindVertexArray(0);

In the particle vertex shader, check if the particle's life is greater than 0. If so, move the particle and reduce the life. Otherwise, calculate a new random direction and reset the life to spawn the particle from the origin. Refer to Chapter8/TransformFeedbackParticles/shaders/Particle.vert for details. After this step, output the appropriate values to the output attributes. The particle vertex shader is defined as follows:

#version 330 core
precision highp float;
#extension EXT_gpu_shader4 : require
layout( location = 0 )  in vec4 position;            
layout( location = 1 )  in vec4 prev_position;       
layout( location = 2 )  in vec4 direction; 
uniform mat4 MVP;  
uniform float time; 
const float PI = 3.14159;
const float TWO_PI = 2*PI;
const float PI_BY_2 = PI*0.5;
const float PI_BY_4 = PI_BY_2*0.5;
 
//shader outputs
out vec4 out_position;
out  vec4 out_prev_position;
out vec4 out_direction;

const float DAMPING_COEFFICIENT =  0.9995;
const vec3 emitterForce = vec3(0.0f,-0.001f, 0.0f);
const vec4 collidor = vec4(0,1,0,0);
const vec3 emitterPos = vec3(0);

float emitterYaw = (0.0f);
float emitterYawVar  = TWO_PI;
float emitterPitch  = PI_BY_2;
float emitterPitchVar = PI_BY_4;
float emitterSpeed = 0.05f;
float emitterSpeedVar = 0.01f;
 
int    emitterLife = 60;
int   emitterLifeVar = 15;
 
const float UINT_MAX = 4294967295.0;

void main() {
      vec3 prevPos = prev_position.xyz;
      int life = int(prev_position.w);
      vec3 pos  = position.xyz;
      float speed = position.w;
      vec3 dir = direction.xyz; 
      if(life > 0) {
         prevPos = pos;
         pos += dir*speed;
         if(dot(pos+emitterPos, collidor.xyz)+ collidor.w <0) {
         dir = reflect(dir, collidor.xyz);
          speed *= DAMPING_COEFFICIENT;
        }  
      dir += emitterForce;
      life--;
}  else {
uint seed =   uint(time + gl_VertexID); 
life = emitterLife + int(randhashf(seed++, 
emitterLifeVar));    
float yaw = emitterYaw + (randhashf(seed++,  
emitterYawVar ));
float pitch = emitterPitch + randhashf(seed++,    
emitterPitchVar);
RotationToDirection(pitch, yaw, dir);     
float nspeed = emitterSpeed + (randhashf(seed++,    
emitterSpeedVar ));
dir *= nspeed;  
pos = emitterPos;
prevPos = emitterPos; 
speed = 1; 
} 
out_position = vec4(pos, speed);  
out_prev_position = vec4(prevPos, life);    
out_direction = vec4(dir, 0);
gl_Position = MVP*vec4(pos, 1);
}

The three helper functions randhash, randhashf, and RotationToDirection are defined as follows:

uint randhash(uint seed) {
    uint i=(seed^12345391u)*2654435769u;
    i^=(i<<6u)^(i>>26u);
    i*=2654435769u;
    i+=(i<<5u)^(i>>12u);
    return i;
}

float randhashf(uint seed, float b) {
    return float(b * randhash(seed)) / UINT_MAX;
}

void RotationToDirection(float pitch, float yaw, 
                         out vec3 direction) {
  direction.x = -sin(yaw) * cos(pitch);
  direction.y = sin(pitch);
  direction.z = cos(pitch) * cos(yaw);
}

How it works…

The transform feedback mechanism allows us to feedback one or more attributes from the vertex shader or geometry shader back to a buffer object. This feedback path could be used for implementing a physically-based simulation. This recipe uses this mechanism to output the particle position after each iteration. After each step, the buffers are swapped and, therefore, it can simulate the particle motion.

To make the particle system, we first set up three pairs of vertex buffer objects that store the per-particle attributes that we input to the vertex shader. These include the particle's position, previous position, life, direction, and speed. These are stored into separate buffer objects for convenience. We could have stored all of these attributes into a single interleaved buffer object. Since we output to the buffer object from our shader, we specify the buffer object usage as GL_DYNAMIC_COPY. Similarly, we set up a separate vertex array object for rendering the particles:

for(int i=0;i<2;i++) {
   glBindVertexArray(vaoUpdateID[i]);
   glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
   glBufferData( GL_ARRAY_BUFFER, TOTAL_PARTICLES*  
   sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);    
   glEnableVertexAttribArray(0);
   glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
   glBindBuffer( GL_ARRAY_BUFFER, vboID_PrePos[i]);
   glBufferData( GL_ARRAY_BUFFER,   
   TOTAL_PARTICLES*sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);  
   glEnableVertexAttribArray(1);
   glVertexAttribPointer(1,  4, GL_FLOAT, GL_FALSE, 0,0);  
   glBindBuffer( GL_ARRAY_BUFFER, vboID_Direction[i]);
   glBufferData( GL_ARRAY_BUFFER, 
   TOTAL_PARTICLES*sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);  
   glEnableVertexAttribArray(2);
   glVertexAttribPointer(2,  4, GL_FLOAT, GL_FALSE, 0,0);
}
for(int i=0;i<2;i++) {
   glBindVertexArray(vaoRenderID[i]);
   glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
   glEnableVertexAttribArray(0);
   glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
}

Next, we specify the shader output attributes that we would like to connect to the transform feedback buffers. We use three outputs, namely out_position, out_prev_position, and out_direction, which output the particle's current position, particle's previous position, and the particle's direction along with the particle's speed, current, and initial life, respectively. We specify that we would connect these to separate buffer objects:

glGenTransformFeedbacks(1, &tfID);
glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, tfID); 
const char* varying_names[]={"out_position", "out_prev_position", "out_direction"};  
glTransformFeedbackVaryings(particleShader.GetProgram(), 3, varying_names, GL_SEPARATE_ATTRIBS);    
glLinkProgram(particleShader.GetProgram());

One last step is the actual initialization of the transform feedback. We do so by first binding the particle vertex shader. Then, we pass the appropriate uniforms to the shader, which includes the combined modelview projection (MVP) matrix and the time (t):

particleShader.Use();
   glUniformMatrix4fv(particleShader("MVP"),1,GL_FALSE,
                      glm::value_ptr(mMVP));
   glUniform1f(particleShader("time"), t);

We then run a loop for the number of iterations desired. In the loop, we first bind the update vertex array object and assign the appropriate transform feedback buffer base indices:

for(int i=0;i<NUM_ITER;i++) {
   glBindVertexArray( vaoUpdateID[readID]);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0,     
   vboID_Pos[writeID]);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1, 
   vboID_PrePos[writeID]);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 2,  
   vboID_Direction[writeID]);

We then disable the rasterizer and begin the transform feedback. During this, we issue a glDrawArrays call to pass the vertices to the vertex shader. Next, we end the transform feedback and then restore the rasterizer. Finally, we swap the read/write pathways. In between, to estimate the amount of time needed for this process, we issue an OpenGL hardware timer query (GL_TIME_ELAPSED). This returns the total time in nanoseconds:

glEnable(GL_RASTERIZER_DISCARD);    // disable rasterization
  glBeginQuery(GL_TIME_ELAPSED,t_query);
    glBeginTransformFeedback(GL_POINTS);
      glDrawArrays(GL_POINTS, 0, TOTAL_PARTICLES);
    glEndTransformFeedback();
  glEndQuery(GL_TIME_ELAPSED);
  glFlush();
  glDisable(GL_RASTERIZER_DISCARD);
  int tmp = readID;
  readID=writeID;
  writeID = tmp;
}
// get the query result
glGetQueryObjectui64v(t_query, GL_QUERY_RESULT, &elapsed_time);
delta_time = elapsed_time / 1000000.0f; 
particleShader.UnUse();

The main work of particle simulation takes place in the vertex shader (Chapter8/TransformFeedbackParticles/shaders/Particle.vert). After storing the initial attributes, the particle's life is checked. If the value is greater than 0, we update the position of the particle using the current direction of the particle and its speed. Next, we then check the particle for collision with the colliding plane. If there is a collision, we deflect the particle using the reflect GLSL function, passing it the particle's current direction of motion and the normal of the collidor. We also reduce the speed on collision. We then increase the direction of the particle using the emitter's force. We then reduce the life of the particle:

if(life > 0) {
   prevPos = pos;
   pos += dir*speed; 
   if(dot(pos+emitterPos, collidor.xyz)+ collidor.w <0) {       
      dir = reflect(dir, collidor.xyz);
      speed *= DAMPING_COEFFICIENT;
   }  
   dir += emitterForce;
   life--;
}

If the life is less than 0, we reset the particle's direction of motion to a new random direction. We reset the life to a random value based on the maximum allowed value. The current and previous positions of the particle are reset to the emitter origin and finally, the speed is reset to the default value. We then output the output attributes:

else {
   uint seed =   uint(time + gl_VertexID); 
   life = emitterLife + int(randhashf(seed++, emitterLifeVar));    
   float yaw = emitterYaw + (randhashf(seed++, emitterYawVar ));
   float pitch=emitterPitch+randhashf(seed++,   emitterPitchVar);
   RotationToDirection(pitch, yaw, dir);     
   float nspeed = emitterSpeed + (randhashf(seed++, 
   emitterSpeedVar ));
   dir *= nspeed;  
   pos = emitterPos;
   prevPos = emitterPos; 
   speed = 1;
}  
out_position = vec4(pos, speed);  
out_prev_position = vec4(prevPos, life);    
out_direction = vec4(dir, 0);
gl_Position = MVP*vec4(pos, 1);

There's more…

The demo application for this recipe generates a simple particle system running entirely on the GPU using the transform feedback mechanism coupled with a vertex shader that writes to output attributes bound as transform feedback buffers. Running the demo application gives us the output as shown in the following figure:

Note that for this demo, we render the particles as points of size 10 units. We could easily change the rendering mode to point sprites with size modified in the vertex shader to give the particles a different look. Also, we can also change the colors and blending modes to achieve various effects. In addition, we could achieve the same result by using one vertex buffer pair with interleaved attributes or two separate transform feedback objects. All of these variants should be straightforward to implement by following the guidelines laid out in this recipe.

We had already looked at a simple approach of simulating GPU-based particle systems using vertex shader in Chapter 5, Mesh Model Formats and Particle Systems, we will now detail pros and cons of each. In Chapter 5 we presented a stateless particle system since all of the attributes (that is, position and velocity) were generated on the fly using the vertex ID, time, and basic kinematic equations on each particle vertex.

As the state of the particle is not stored, we cannot reproduce the same simulation every frame. Hence, collision detection and response are problematic, as we do not have any information of the previous state of the particle, which is often required for collision response. On the contrary, the particle simulation technique presented in this recipe uses a state-preserving particle system. We stored current and previous positions of each particle in buffer objects. In addition, we used transform feedback and a vertex shader for particle simulation on the GPU. As the state of the particle is stored, we can carry out collision detection and response easily.

How it works…

The transform feedback mechanism allows us to feedback one or more attributes from the vertex shader or geometry shader back to a buffer object. This feedback path could be used for implementing a physically-based simulation. This recipe uses this mechanism to output the particle position after each iteration. After each step, the buffers are swapped and, therefore, it can simulate the particle motion.

To make the particle system, we first set up three pairs of vertex buffer objects that store the per-particle attributes that we input to the vertex shader. These include the particle's position, previous position, life, direction, and speed. These are stored into separate buffer objects for convenience. We could have stored all of these attributes into a single interleaved buffer object. Since we output to the buffer object from our shader, we specify the buffer object usage as GL_DYNAMIC_COPY. Similarly, we set up a separate vertex array object for rendering the particles:

for(int i=0;i<2;i++) {
   glBindVertexArray(vaoUpdateID[i]);
   glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
   glBufferData( GL_ARRAY_BUFFER, TOTAL_PARTICLES*  
   sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);    
   glEnableVertexAttribArray(0);
   glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
   glBindBuffer( GL_ARRAY_BUFFER, vboID_PrePos[i]);
   glBufferData( GL_ARRAY_BUFFER,   
   TOTAL_PARTICLES*sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);  
   glEnableVertexAttribArray(1);
   glVertexAttribPointer(1,  4, GL_FLOAT, GL_FALSE, 0,0);  
   glBindBuffer( GL_ARRAY_BUFFER, vboID_Direction[i]);
   glBufferData( GL_ARRAY_BUFFER, 
   TOTAL_PARTICLES*sizeof(glm::vec4), 0, GL_DYNAMIC_COPY);  
   glEnableVertexAttribArray(2);
   glVertexAttribPointer(2,  4, GL_FLOAT, GL_FALSE, 0,0);
}
for(int i=0;i<2;i++) {
   glBindVertexArray(vaoRenderID[i]);
   glBindBuffer( GL_ARRAY_BUFFER, vboID_Pos[i]);
   glEnableVertexAttribArray(0);
   glVertexAttribPointer(0,  4, GL_FLOAT, GL_FALSE, 0, 0);
}

Next, we specify the shader output attributes that we would like to connect to the transform feedback buffers. We use three outputs, namely out_position, out_prev_position, and out_direction, which output the particle's current position, particle's previous position, and the particle's direction along with the particle's speed, current, and initial life, respectively. We specify that we would connect these to separate buffer objects:

glGenTransformFeedbacks(1, &tfID);
glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, tfID); 
const char* varying_names[]={"out_position", "out_prev_position", "out_direction"};  
glTransformFeedbackVaryings(particleShader.GetProgram(), 3, varying_names, GL_SEPARATE_ATTRIBS);    
glLinkProgram(particleShader.GetProgram());

One last step is the actual initialization of the transform feedback. We do so by first binding the particle vertex shader. Then, we pass the appropriate uniforms to the shader, which includes the combined modelview projection (MVP) matrix and the time (t):

particleShader.Use();
   glUniformMatrix4fv(particleShader("MVP"),1,GL_FALSE,
                      glm::value_ptr(mMVP));
   glUniform1f(particleShader("time"), t);

We then run a loop for the number of iterations desired. In the loop, we first bind the update vertex array object and assign the appropriate transform feedback buffer base indices:

for(int i=0;i<NUM_ITER;i++) {
   glBindVertexArray( vaoUpdateID[readID]);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0,     
   vboID_Pos[writeID]);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1, 
   vboID_PrePos[writeID]);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 2,  
   vboID_Direction[writeID]);

We then disable the rasterizer and begin the transform feedback. During this, we issue a glDrawArrays call to pass the vertices to the vertex shader. Next, we end the transform feedback and then restore the rasterizer. Finally, we swap the read/write pathways. In between, to estimate the amount of time needed for this process, we issue an OpenGL hardware timer query (GL_TIME_ELAPSED). This returns the total time in nanoseconds:

glEnable(GL_RASTERIZER_DISCARD);    // disable rasterization
  glBeginQuery(GL_TIME_ELAPSED,t_query);
    glBeginTransformFeedback(GL_POINTS);
      glDrawArrays(GL_POINTS, 0, TOTAL_PARTICLES);
    glEndTransformFeedback();
  glEndQuery(GL_TIME_ELAPSED);
  glFlush();
  glDisable(GL_RASTERIZER_DISCARD);
  int tmp = readID;
  readID=writeID;
  writeID = tmp;
}
// get the query result
glGetQueryObjectui64v(t_query, GL_QUERY_RESULT, &elapsed_time);
delta_time = elapsed_time / 1000000.0f; 
particleShader.UnUse();

The main work of particle simulation takes place in the vertex shader (Chapter8/TransformFeedbackParticles/shaders/Particle.vert). After storing the initial attributes, the particle's life is checked. If the value is greater than 0, we update the position of the particle using the current direction of the particle and its speed. Next, we then check the particle for collision with the colliding plane. If there is a collision, we deflect the particle using the reflect GLSL function, passing it the particle's current direction of motion and the normal of the collidor. We also reduce the speed on collision. We then increase the direction of the particle using the emitter's force. We then reduce the life of the particle:

if(life > 0) {
   prevPos = pos;
   pos += dir*speed; 
   if(dot(pos+emitterPos, collidor.xyz)+ collidor.w <0) {       
      dir = reflect(dir, collidor.xyz);
      speed *= DAMPING_COEFFICIENT;
   }  
   dir += emitterForce;
   life--;
}

If the life is less than 0, we reset the particle's direction of motion to a new random direction. We reset the life to a random value based on the maximum allowed value. The current and previous positions of the particle are reset to the emitter origin and finally, the speed is reset to the default value. We then output the output attributes:

else {
   uint seed =   uint(time + gl_VertexID); 
   life = emitterLife + int(randhashf(seed++, emitterLifeVar));    
   float yaw = emitterYaw + (randhashf(seed++, emitterYawVar ));
   float pitch=emitterPitch+randhashf(seed++,   emitterPitchVar);
   RotationToDirection(pitch, yaw, dir);     
   float nspeed = emitterSpeed + (randhashf(seed++, 
   emitterSpeedVar ));
   dir *= nspeed;  
   pos = emitterPos;
   prevPos = emitterPos; 
   speed = 1;
}  
out_position = vec4(pos, speed);  
out_prev_position = vec4(prevPos, life);    
out_direction = vec4(dir, 0);
gl_Position = MVP*vec4(pos, 1);

There's more…

The demo application for this recipe generates a simple particle system running entirely on the GPU using the transform feedback mechanism coupled with a vertex shader that writes to output attributes bound as transform feedback buffers. Running the demo application gives us the output as shown in the following figure:

Note that for this demo, we render the particles as points of size 10 units. We could easily change the rendering mode to point sprites with size modified in the vertex shader to give the particles a different look. Also, we can also change the colors and blending modes to achieve various effects. In addition, we could achieve the same result by using one vertex buffer pair with interleaved attributes or two separate transform feedback objects. All of these variants should be straightforward to implement by following the guidelines laid out in this recipe.

We had already looked at a simple approach of simulating GPU-based particle systems using vertex shader in Chapter 5, Mesh Model Formats and Particle Systems, we will now detail pros and cons of each. In Chapter 5 we presented a stateless particle system since all of the attributes (that is, position and velocity) were generated on the fly using the vertex ID, time, and basic kinematic equations on each particle vertex.

As the state of the particle is not stored, we cannot reproduce the same simulation every frame. Hence, collision detection and response are problematic, as we do not have any information of the previous state of the particle, which is often required for collision response. On the contrary, the particle simulation technique presented in this recipe uses a state-preserving particle system. We stored current and previous positions of each particle in buffer objects. In addition, we used transform feedback and a vertex shader for particle simulation on the GPU. As the state of the particle is stored, we can carry out collision detection and response easily.

There's more…

The demo application for this recipe generates a simple particle system running entirely on the GPU using the transform feedback mechanism coupled with a vertex shader that writes to output attributes bound as transform feedback buffers. Running the demo application gives us the output as shown in the following figure:

Note that for this demo, we render the particles as points of size 10 units. We could easily change the rendering mode to point sprites with size modified in the vertex shader to give the particles a different look. Also, we can also change the colors and blending modes to achieve various effects. In addition, we could achieve the same result by using one vertex buffer pair with interleaved attributes or two separate transform feedback objects. All of these variants should be straightforward to implement by following the guidelines laid out in this recipe.

We had already looked at a simple approach of simulating GPU-based particle systems using vertex shader in Chapter 5, Mesh Model Formats and Particle Systems, we will now detail pros and cons of each. In Chapter 5 we presented a stateless particle system since all of the attributes (that is, position and velocity) were generated on the fly using the vertex ID, time, and basic kinematic equations on each particle vertex.

As the state of the particle is not stored, we cannot reproduce the same simulation every frame. Hence, collision detection and response are problematic, as we do not have any information of the previous state of the particle, which is often required for collision response. On the contrary, the particle simulation technique presented in this recipe uses a state-preserving particle system. We stored current and previous positions of each particle in buffer objects. In addition, we used transform feedback and a vertex shader for particle simulation on the GPU. As the state of the particle is stored, we can carry out collision detection and response easily.

You're reading from OpenGL ??? Build high performance graphics Assimilate the ideas shared in the course to utilize the power of OpenGL to perform a wide variety of tasks.

Table of Contents (5) Chapters

Chapter 8. Skeletal and Physically-based Simulation on the GPU

Introduction

Implementing skeletal animation using matrix palette skinning

Getting ready

How to do it…

How it works…

Tip

Tip

Tip

There's more…

See also

How to do it…

How it works…

Tip

Tip

Tip

There's more…

See also

How it works…

Tip

Tip

Tip

There's more…

See also

Tip

Tip

Tip

There's more…

See also

See also

Implementing skeletal animation using dual quaternion skinning

Getting ready

How to do it…

How it works…

There's more…

See also

How to do it…

How it works…

There's more…

See also

How it works…

There's more…

See also

There's more…

See also

See also

Modeling cloth using transform feedback

Tip

Getting ready

How to do it…

How it works…

There's more…

See also

How to do it…

How it works…

There's more…

See also

How it works…

There's more…

See also

There's more…

See also

See also

Implementing collision detection and response on a transform feedback-based cloth model

Getting ready

How to do it…

How it works…

Tip

There's more…

See also

How to do it…

How it works…

Tip

There's more…

See also

How it works…

Tip

There's more…