In the last chapter, we saw how easy it is to install CUDA and write a program using it. The example was not impressive, but it showed how little is needed to get started with CUDA. In this chapter, we will build upon that foundation and write more advanced CUDA programs for GPUs. We start with a variable addition program and then incrementally build toward more complex vector manipulation examples in CUDA C. Along the way, we cover how the kernel works and how to use device properties in CUDA programs. The chapter also discusses how vectors are operated upon in CUDA programs, how CUDA can accelerate vector operations compared to CPU processing, and the terminology associated with CUDA programming.
The following topics will be covered in this chapter:
- The concept of the kernel call
- Creating kernel functions and passing parameters to them in CUDA
- Configuring kernel parameters and memory allocation for CUDA programs
- Thread execution in CUDA programs
- Accessing GPU device properties from CUDA programs
- Working with vectors in CUDA programs
- Parallel communication patterns