By now, you should be familiar with the computational advantages of CUDA C/C++ from our earlier discussions. CUDA extends C/C++ so that you can modify selected parts of your source code to run on the GPU and speed up your computations. We will explore the primary steps for implementing CUDA code by working through a GPU program.
From this point onward, please type the code used in this book into your IDE manually. Copying and pasting directly from the PDF will break the code's indentation and leave it not ready to compile and run.
First, let's look at the following conventional C++ program, which multiplies the elements of two arrays in double precision. We'll run the kernel on 500 million elements on the CPU. All the elements of the p and q arrays are set to 24 and 12, respectively.
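Before the full program, a scaled-down sketch may help fix the shape of the computation. It is only an illustration under stated assumptions: the products are written to a separate array `r`, and the helper `multiply_filled` and its size parameter are hypothetical names introduced here, not taken from the book's listing.

```cpp
#include <cstddef>
#include <vector>

// Multiply p and q element by element on the CPU.
// Assumption of this sketch: each product is stored in a separate array r.
void multiply(std::size_t n, const double* p, const double* q, double* r) {
    for (std::size_t i = 0; i < n; ++i) {
        r[i] = p[i] * q[i];
    }
}

// Hypothetical helper: fill p with 24.0 and q with 12.0, as described above,
// multiply them, and return the products.
std::vector<double> multiply_filled(std::size_t n) {
    std::vector<double> p(n, 24.0);
    std::vector<double> q(n, 12.0);
    std::vector<double> r(n, 0.0);
    multiply(n, p.data(), q.data(), r.data());
    return r;
}
```

Calling `multiply_filled` with a modest size, such as one million, returns a vector in which every element is 288.0 (24 × 12). At the full 500 million elements, each array of doubles occupies about 4 GB, so the complete run needs a machine with enough memory for all the arrays.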
The following is the C++ program we've just described ...