At this point it is necessary to introduce some basic concepts of the CUDA programming model. The first distinction is between host and device.
Host code is the part of the program executed on the CPU; the host side also comprises system memory (RAM) and storage.
Device code, by contrast, is loaded onto the graphics card and executed there. Another important concept is the kernel: a function that runs on the device but is launched from the host.
The code defined in the kernel will be performed in parallel by an array of threads. The following figure summarizes how the GPU programming model works:
- The running program contains source code to run on the CPU and code to run on the GPU
- The CPU and the GPU have separate memories
- Input data is copied from CPU memory to GPU memory before computation
- The output of the GPU computation is copied back to CPU memory
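The steps above can be sketched in CUDA. This is a minimal illustrative example, not taken from the text: `square` is a hypothetical kernel that squares each element of an array, and the block/thread counts are arbitrary choices.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: runs on the device; each thread processes one array element.
__global__ void square(float *d_data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d_data[i] = d_data[i] * d_data[i];
}

int main() {
    const int n = 256;
    float h_data[n];                       // host (CPU) memory
    for (int i = 0; i < n; ++i) h_data[i] = (float)i;

    float *d_data;                         // device (GPU) memory
    cudaMalloc(&d_data, n * sizeof(float));

    // 1. Copy input from host memory to device memory
    cudaMemcpy(d_data, h_data, n * sizeof(float), cudaMemcpyHostToDevice);

    // 2. Launch the kernel from the host: 4 blocks of 64 threads,
    //    so the code runs in parallel across an array of 256 threads
    square<<<4, 64>>>(d_data, n);

    // 3. Copy the result back from device memory to host memory
    cudaMemcpy(h_data, d_data, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_data);
    printf("h_data[3] = %f\n", h_data[3]);
    return 0;
}
```

Note how the host and device never share pointers directly: all data crosses between the two memories through explicit `cudaMemcpy` calls, which is exactly the transfer pattern listed above.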