Texture memory is another read-only memory that can accelerate a program and reduce memory traffic when data is read in certain access patterns. Like constant memory, it is cached on chip. It was originally designed for rendering graphics, but it can also be used for general-purpose computing. It is most effective when an application's memory accesses exhibit a great deal of spatial locality, meaning that each thread is likely to read from locations close to those read by neighboring threads. This suits image processing applications that work on 4-point or 8-point connectivity, where each pixel is processed together with its adjacent pixels. A two-dimensional, spatially local access pattern across threads may look something like this:
Thread 0 | Thread 2 |
Thread 1 | Thread 3 |
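To make the access pattern concrete, here is a minimal sketch in which each thread reads the four 4-connected neighbors of its own pixel through a 2D texture. It assumes the texture object API (`cudaTextureObject_t`), which may differ from the texture setup used elsewhere in this text. The names `sum4Neighbours`, `runTextureDemo`, `WIDTH`, and `HEIGHT` are hypothetical and chosen only for illustration.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical image size used only for this sketch.
const int WIDTH = 16;
const int HEIGHT = 16;

// Each thread sums the 4-connected neighbours of its own pixel.
// Threads that are adjacent in the block read adjacent texels.
__global__ void sum4Neighbours(cudaTextureObject_t tex, float *out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // With unnormalized coordinates, +0.5f addresses the texel centre.
    float cx = x + 0.5f, cy = y + 0.5f;
    out[y * width + x] = tex2D<float>(tex, cx - 1.0f, cy) + tex2D<float>(tex, cx + 1.0f, cy)
                       + tex2D<float>(tex, cx, cy - 1.0f) + tex2D<float>(tex, cx, cy + 1.0f);
}

static int clampInt(int v, int lo, int hi) { return v < lo ? lo : (v > hi ? hi : v); }

// Returns the number of mismatches against a CPU reference, or -1 on a CUDA error.
int runTextureDemo()
{
    float h_in[WIDTH * HEIGHT], h_out[WIDTH * HEIGHT];
    for (int i = 0; i < WIDTH * HEIGHT; ++i) h_in[i] = (float)i;

    // Textures are backed here by a CUDA array, which uses a 2D-friendly layout.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t d_array;
    if (cudaMallocArray(&d_array, &desc, WIDTH, HEIGHT) != cudaSuccess) return -1;
    cudaMemcpy2DToArray(d_array, 0, 0, h_in, WIDTH * sizeof(float),
                        WIDTH * sizeof(float), HEIGHT, cudaMemcpyHostToDevice);

    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = d_array;

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.addressMode[0] = cudaAddressModeClamp;  // out-of-range reads clamp to the border
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;       // no interpolation
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    if (cudaCreateTextureObject(&tex, &resDesc, &texDesc, NULL) != cudaSuccess) {
        cudaFreeArray(d_array);
        return -1;
    }

    float *d_out = NULL;
    cudaMalloc(&d_out, sizeof(h_out));
    dim3 block(8, 8);
    dim3 grid((WIDTH + block.x - 1) / block.x, (HEIGHT + block.y - 1) / block.y);
    sum4Neighbours<<<grid, block>>>(tex, d_out, WIDTH, HEIGHT);
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(d_array);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    // CPU reference with the same clamp-to-border behaviour.
    int mismatches = 0;
    for (int y = 0; y < HEIGHT; ++y) {
        for (int x = 0; x < WIDTH; ++x) {
            int xl = clampInt(x - 1, 0, WIDTH - 1), xr = clampInt(x + 1, 0, WIDTH - 1);
            int yu = clampInt(y - 1, 0, HEIGHT - 1), yd = clampInt(y + 1, 0, HEIGHT - 1);
            float ref = h_in[y * WIDTH + xl] + h_in[y * WIDTH + xr]
                      + h_in[yu * WIDTH + x] + h_in[yd * WIDTH + x];
            if (h_out[y * WIDTH + x] != ref) ++mismatches;
        }
    }
    return mismatches;
}

int main()
{
    int result = runTextureDemo();
    if (result < 0) printf("CUDA error\n");
    else printf("Mismatches: %d\n", result);
    return result == 0 ? 0 : 1;
}
```

Clamp addressing is used so that border pixels need no special-case branches in the kernel. This is a design choice for the sketch, not a requirement of texture memory.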
General global memory cache will not be able to capture...