In the last chapter, we looked at the memory architecture in CUDA and saw how it can be used efficiently to accelerate applications. Up until now, we have not seen a method to measure the performance of CUDA programs. In this chapter, we will discuss how to do that using CUDA events. We will also cover the NVIDIA Visual Profiler, how to resolve errors in CUDA programs both from within the CUDA code and with debugging tools, and how to improve the performance of CUDA programs. This chapter will describe how CUDA streams can be used for multitasking and how we can use them to accelerate applications. You will also learn how array-sorting algorithms can be accelerated using CUDA. Image processing is an application where we need to process a large amount of data in a very small amount of time, so CUDA can be an ideal choice...