In the last chapter, we looked at the memory architecture in CUDA and saw how it can be used efficiently to accelerate applications. So far, however, we have not seen a way to measure the performance of CUDA programs. In this chapter, we will discuss how to do that using CUDA events. We will also cover the Nvidia Visual Profiler, how to handle errors in CUDA programs both from within the CUDA code and with debugging tools, and how to improve the performance of CUDA programs. This chapter will then describe how CUDA streams can be used for multitasking and how they can accelerate applications. You will also learn how array-sorting algorithms can be accelerated using CUDA. Image processing is an application where a large amount of data must be processed in a very short time, so CUDA can be an ideal choice...