Hands-On GPU Programming with Python and CUDA

Explore high-performance parallel computing with CUDA

Product type Paperback
Published in Nov 2018
Publisher Packt
ISBN-13 9781788993913
Length 310 pages
Edition 1st Edition
Author (1):
Dr. Brian Tuomanen

Table of Contents (15 chapters)

Preface
1. Why GPU Programming?
2. Setting Up Your GPU Programming Environment
3. Getting Started with PyCUDA
4. Kernels, Threads, Blocks, and Grids
5. Streams, Events, Contexts, and Concurrency
6. Debugging and Profiling Your CUDA Code
7. Using the CUDA Libraries with Scikit-CUDA
8. The CUDA Device Function Libraries and Thrust
9. Implementation of a Deep Neural Network
10. Working with Compiled GPU Code
11. Performance Optimization in CUDA
12. Where to Go from Here
13. Assessment
14. Other Books You May Enjoy

Why GPU Programming?

Besides rendering graphics for video games, graphics processing units (GPUs) give the general consumer a readily accessible means of doing massively parallel computing. An average person can now buy a $2,000 modern GPU card from a local electronics store, plug it into their PC at home, and use it almost immediately for computational power that, only 5 or 10 years ago, would have been available only in the supercomputing labs of top corporations and universities. This accessibility has become apparent in recent years, as even a brief look at the news shows: cryptocurrency miners use GPUs to generate digital money such as Bitcoin; geneticists and biologists use GPUs for DNA analysis and research; physicists and mathematicians use GPUs for large-scale simulations; AI researchers program GPUs to write plays and compose music; and major internet companies, such as Google and Facebook, use farms of GPU servers for large-scale machine learning tasks. The list goes on and on.

This book is primarily aimed at bringing you up to speed with GPU programming, so that you too may begin using the power of GPUs as soon as possible, no matter what your end goal is. We aim to cover the core essentials of how to program a GPU, rather than provide intricate technical details and schematics of how a GPU works. Toward the end of the book, we will provide further resources so that you can specialize and apply your new knowledge of GPUs. (Details of the required technical knowledge and hardware follow this section.)

In this book, we will be working with CUDA, a framework for general-purpose GPU (GPGPU) programming from NVIDIA, which was first released back in 2007. While CUDA is proprietary to NVIDIA GPUs, it is a mature and stable platform that is relatively easy to use, provides an unmatched set of first-party accelerated mathematical and AI-related libraries, and comes with minimal hassle when it comes to installation and integration. Moreover, there are readily available and standardized Python libraries, such as PyCUDA and Scikit-CUDA, which make GPGPU programming all the more accessible to aspiring GPU programmers. For these reasons, we are opting to go with CUDA for this book.

CUDA is always pronounced coo-duh, and never spelled out as C-U-D-A! CUDA originally stood for Compute Unified Device Architecture, but NVIDIA has dropped the acronym and now uses CUDA as a proper name written in all caps.

We will now start our journey into GPU programming with an overview of Amdahl's Law. Amdahl's Law is a simple but effective method to estimate potential speed gains we can get by offloading a program or algorithm onto a GPU; this will help us determine whether it's worth our effort to rewrite our code to make use of the GPU. We will then go over a brief review of how to profile our Python code with the cProfile module, to help us find the bottlenecks in our code.
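
As a rough preview of the kind of estimate Amdahl's Law gives, here is a minimal Python sketch. It assumes the standard textbook form of the law, speedup = 1 / ((1 - p) + p / N), where p is the fraction of the running time that can be parallelized and N is the number of processors (or GPU cores). The function name `amdahl_speedup` and the sample numbers are illustrative choices, not taken from the book.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Estimate the speedup predicted by Amdahl's Law.

    parallel_fraction: share of the original running time that can be
        parallelized, between 0.0 and 1.0 (hypothetical input that you
        would estimate by profiling your own program).
    n_processors: number of processors or cores the parallel part runs on.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unchanged; the parallel part is divided across cores.
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


if __name__ == "__main__":
    # Illustrative case: 75% of the running time is parallelizable.
    for cores in (1, 8, 64, 640):
        print(cores, "cores:", round(amdahl_speedup(0.75, cores), 2))
```

Note that with 75% of the work parallelizable, the estimate can never exceed 4x no matter how many cores are added, because the remaining serial 25% still has to run. This is why it is worth checking the estimate before rewriting code for the GPU.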

The learning outcomes for this chapter are as follows:

  • Understand Amdahl's Law
  • Apply Amdahl's Law in the context of your code
  • Use the cProfile module for basic profiling of Python code
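
To give a sense of what basic profiling with cProfile looks like, the following is a minimal sketch that uses only the Python standard library (`cProfile`, `pstats`, and `io`). The functions `slow_sum_of_squares` and `profile_call` are hypothetical names made up for this illustration; the chapter itself may present profiling differently.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """A deliberately plain loop that stands in for a bottleneck."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    # Collect the report in a string instead of printing it directly.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and show only the top five entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_call(slow_sum_of_squares, 200_000)
    print("result:", result)
    print(report)
```

A whole script can also be profiled from the command line with `python -m cProfile -s cumtime your_script.py`, where `your_script.py` is a placeholder for your own file. The functions with the largest cumulative time are the natural candidates to examine when estimating how much of a program could be parallelized.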