Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Arrow up icon
GO TO TOP
Mastering C++ Multithreading

You're reading from   Mastering C++ Multithreading Write robust, concurrent, and parallel applications

Arrow left icon
Product type Paperback
Published in Jul 2017
Publisher Packt
ISBN-13 9781787121706
Length 244 pages
Edition 1st Edition
Languages
Concepts
Arrow right icon
Author (1):
Arrow left icon
Maya Posch Maya Posch
Author Profile Icon Maya Posch
Maya Posch
Arrow right icon
View More author details
Toc

Table of Contents (11) Chapters Close

Preface 1. Revisiting Multithreading FREE CHAPTER 2. Multithreading Implementation on the Processor and OS 3. C++ Multithreading APIs 4. Thread Synchronization and Communication 5. Native C++ Threads and Primitives 6. Debugging Multithreaded Code 7. Best Practices 8. Atomic Operations - Working with the Hardware 9. Multithreading with Distributed Computing 10. Multithreading with GPGPU

GPU memory management


When using a CPU, one has to deal with a number of memory hierarchies, in the form of the main memory (slowest), to CPU caches (faster), and CPU registers (fastest). A GPU is much the same, in that, one has to deal with a memory hierarchy that can significantly impact the speed of one's applications.

Fastest on a GPU is also the register (or private) memory, of which we have quite a bit more than on the average CPU. After this, we get local memory, which is a memory shared by a number of processing elements. Slowest on the GPU itself is the memory data cache, also called texture memory. This is a memory on the card that is usually referred to as Video RAM (VRAM) and uses a high-bandwidth, but a relatively high-latency memory such as GDDR5.

The absolute slowest is using the host system's memory (system RAM), as this has to travel across the PCIe bus and through various other subsystems in order to transfer any data. Relative to on-device memory systems, host-device communication...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image