Summary
Now we have learned to write programs with Accelerate that run using the interpreter, and to compile and run them on CUDA-enabled GPUs. We know that Accelerate uses a code generator of its own internally. We understand it's crucial to write code that can efficiently reuse cached CUDA kernels, because their compilation is very expensive. We also learned that tuples are a free abstraction in Accelerate, although GPUs themselves don't directly support tupled elements.
In the next chapter, we will dive into Cloud Haskell and distributed programming using Haskell. It turns out Haskell is a pretty well-suited language for programming distributed systems. Cloud Haskell is an effort that streamlines building distributed applications, providing an abstraction over the network layer, among other things.