Using the Arrow C data interface
Back in Chapter 2, Working with Key Arrow Specifications, I mentioned the Arrow C data interface in regard to communicating data between Python and Spark processes. At that point, we didn't go into much detail about the interface or what it looks like; now, we will.
Because the Arrow project is fast-moving and evolving, it can sometimes be difficult for other projects to incorporate the Arrow libraries into their work. There may also be a lot of existing code that needs to be adapted to work with Arrow piecemeal, leaving you to create or even re-implement adapters for interchanging data. To avoid redundant effort across these situations, the Arrow project defines a very small, stable set of C definitions that can be copied into a project so that data can easily be passed across the boundaries of different languages and libraries. For languages and runtimes that aren't C or C++, it should still...
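To make the idea of a small, copyable set of C definitions concrete, here is a minimal sketch. The two structs and the flag macros are the ones defined by the Arrow C data interface specification. Everything else is an assumption made purely for illustration: the helper names export_int32_schema and export_int32_array and their release callbacks are hypothetical and are not part of any Arrow library. The sketch also assumes a non-nullable int32 array with at least one value.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Definitions copied from the Arrow C data interface specification. */
#ifndef ARROW_C_DATA_INTERFACE
#define ARROW_C_DATA_INTERFACE

#define ARROW_FLAG_DICTIONARY_ORDERED 1
#define ARROW_FLAG_NULLABLE 2
#define ARROW_FLAG_MAP_KEYS_SORTED 4

struct ArrowSchema {
  const char* format;    /* type description, e.g. "i" for int32 */
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);  /* release callback */
  void* private_data;                    /* opaque producer-specific data */
};

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);   /* release callback */
  void* private_data;                    /* opaque producer-specific data */
};

#endif  /* ARROW_C_DATA_INTERFACE */

/* Hypothetical callback: the strings are literals, so nothing is freed. */
static void release_int32_schema(struct ArrowSchema* schema) {
  schema->release = NULL;  /* marks the struct as released */
}

/* Hypothetical callback: frees the data buffer and the buffer list. */
static void release_int32_array(struct ArrowArray* array) {
  free((void*)array->buffers[1]);
  free((void*)array->buffers);
  array->release = NULL;  /* marks the struct as released */
}

/* Hypothetical helper: describes a non-nullable int32 column. */
void export_int32_schema(struct ArrowSchema* out) {
  memset(out, 0, sizeof(*out));
  out->format = "i";  /* "i" is the format string for int32 */
  out->name = "";
  out->release = release_int32_schema;
}

/* Hypothetical helper: copies `length` (> 0) values into a new array.
   Returns 0 on success and -1 if allocation fails. */
int export_int32_array(const int32_t* values, int64_t length,
                       struct ArrowArray* out) {
  const void** buffers = malloc(2 * sizeof(*buffers));
  int32_t* data = malloc((size_t)length * sizeof(*data));
  if (buffers == NULL || data == NULL) {
    free(buffers);
    free(data);
    return -1;
  }
  memcpy(data, values, (size_t)length * sizeof(*data));

  memset(out, 0, sizeof(*out));
  buffers[0] = NULL;  /* validity bitmap may be NULL when null_count is 0 */
  buffers[1] = data;  /* the int32 values */
  out->length = length;
  out->null_count = 0;
  out->n_buffers = 2;
  out->buffers = buffers;
  out->release = release_int32_array;
  return 0;
}
```

A consumer that receives these structs reads the values through buffers[1] and, when it is finished, calls the release callback on each struct. The producer keeps ownership of how the memory is freed, so the two sides only need to agree on the struct layout, not on a shared allocator or library version.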