Final words
This brings us to the end of this journey. I’ve tried to pack lots of useful information, tips, tricks, and diagrams into this book, but there’s still plenty of room for further research and experimentation on your end! If you haven’t done so already, go back and try the various exercises I’ve proposed throughout. Explore new things with the Arrow datasets and compute APIs, and try using Arrow Flight and ADBC in your work.
Across the chapters of this book, we’ve covered a lot of ground:
- The Arrow format specification
- Using the various Arrow libraries to improve many aspects of analytical computation and data science
- Inter-process communication and sharing memory
- Using Apache Spark, pandas, and Jupyter in conjunction with Arrow
- The differences between data storage formats and in-memory runtime formats
- Passing data across the boundaries of programming languages without having to copy it
- Using gRPC...