Chapter 3: Data Science with Apache Arrow
So far, we've covered the Apache Arrow format and how to read various types of data from local disks or cloud storage into Arrow-formatted memory. If you aren't the one actually building tools and utilities for others to use, though, what does this mean for you? You'll benefit from the things that people build using Arrow, such as fancy new libraries, performance enhancements, and utilities. But how can you materially change your workflow to get some of these improvements right now? That's what we're going to cover in this chapter: specific examples of Arrow enhancing existing data science workflows and enabling new ones.
In this chapter, we'll look at the following topics:
- How Open Database Connectivity (ODBC) is being improved upon and will eventually, hopefully, be rendered obsolete by Arrow communication protocols
- Leveraging the topics we covered in the previous chapters with...