Collaboration breeds success
The open source community and ecosystem are huge, really huge. You can measure the success of a given project by its adoption and usage, which, for an interoperability project such as Arrow, depends heavily on collaboration with other projects. Way back in Chapter 1, Getting Started with Apache Arrow, when we covered the data types in the format, I mentioned a few newer types that were added to the specification through collaboration with other projects: RunEndEncoded, StringView, and ListView.
Data processing has changed significantly over the decades due to a variety of factors. The two advances that have produced the biggest changes are larger main-memory (RAM) sizes and better storage performance for the price, as with solid-state drives (SSDs). These hardware trends have significantly changed the development and evolution of databases and data processing, leading to new ways of structuring and representing...