Playing with data, wherever it might be!
Modern data science, machine learning (ML), and other data manipulation techniques frequently require merging data from multiple locations. Often, this data isn’t locally accessible but is instead stored in some form of cloud storage. Most implementations of the Arrow libraries provide native support for local filesystem access, Amazon Web Services Simple Storage Service (AWS S3), Microsoft Azure Blob Storage, Google Cloud Storage (GCS), and Hadoop Distributed File System (HDFS). Beyond these natively supported systems, each implementation generally either defines its own filesystem interface or uses one that is idiomatic to its language, which makes it easy to add support for other filesystems.
Once you’re able to access the platform your files are located on (whether that is local, in the cloud, or otherwise), you need to make sure that the data is in a format that is supported by the Arrow libraries for importing. Check the documentation...