Chapter 5. Spark for Geographic Analysis
Geographic processing is a strong use case for Spark, so this chapter explains how data scientists can use Spark to process geographic data and produce map-based views of very large datasets. We will demonstrate how to process spatio-temporal datasets through Spark's integration with GeoMesa, which helps turn Spark into a sophisticated geographic processing engine. As the Internet of Things (IoT) and other location-aware datasets become more common, and the volume of moving-object data climbs, Spark will become a critical tool for closing the geoprocessing gap between spatial functionality and processing scalability. This chapter shows how to conduct advanced geopolitical analysis of global news, with a view to using that data to analyze oil prices.
In this chapter, we will cover the following topics:
- Using Spark to ingest and preprocess geolocated data
- Storing...