GeoMesa is an open source product that builds a distributed spatio-temporal database on top of distributed storage systems such as Accumulo and Cassandra. With this design, GeoMesa can run the large-scale geospatial analytics required for very large data sets, including GDELT.
We are going to use GeoMesa to store GDELT data and run our analytics across a large proportion of it; this should give us enough data to train a model that can predict the future rise and fall of oil prices. GeoMesa will also enable us to plot large numbers of points on a map, so that we can visualize GDELT and any other useful data.
There is a very good tutorial on the GeoMesa website that guides the user through the installation process. Therefore, it is not our intention here to produce another how-to guide; there are, however, a few points worth noting that may save you time in getting everything up and running...