Storing a graph in Neo4j
With our graph database set up, and our methods for interacting with Neo4j written, we can start to use Python and Neo4j to store and explore our graph data.
In this section, we will look at an air travel network connecting cities in the US and Canada and analyze its properties to find efficient routes between locations.
Preprocessing data
To begin, let’s take a look at our data (sourced from Stanford University: https://snap.stanford.edu/data/reachability.html). We have two files, reachability_250.txt and reachability-meta.csv.
If we open reachability-meta.csv and take a look at the first few lines, we’ll find a list of information about cities in the US and Canada:
"node_id","name","metro_pop","latitude","longitude"
0,"Abbotsford, BC",133497.0,49.051575,-122.328849
1,"Aberdeen, SD",40878.0,45.45909,-98.487324
2,"Abilene, TX",166416.0,32.449175,-99.741424
...
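As a quick check on this file's structure, the rows can be parsed with Python's standard csv module. The following is a minimal sketch, not necessarily the approach used later in this section: the helper name read_city_rows is hypothetical, and the three sample rows are embedded in the script so it runs without the file on disk. Note that city names such as "Abbotsford, BC" contain a comma inside the quotes, so a plain split on commas would break them; csv.DictReader handles the quoting for us.

```python
import csv
import io

# The first rows of reachability-meta.csv, embedded so the sketch is
# self-contained. The real file contains many more rows.
SAMPLE = '''"node_id","name","metro_pop","latitude","longitude"
0,"Abbotsford, BC",133497.0,49.051575,-122.328849
1,"Aberdeen, SD",40878.0,45.45909,-98.487324
2,"Abilene, TX",166416.0,32.449175,-99.741424
'''


def read_city_rows(handle):
    """Parse city metadata from an open text handle (hypothetical helper)."""
    cities = []
    # DictReader uses the header row as keys and respects quoted commas.
    for row in csv.DictReader(handle):
        cities.append({
            "node_id": int(row["node_id"]),
            "name": row["name"],
            "metro_pop": float(row["metro_pop"]),
            "latitude": float(row["latitude"]),
            "longitude": float(row["longitude"]),
        })
    return cities


cities = read_city_rows(io.StringIO(SAMPLE))
for city in cities:
    print(city["node_id"], city["name"], city["metro_pop"])
```

To read the real file instead of the embedded sample, the same helper could be given a handle from open("reachability-meta.csv", newline="", encoding="utf-8"), assuming the file sits in the working directory.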