The property graph model
In Neo4j, we use a property graph model to represent information. The property graph model extends the graph of mathematics, a set of vertices joined by edges, with labels, relationship types, and properties. The following figure gives an example of how data from Figure 1.1 can be represented in Neo4j:
The preceding figure introduces the following concepts that we use to model a property graph:
- Nodes: Entities are modelled as nodes. In Figure 1.2, London, Bob, and Alice are all entities.
- Labels: These are used to represent the role of the node in our domain. A node can have multiple labels at the same time. Apart from adding more meaning to nodes, labels are also used to add constraints and indices that are local to the particular label. In the preceding figure, :Person and :Location are the two labels that we used. We can add an index or constraint on name for each of these labels, which will result in two separate indices—one for :Location and the other for :Person.
- Relationships: These depict directed, semantically relevant connections between two nodes. A relationship in Neo4j will always have a start node, an end node, and a single type. While relationships need to be created with a direction, we can ignore the direction while traversing them. :LIVES_IN and :IS_MARRIED_TO in Figure 1.2 are relationship types.
- Properties: These are key-value pairs that contain information about the node or relationship. In the previous figure, name and since are both properties that give more information about the node or relationship they are associated with. Neo4j does not accept arbitrary Java Virtual Machine (JVM) objects as property values; values are limited to a fixed set of types, including strings, numbers (integer and floating-point), booleans, and arrays of these, with temporal types such as dates available in newer versions.
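To make the four concepts above concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j stores or queries data; the class names, the helper functions `neighbours` and `build_index`, the choice of which nodes are connected, and the `since` value are all assumptions made for illustration, since only the names from Figure 1.2 are given in the text.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity: each node is a distinct entity
class Node:
    labels: set                                       # a node may carry several labels
    properties: dict = field(default_factory=dict)    # key-value pairs


@dataclass(eq=False)
class Relationship:
    start: Node    # every relationship has a start node...
    type: str      # ...exactly one type...
    end: Node      # ...and an end node
    properties: dict = field(default_factory=dict)


bob = Node({"Person"}, {"name": "Bob"})
alice = Node({"Person"}, {"name": "Alice"})
london = Node({"Location"}, {"name": "London"})
nodes = [bob, alice, london]

# Which nodes are connected, and the "since" value, are assumed for illustration.
relationships = [
    Relationship(bob, "LIVES_IN", london, {"since": 2010}),
    Relationship(bob, "IS_MARRIED_TO", alice),
]


def neighbours(node, rel_type, directed=True):
    """Follow relationships of one type; optionally ignore their direction."""
    found = []
    for rel in relationships:
        if rel.type != rel_type:
            continue
        if rel.start is node:
            found.append(rel.end)
        elif not directed and rel.end is node:
            found.append(rel.start)
    return found


def build_index(label, key):
    """Index one property for one label only, mimicking a label-local index."""
    return {
        n.properties[key]: n
        for n in nodes
        if label in n.labels and key in n.properties
    }
```

In this sketch, `neighbours(alice, "IS_MARRIED_TO")` finds nothing because the relationship was created from Bob to Alice, while passing `directed=False` ignores the direction and returns Bob. Likewise, `build_index("Person", "name")` and `build_index("Location", "name")` produce two separate lookups, one per label.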
This property graph model allows us to model data as close to the real world as possible.
The resultant model is simpler and more expressive. It also explicitly calls out relationships. In contrast to an RDBMS, which uses foreign keys to imply relationships, having them explicitly defined allows us to retrieve data by traversing relationships to find the information we need. This is a deliberate, practical algorithmic approach that exploits the connectedness of data: related data is reached by following relationships from a known node, rather than by resolving keys through index lookups or joins. Explicit relationships also make the property graph model a natural fit for most problem domains, because the data in those domains is itself interconnected.
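The contrast between implied and explicit relationships can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a description of how an RDBMS or Neo4j works internally; the table layouts, the `GraphNode` class, both function names, and the assumption that Bob lives in London are hypothetical.

```python
# Relational style: the relationship is implied by a foreign key (location_id).
people = [{"id": 1, "name": "Bob", "location_id": 10}]
locations = [{"id": 10, "name": "London"}]


def location_of_via_join(person_name):
    """Resolve the foreign key by searching the other table for a matching id."""
    for person in people:
        if person["name"] == person_name:
            for location in locations:
                if location["id"] == person["location_id"]:
                    return location["name"]
    return None


# Graph style: the relationship is stored explicitly on the node itself.
class GraphNode:
    def __init__(self, name):
        self.name = name
        self.outgoing = []  # list of (relationship_type, end_node) pairs

    def relate(self, rel_type, other):
        self.outgoing.append((rel_type, other))


bob = GraphNode("Bob")
london = GraphNode("London")
bob.relate("LIVES_IN", london)


def location_of_via_traversal(person):
    """Follow the LIVES_IN relationship straight from the start node."""
    for rel_type, end_node in person.outgoing:
        if rel_type == "LIVES_IN":
            return end_node.name
    return None
```

Both functions return the same answer for Bob, but the second never consults a separate table: once the start node is in hand, the related node is one reference away.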