Introducing Neo4j and graph database concepts
In this section, we will take a look at how data is stored as a graph in Neo4j. We will first introduce what a graph is, what a graph consists of, and how we can query graphs.
Neo4j uses a property graph data model to store the data. The following diagram shows a sample graph created in Neo4j:
Figure 1.1 – Sample graph
A Neo4j property graph consists of the following elements:
- Nodes, which describe the entities of a domain.
- Nodes can have zero or more labels, although a node with no labels is unusual in practice. Multiple labels on one node represent the different roles or categories the node belongs to. For example, a node labeled both Employee and Manager represents an employee who is also a manager.
- A relationship is a connection between two nodes.
- Relationships always have a direction, which is drawn as an arrow. The node at the tail of the arrow is called the start node (or source node), and the node at the arrowhead is called the target node.
- Every relationship has exactly one type, which describes how the two nodes are related.
- Both nodes and relationships can have properties, which are key-value pairs.
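The elements listed above can be sketched as plain data structures. The following is a minimal illustration in Python, not Neo4j's API: the `Node` and `Relationship` classes and their field names are assumptions made for this example, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """An entity in the domain, with zero or more labels and properties."""
    labels: set[str] = field(default_factory=set)
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection from a start node to a target node."""
    start: Node
    target: Node
    type: str  # each relationship carries exactly one type
    properties: dict[str, Any] = field(default_factory=dict)


# Hypothetical sample data, for illustration only.
employee = Node(
    labels={"Employee"},
    properties={"firstName": "John", "lastName": "Doe"},
)
manager = Node(
    labels={"Employee", "Manager"},
    properties={"firstName": "Tom", "lastName": "Riddle"},
)
reports_to = Relationship(start=employee, target=manager, type="REPORTS_TO")

print(reports_to.type, sorted(reports_to.target.labels))
# prints: REPORTS_TO ['Employee', 'Manager']
```

Labels are modeled as a set because a node either carries a label or it does not, and properties are modeled as a dictionary because they are key-value pairs.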
Let’s take a look at what nodes represent in a graph.
Understanding nodes in graphs
A node is used to represent an entity in the data domain. A sample node in an HR data domain might be as shown in the following figure:
Figure 1.2 – A node in a graph
This node represents a person in an HR data domain. It has two labels, Employee and Manager. A label describes the kind of entity the node represents, so this node can be read as an employee who is also a manager. The node also has the firstName, lastName, and joinDate properties.
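To make the role of labels concrete, here is a small Python sketch of a node with two labels and a lookup by label. It only models the idea and is not how Neo4j is accessed. The `Node` class and the `nodes_with_label` helper are hypothetical names, and all property values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    labels: frozenset[str]
    properties: dict[str, Any] = field(default_factory=dict)


def nodes_with_label(nodes: list[Node], label: str) -> list[Node]:
    """Return the nodes that carry the given label."""
    return [n for n in nodes if label in n.labels]


# Property values are invented for this example.
person = Node(
    labels=frozenset({"Employee", "Manager"}),
    properties={"firstName": "Tom", "lastName": "Riddle", "joinDate": "2015-03-01"},
)
other = Node(
    labels=frozenset({"Employee"}),
    properties={"firstName": "John", "lastName": "Doe", "joinDate": "2020-06-15"},
)

managers = nodes_with_label([person, other], "Manager")
employees = nodes_with_label([person, other], "Employee")
print(len(managers), len(employees))  # prints: 1 2
```

The node with both labels is returned by either lookup, which mirrors the idea that one node can belong to several categories at once.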
Let’s take a look at what relationships represent in a graph.
Understanding relationships in graphs
A relationship describes how a source node and a target node are related. It is possible for a node to have a relationship with itself.
A relationship has the following aspects:
- It joins a source node and a target node, symbolizing the relationship between these nodes.
- It has a direction. From the point of view of a given node, a relationship is outgoing when that node is the source and incoming when that node is the target.
- It has a type, which represents the nature of the connection between the nodes.
- It can have properties (key-value pairs), which further describe the relationship.
The following diagram represents relationships between employee nodes in HR data:
Figure 1.3 – Relationships between employee nodes
Figure 1.3 shows an employee named John Doe who reports to a manager named Tom Riddle. REPORTS_TO is the type of the relationship between the two nodes. Its direction, from John Doe to Tom Riddle, shows the direction of the reporting structure. A relationship can also have properties that further qualify the connection between the two nodes.
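The reporting example can be sketched in Python to show how direction distinguishes outgoing from incoming relationships. This is an illustrative model only, not Neo4j's API. Nodes are identified by plain name strings for brevity, the `outgoing` and `incoming` helpers are hypothetical, and the `since` property is an invented example of a relationship property.

```python
from typing import Any, NamedTuple


class Relationship(NamedTuple):
    source: str  # name of the source node (plain strings stand in for nodes)
    type: str    # nature of the connection
    target: str  # name of the target node
    properties: dict[str, Any]


relationships = [
    # "since" is a made-up property, for illustration only.
    Relationship("John Doe", "REPORTS_TO", "Tom Riddle", {"since": 2020}),
]


def outgoing(rels: list[Relationship], node: str) -> list[Relationship]:
    """Relationships for which the given node is the source."""
    return [r for r in rels if r.source == node]


def incoming(rels: list[Relationship], node: str) -> list[Relationship]:
    """Relationships for which the given node is the target."""
    return [r for r in rels if r.target == node]


print([r.target for r in outgoing(relationships, "John Doe")])    # ['Tom Riddle']
print([r.source for r in incoming(relationships, "Tom Riddle")])  # ['John Doe']
print(outgoing(relationships, "Tom Riddle"))                      # []
```

The same REPORTS_TO relationship is outgoing for John Doe and incoming for Tom Riddle, so following it in the opposite direction answers the question "who reports to this manager?".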