Packt+ | Advance your knowledge in tech

You're reading from Hands-On Graph Analytics with Neo4j Perform graph processing and visualization techniques using connected data across your enterprise

Product type Paperback

Published in Aug 2020

Publisher Packt

ISBN-13 9781839212611

Length 510 pages

Edition 1st Edition

Languages

Cypher

Tools

Neo4j

Concepts

Database Programming

Author (1):

Estelle Scifo

View More author details

Neo4j – the nodes, relationships, and properties model

Neo4j is a Java-based highly scalable graph database whose code is publicly available on GitHub at github.com/neo4j/neo4j. This section describes its building blocks and its main cases, but also the cases where it won't perform well because no system is perfect.

Building blocks

As we discussed with graph theory, a graph database such as Neo4j is made of at least two essential building blocks:

Nodes
Relationships between nodes:

Unlike SQL, Neo4j does not require a fixed and predetermined schema: nodes and relationships can be added on demand.

Let's look at each of of these in detail.

Nodes

In Neo4j, vertices are called nodes. Nodes can be of different types, like Question and Answer were in our former example. In order to differentiate those entities, nodes can have a label. If we continue the parallel with SQL, all nodes with a given label would be in the same table. But the analogy ends here, because nodes can have multiple labels. For instance, Clark Kent is a journalist, but he is also a superhero; the node representing this person can then have the two labels: Journalist and SuperHero.

Labels define the kind of entity the node belongs to. It is also important to store the characteristics of that entity. In Neo4j, this is done by attaching properties to nodes.

Relationships

Like nodes, relationships carry different pieces of information. A relationship between two persons can be of type MARRIED_TO or FRIEND_WITH or many other types of relationship. That's why Neo4j's relationships must have one and only one type.

One of the main powers of Neo4j is that relationships also have properties. For instance, when adding the relationship MARRIED_TO between two persons, we could add the wedding date, place, whether they signed a prenuptial agreement, and so on, as relationship properties.

Be careful: even though we discussed undirected graphs earlier, in Neo4j, all relationships are oriented!

Properties

Properties are saved as key-value pairs where the key is a string capturing the property name. Each value can then be of any of the following types:

Number: Integer or Float
Text: String
Boolean: Boolean
Time properties: Date, Time, DateTime, LocalTime, LocalDateTime, or Duration
Spatial: Point

Internally, properties are saved as a LinkedList, each element containing a key-value pair. The node or relationship is then linked to the first element of its property list.

Naming conventions: Node labels are written in UpperCamelCase while relationship names use UPPER_CASE with words separated by an underscore. Property names usually use the lowerCamelCase convention.

SQL to Neo4j translator

Here are a few guidelines to be able to easily go from your relational model to a graph model:

SQL world	Neo4j world
Table	Node label
Row	Node
Column	Node property
Foreign key	Relationship
Join table	Relationship
NULL	Do not store null values inside properties; just omit the property

Applying those guidelines to the SQL model in the question and answer table diagram, we built up the graph model displayed in the whiteboard model for our simple Q&A website, earlier in the chapter. The full graph model can be seen as follows:

You can use the online tool Arrows, which enables you to draw a graph diagram and export it to Cypher, the Neo4j query language:
http://www.apcjones.com/arrows.

Neo4j use cases

Like any other tool, Neo4j is very good in some situations, but not well suited to others. The basic principle is that Neo4j provides amazing performance for graph traversal. Everything requiring jumping from one node to another is incredibly fast.

On the other hand, Neo4j is probably not the best tool if you want to do the following:

Perform full DB scans, for instance, answering the question "What is?"
Do full table aggregates
Store large documents: the key-values properties list needs to be kept small (let's say no more than around 10 properties)

Some of those pain points can be addressed with a proper graph model. For instance, instead of saving all information as node properties, can we consider moving some of them to another node with a relationship between them? Depending on the kind of requests you are interested in, the graph schema most suited to your application may differ. Before going into the details of graph modeling, we need to stop briefly and and talk about the different kinds of graph properties which will also influence our choice.