Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Hands-On Graph Analytics with Neo4j
Hands-On Graph Analytics with Neo4j

Hands-On Graph Analytics with Neo4j: Perform graph processing and visualization techniques using connected data across your enterprise

eBook
$9.99 $35.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing
Table of content icon View table of contents Preview book icon Preview Book

Hands-On Graph Analytics with Neo4j

Graph Databases

Graph databases have gained increasing attention in the last few years. Data models built from graphs bring together the simplicity of document-oriented databases and the clarity of SQL tables. Among others, Neo4j is a database that comes with a large ecosystem, including the database, but also tools to build web applications, such as the GRANDstack, and tools to use graph data in a machine learning pipeline, as well as the Graph Data Science Library. This book will discuss those tools, but let's first start from the beginning.

Talking about graph databases means talking about graphs. Even if you do not need to know all the details about graph theory, it’s always a good idea to learn some of the basic concepts underlying the tool you are using. In this chapter, we will start by defining graphs and giving some simple and less simple examples of graphs and their applications. We will then see how to move from the well-known SQL tables to graph data modeling. We’ll conclude by introducing Neo4j and its building blocks, and review some design principles to understand what can and can’t be done with Neo4j.

This chapter will cover the following topics:

  • Graph definition and examples
  • Moving from SQL to graph databases
  • Neo4j: the nodes, relationships, and properties model
  • Understanding graph properties
  • Considerations for graph modeling in Neo4j

Graph definition and examples

The question you may ask at this point is "Why should I care about graphs? After all, my company/business/interest is not about graphs or networks of any kind. I know my data model, well arranged into SQL tables or NoSQL documents, and I can retrieve the information I want when I want." This book will teach you how to empower your data by looking at it in a different way. Surprisingly enough, graphs can be used to model a lot of processes, from the more obvious ones such as road networks, to less intuitive use cases such as video games or credit card fraud detection, among many others.

Graph theory

Let's start from the beginning and answer the question "What is a graph?"

A bit of history: the Seven Bridges of Königsberg problem

Graph studies originate back to Leonhard Euler, a prolific Swiss mathematician who lived in the eighteenth century. In 1735, he published a paper proposing a solution to the Seven Bridges of Königsberg problem. The problem is the following:

Given the city whose geography is depicted in the following image, is there a way to walk across each of the seven bridges of the city once and only once, and return to our starting point?

As you can see, this city is crossed by a river that splits the city into two banks, A and B. The river meander additionally creates two islands, C and D, also part of the city. Those two banks and two islands are connected by a total of seven bridges: two bridges between A and C, two other bridges between C and B, one between C and D, one between B and D, and a last one between D and A:

Euler's reasoning (on the right side) was to reduce this complex geography to the most simple drawing, like the one you can see on the right of the previous image, since the route used within each island is not relevant. Each island then becomes a single point, or node, connected to another by one or several links, or edges, representing the bridges.

With this simple visualization, the mathematician was able to solve the initial problem by noting that, if you arrive at an island (vertex) via one bridge, you will need to leave it using another bridge (except for the start and end vertices). In other words, all vertices but two need to be connected to an even number of relationships. This is not the case in the Königsberg graph, since we have the following:

A: 3 connections (to C twice, and to D once)
B: 3 connections (to C twice, and to D once)
C: 5 connections (to A twice, to B twice and to D once)
D: 3 connections (to A once, to C once and to D once)

This kind of path, where each edge is used once and only once, is called a Eulerian cycle and it can be said that a graph has a Eulerian cycle if and only if all of its vertices have even degrees.

The number of connections for a node is called the degree of the node.

Graph definition

This leads us to the mathematical definition of a graph:

A graph G = (V, E) is a pair of:

  • V, a set of nodes or vertices: the islands in our previous example
  • E, a set of edges connecting nodes: the bridges

The Königsberg graph illustrated on the right of the preceding image can then be defined as follows:

V = [A, B, C, D]
E = [
(A, C),
(A, C),
(C, B),
(C, B),
(A, D),
(C, D),
(B, D)
]

Graphs, like many mathematical objects, are well defined. While it can be difficult to find a good visualization for some of those objects, graphs, on the other hand, suffer from the almost infinite number of ways to draw them.

Visualization

Apart from very special cases, there is no single way to draw a graph and visualize it. Indeed, graphs are most often an abstract representation of reality. For instance, all four graphs depicted in the following image represent the exact same set of nodes and edges, so, by definition, the same mathematical graph:

We cannot rely only on our eyes to find patterns within graphs. For instance, looking only at the lower-right plot, it would be impossible to see the pattern that is visible in the upper-right plot. That's where graph algorithms enter into the game, which will be discussed in more detail in Chapter 6, Node Importance, and Chapter 7, Community Detection and Similarity Measures.

Examples of graphs

Now that we have a better idea of what a graph is, it's time to discover some more examples to understand which purposes graphs can be useful for.

Networks

With the graph definition in mind (a set of nodes connected to each other via edges) and the bridges example from the last section, we can easily imagine how all kinds of networks can be seen as graphs, including road networks, computer networks, or even social networks.

Road networks

Road networks are a perfect example of graphs. In such networks, the nodes are the road intersections, and edges are the roads themselves, as you can see in the following image:

This image shows the road network around Central Park in New York City, wherein streets are edges between junctions representing nodes

With road networks, many questions can be answered with graph analysis, such as the following:

  • What is the shortest path between two points (nodes)?
  • How long is this shortest path?
  • Are there alternative routes?
  • How can you visit all nodes within a list in a minimal amount of time?

This last question is especially important for parcel delivery, in order to minimize the number of driven miles to maximize the number of delivered parcels and satisfied customers.

We will go into more detail about this topic in Chapter 4, The Graph Data Science Library and Path Finding.

Computer networks

In a computer network, each computer/router is a node and the cables between them are the edges. The following image illustrates some possible topologies used for a computer network (credit: https://commons.wikimedia.org/wiki/File:NetworkTopologies.svg):

You can now draw the parallel with the graph definition we discovered in the last section. Here again, the graph structure helps in answering some common questions you may ask yourself about your network:

  • How fast will this information be transferred from A to B? This sounds like a shortest path issue.
  • Which of my nodes is the most critical one? By critical, we mean that if this node is not working for some reason, the whole network will be impacted. Not all nodes have the same impact on the network. That's where centrality algorithms come into the game (see Chapter 6, Node Importance).

Social networks

Facebook, LinkedIn, and all of our favorite social networks use graphs to model their users and interactions. In the most basic example of a social graph, nodes represent people, and edges the friendship or professional relationship between them, as illustrated in the following image:

Here again, graphs allow us to see the data from a different perspective. For instance, we have seen this kind of information when looking at someone’s profile on LinkedIn:

In that case, it tells us that the connected user (me) is just two connections away from Clark Kent. In other words, one person in my network is already connected to a person who is connected to Clark Kent. The following image illustrates this more clearly, in terms of degrees of separation:

You've probably heard about the Six Degrees of Separation theory. In 1929, the Hungarian journalist Frigyes Karinthy proposed a theory according to which each person on Earth is at most six connections away from any other person. In other words, if you want to talk to one person, say Barack Obama, a friend of yours has a friend whose friend has a friend... who knows Barack Obama and can introduce you to him. According to Karinthy, this connection chain must contain less than six connections, or seven people in total, including you and Barack Obama.

Given that there are more than 7 billion human beings on Earth, that's a surprisingly small number! With the large databases that are available nowadays, such as the friendship connections from Facebook or email exchanges from Microsoft, researchers have tried to prove the preceding statement. From the Microsoft email database, for instance, it was shown in 2008 that the average degree of separation between 180 billion distinct pairs of people was around 6.6. But this is just an average, and the number of hops to connect two people could go up to 29 with that dataset.

Many other kinds of analyses can be performed over social graphs:

  • Node importance: Again, it might be very useful to have an idea of which nodes (persons) are the most important. However, the definition of importance here will be different than in the case of a computer network, since it is very unlikely that a single person's retirement from social media makes the whole world collapse. However, influencers have a particular interest for marketing experts.
  • Community detection: Also called clustering, is a way to find a group of nodes sharing some characteristics. For instance, finding users who share the same interests, or visit the same places, can be used to recommend products to them.
  • Link prediction: With a graph, you can think of creating intelligent models to predict whether two entities are likely to be connected in the future. Here again, recommendation engines are one possible application of such a tool.
You can find more information about the Facebook graph as an example at https://developers.facebook.com/docs/graph-api.

As you can see, networks of all kinds are very well suited to graph databases. But we can go far beyond that view and imagine all kinds of data as a graph, which will open up a lot of new perspectives.

Your data is also a graph

You may have noticed that in the previous image of a social graph, the edges have names. Indeed, some people are friends, while some others have a father/son relationship. Now, let's imagine we can have any kind of relationship, meaning we can start connecting different kinds of entities. For instance, a person is living in a particular country, so (s)he is connected to that country with a relationship of type LIVES_IN. Are you beginning to see the point? With that kind of reasoning, the world itself is a graph and your business is a subpart of it.

Graphs are about relationships, and the world is connected, meaning there are relationships everywhere. We’ll talk about this in more detail in Chapter 3, Empowering Your Business with Pure Cypher, which is dedicated to knowledge graphs.

Graph databases allow you to model the data in that way: nodes, connected by relationships of some type. Let's see how to migrate data stored in relational databases to graph databases.

Moving from SQL to graph databases

Before going into detail about Neo4j, let's first have a look at the existing database models. We'll then focus on the most famous one, the relational model, and learn how to migrate from SQL to a graph database model.

Database models

Think about water: depending on the end goal, you will not use the same container. If you want to drink, you might use a glass of water but if you want to have a bath, you will probably choose another one. The choice of container will be different in each scenario. The problem is similar for data: without a proper container, there is nothing we can do with it and we need a proper container depending on the situation, which will not only store data but also contribute to solve the problem we have. This data container is the database.

Drawing an exhaustive list of database types on the market is not impossible but would go far beyond the scope of this book. However, I want to give some examples of the most popular ones, so that you can see where graph databases stand in the big picture:

  • Relational databases: They are by far the most well known type of database. From SQLite to MySQL or PostgreSQL, they use a common query language, called Structured Query Language (SQL), with some variations among the different implementations. They are well established and allow a clear structure of the data. However, they suffer from performance issues when the data grows and are surprisingly not that good at managing complex relationships, since the relationships require many joins between tables.
  • Document-oriented databases: Document-oriented databases, part of the NoSQL (Not Only SQL) era, have gained increasing interest during the last few years. Contrary to relational databases, they can manage flexible data models and are known for better scaling with a large amount of data. Examples of NoSQL databases include MongoDB and Cassandra, but you can find many more on the market.
  • Key-value stores: Redis, RocksDB, and Amazon DynamoDB are examples of key-value databases. They are very simple and known to be very fast, but are not well suited to store complex data.

Here is how the different databases can be viewed in a figurative representation:

Graph databases try to bring the best of each world into a single place by being quite simple to use, flexible, and very performant when it comes to relationships.

SQL and joins

Let's briefly focus on relational databases. Imagine we want to create a Question and Answer website (similar to the invaluable Stack Overflow). The requirements are the following:

  • Users should be able to log in.
  • Once logged in, users can post questions.
  • Users can post answers to existing questions.
  • Questions need to have tags to better identify which question is relevant for which user.

As developers or data scientists, who are used to SQL, we would then naturally start thinking in terms of tables. Which table(s) should I create to store this data? Well, first we will look for entities that seem to be the core of the business. In this QA website, we can identify the following:

  • Users, with attributes: ID, name, email, password
  • Questions: ID, title, text
  • Answers: ID, text
  • Tags: ID, text

With those entities, we now need to create the relationships between them. To do so, we can use foreign keys. For instance, the question has been asked by a given user, so we can just add a new column to the question table, author_id, referencing the user table. The same applies for the answers: they are written by a given user, so we add an author_id column to the Answer table:

It becomes more complicated for tags, since one question can have multiple tags and a single tag can be assigned to many questions. We are in the many-to-many relationship type, which requires adding a join table, a table that is just there to remember that kind of relationship. This is the case of the QuestionTag table in the preceding diagram; it just holds the relationship between tags and questions.

It's all about relationships

The previous exercise probably looks easy to you, because you have followed lectures or tutorials about SQL and use it pretty often in your daily work. But let's be honest, the first time you face a problem and have to create the data model that will allow you to solve it, you will probably draw something that looks like the diagram in the following image, right?

This is one of the powers of graph databases:

Neo4j, as a graph database, allows you to create vertices, or nodes, and the relationships connecting them. The next section summarizes how the different entities in the Neo4j ecosystem are related to each other and how we can structure our data in a graph.

The whiteboard model is your data model.

Neo4j – the nodes, relationships, and properties model

Neo4j is a Java-based highly scalable graph database whose code is publicly available on GitHub at github.com/neo4j/neo4j. This section describes its building blocks and its main cases, but also the cases where it won't perform well because no system is perfect.

Building blocks

As we discussed with graph theory, a graph database such as Neo4j is made of at least two essential building blocks:

  • Nodes
  • Relationships between nodes:
Unlike SQL, Neo4j does not require a fixed and predetermined schema: nodes and relationships can be added on demand.

Let's look at each of of these in detail.

Nodes

In Neo4j, vertices are called nodes. Nodes can be of different types, like Question and Answer were in our former example. In order to differentiate those entities, nodes can have a label. If we continue the parallel with SQL, all nodes with a given label would be in the same table. But the analogy ends here, because nodes can have multiple labels. For instance, Clark Kent is a journalist, but he is also a superhero; the node representing this person can then have the two labels: Journalist and SuperHero.

Labels define the kind of entity the node belongs to. It is also important to store the characteristics of that entity. In Neo4j, this is done by attaching properties to nodes.

Relationships

Like nodes, relationships carry different pieces of information. A relationship between two persons can be of type MARRIED_TO or FRIEND_WITH or many other types of relationship. That's why Neo4j's relationships must have one and only one type.

One of the main powers of Neo4j is that relationships also have properties. For instance, when adding the relationship MARRIED_TO between two persons, we could add the wedding date, place, whether they signed a prenuptial agreement, and so on, as relationship properties.

Be careful: even though we discussed undirected graphs earlier, in Neo4j, all relationships are oriented!

Properties

Properties are saved as key-value pairs where the key is a string capturing the property name. Each value can then be of any of the following types:

  • Number: Integer or Float
  • Text: String
  • Boolean: Boolean
  • Time properties: Date, Time, DateTime, LocalTime, LocalDateTime, or Duration
  • Spatial: Point

Internally, properties are saved as a LinkedList, each element containing a key-value pair. The node or relationship is then linked to the first element of its property list.

Naming conventions: Node labels are written in UpperCamelCase while relationship names use UPPER_CASE with words separated by an underscore. Property names usually use the lowerCamelCase convention.

SQL to Neo4j translator

Here are a few guidelines to be able to easily go from your relational model to a graph model:

SQL world Neo4j world
Table Node label
Row Node
Column Node property
Foreign key Relationship
Join table Relationship
NULL Do not store null values inside properties; just omit the property

Applying those guidelines to the SQL model in the question and answer table diagram, we built up the graph model displayed in the whiteboard model for our simple Q&A website, earlier in the chapter. The full graph model can be seen as follows:

You can use the online tool Arrows, which enables you to draw a graph diagram and export it to Cypher, the Neo4j query language:
http://www.apcjones.com/arrows.

Neo4j use cases

Like any other tool, Neo4j is very good in some situations, but not well suited to others. The basic principle is that Neo4j provides amazing performance for graph traversal. Everything requiring jumping from one node to another is incredibly fast.

On the other hand, Neo4j is probably not the best tool if you want to do the following:

  • Perform full DB scans, for instance, answering the question "What is?"
  • Do full table aggregates
  • Store large documents: the key-values properties list needs to be kept small (let's say no more than around 10 properties)

Some of those pain points can be addressed with a proper graph model. For instance, instead of saving all information as node properties, can we consider moving some of them to another node with a relationship between them? Depending on the kind of requests you are interested in, the graph schema most suited to your application may differ. Before going into the details of graph modeling, we need to stop briefly and and talk about the different kinds of graph properties which will also influence our choice.

Understanding graph properties

Graphs are classified into several categories, depending on the properties of the connections between nodes. These categories are important to consider when modeling your data as a graph, and even more so when you want to run algorithms on them, because the behavior of the algorithm might change and create unexpected results in some situations. Let's discover some of those properties with examples.

Directed versus undirected

So far, we've only talked about graphs as a set of connected nodes. Those connections can be different depending on whether you are going from node A to node B or vice versa. For instance, some streets can only be driven in one direction, right? Or, the fact that you are following someone on Twitter (there is a Follows relationship from you to that user) doesn’t mean that user is also following you. That’s the reason why some relationships are directed. On the contrary, a relationship of type married to is naturally undirected, if A is married to B, usually B is also married to A:

In the right-hand side graph, directed, the transition between B and A is not allowed in that direction

In Neo4j, all relationships are directed. However, Cypher allows you to take this direction into account, or not. That will be explained in more detail in Chapter 2, The Cypher Query Language.

Weighted versus unweighted

As well as direction, relationships can also carry more information. Indeed, in a road network for instance, not all streets have the same importance in a routing system. They have different lengths or occupancy during peak hours, meaning the travel time will be very different from one street to another. The way to model this fact with graphs is to assign a weight to each edge:

In the weighted graph, relationships have weights 16 and 4

Algorithms such as shortest path algorithms take this weight into account to compute a shortest weighted path.

This is not only important for road networks. In a computer network, the distance between units may also have its own importance in terms of connection speed. In social networks, distance is often not the most important property to quantify the strength of a relationship, but we can think of other metrics. For instance, how long have those two people been connected? Or when was the last time user A reacted to a post of user B?

Weights can represent anything related to the relationship between two nodes, such as distance, time, or field-dependent metrics such as the interaction strength between two atoms in a molecule.

Cyclic versus acyclic

A cycle is a loop. A graph is cyclic if you can find at least one path starting from one node and going back to that exact same node.

As you can imagine, it's important to be aware of the presence of cycles in a graph, since they can create infinite loops in a graph traversal algorithm, if we do not pay enough attention to it. The following image shows an acyclic graph versus a cyclic graph:

All those properties are not self-exclusive and can be combined. Especially, we can have directed acyclic graphs (DAG), which usually behave nicely.

Dense versus sparse

Graph density is another property that is important to have in mind. It is related to the average number of connections for each node. In the next diagram, the left graph only has three connections for four nodes, whereas the right graph has six connections for the same amount of nodes, meaning the latter is denser than the former:

Graph traversal

But why worry about density? It is important to realize that everything within graph databases is about graph traversal, jumping from one node to another by following an edge. This can become very time-consuming with a very dense graph, especially when traversing the graph in a breadth-first way. The following image shows the graph traversals in which breadth-first search is compared to depth-first search:

Connected versus disconnected

Last but not least is the notion of connectivity or a connected graph. In the next diagram, it is always possible to go from one vertex to another, whatever pair of vertices is considered. This graph is said to be connected. On the other side of the figure, you can see that D is isolated - there is no way to go from D to A, for example. This graph is disconnected, and we can even say that it has two components. We will cover the analysis of this kind of structure in Chapter 7, Community Detection and Similarity Measures. The following image shows connected versus disconnected graphs:

This is a non-exhaustive list of graph properties, but they are the main ones we will have to worry about within our graph database adventure. Some of them are important when creating the whiteboard graph model for our data, as we will discuss in the next section.

Considerations for graph modeling in Neo4j

The whiteboard model for our simple Q&A website is the very first approach to this problem. Depending on the kind of questions you want your application to answer, and in order to take full advantage of Neo4j, the schema can be very different.

As a rule of thumb, we consider nodes as entities or objects, while relationships are verbs.

Relationship orientation

As we discussed earlier in the chapter, relationships in Neo4j are oriented. However, with Cypher, we can build queries that are orientation-independent (see Chapter 2, The Cypher Query Language). In that scenario, a relationship that is imperatively bijective should not be stored twice in the database. In short, a friendship relationship should be created only in one direction, as illustrated in the following image:

Node or property?

Another question that arises quite often and early in the choice of graph model is whether a (categorical) characteristic should be stored as a node property, or whether it should have its own node.

Nodes are the following:

  • Explicit: Nodes appear in the graph schema, which makes them more visible than properties.
  • Performant: If we need to find nodes sharing the same characteristics, it will be much faster with a node than by filtering on properties.

On the other hand, properties are the following:

  • Simpler: It's just a property with a value, not a complex node and relationship structure. If you don't need to perform special analysis on the property and just want it to be available when working on the node it belongs to, a property is just fine.
  • Indexable: Neo4j supports indexing for properties. It is used to find the first node in a graph traversal and is very efficient (more details in Chapter 2, The Cypher Query Language).

As you can see, there is no universal answer to this question, and the solution will depend on your use case. It is usually recommended to list them and try to write and test the associated queries in order to make sure you are not falling into a Neo4j pitfall.

For more information and more examples on graph modeling, I encourage you to have a look at Neo4j Graph Data Modeling (see the Further reading section).

Summary

In this chapter, we discussed the mathematical definition of a graph and how it is related to graph databases. We then moved on to the specific instance of graph databases we will use throughout this book, Neo4j. We've learned about its building blocks, nodes having labels and properties, and relationships that must have a type and, optionally, some properties as well. We concluded with some important explanations about different types of graphs, weighted, directed, cyclic, or connected, and how this can help in defining the data model best suited to a given use case.

In the next chapter, we are going to look at Cypher, the query language used by Neo4j, and how we can feed the database and retrieve data from it.

Further reading

  • If you want to learn more about the mathematics behind graph theory, you can start with Graph Theory with Applications, J.A. Bondy and U.S.R. Murty, Elsevier, whose first edition is available for free on the web, for instance at https://www.freetechbooks.com/graph-theory-with-applications-t559.
  • For more information about choosing the appropriate schema for your graph data, see Neo4j Graph Data Modeling, M. Lal, Packt Publishing.
  • Some references to learn more about degrees of separation:

  • Paper about the Microsoft study: Planetary-Scale Views on an Instant-Messaging Network by Jure Leskovec and Eric Horvitz
  • Facebook research blog post: https://research.fb.com/blog/2016/02/three-and-a-half-degrees-of-separation/
Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Get up and running with graph analytics with the help of real-world examples
  • Explore various use cases such as fraud detection, graph-based search, and recommendation systems
  • Get to grips with the Graph Data Science library with the help of examples, and use Neo4j in the cloud for effective application scaling

Description

Neo4j is a graph database that includes plugins to run complex graph algorithms. The book starts with an introduction to the basics of graph analytics, the Cypher query language, and graph architecture components, and helps you to understand why enterprises have started to adopt graph analytics within their organizations. You’ll find out how to implement Neo4j algorithms and techniques and explore various graph analytics methods to reveal complex relationships in your data. You’ll be able to implement graph analytics catering to different domains such as fraud detection, graph-based search, recommendation systems, social networking, and data management. You’ll also learn how to store data in graph databases and extract valuable insights from it. As you become well-versed with the techniques, you’ll discover graph machine learning in order to address simple to complex challenges using Neo4j. You will also understand how to use graph data in a machine learning model in order to make predictions based on your data. Finally, you’ll get to grips with structuring a web application for production using Neo4j. By the end of this book, you’ll not only be able to harness the power of graphs to handle a broad range of problem areas, but you’ll also have learned how to use Neo4j efficiently to identify complex relationships in your data.

Who is this book for?

This book is for data analysts, business analysts, graph analysts, and database developers looking to store and process graph data to reveal key data insights. This book will also appeal to data scientists who want to build intelligent graph applications catering to different domains. Some experience with Neo4j is required.

What you will learn

  • Become well-versed with Neo4j graph database building blocks, nodes, and relationships
  • Discover how to create, update, and delete nodes and relationships using Cypher querying
  • Use graphs to improve web search and recommendations
  • Understand graph algorithms such as pathfinding, spatial search, centrality, and community detection
  • Find out different steps to integrate graphs in a normal machine learning pipeline
  • Formulate a link prediction problem in the context of machine learning
  • Implement graph embedding algorithms such as DeepWalk, and use them in Neo4j graphs

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Aug 21, 2020
Length: 510 pages
Edition : 1st
Language : English
ISBN-13 : 9781839212611
Vendor :
Neo Technology
Category :
Languages :
Tools :

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing

Product Details

Publication date : Aug 21, 2020
Length: 510 pages
Edition : 1st
Language : English
ISBN-13 : 9781839212611
Vendor :
Neo Technology
Category :
Languages :
Tools :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 151.97
Hands-On Graph Analytics with Neo4j
$48.99
40 Algorithms Every Programmer Should Know
$49.99
Graph Machine Learning
$52.99
Total $ 151.97 Stars icon
Banner background image

Table of Contents

17 Chapters
Section 1: Graph Modeling with Neo4j Chevron down icon Chevron up icon
Graph Databases Chevron down icon Chevron up icon
The Cypher Query Language Chevron down icon Chevron up icon
Empowering Your Business with Pure Cypher Chevron down icon Chevron up icon
Section 2: Graph Algorithms Chevron down icon Chevron up icon
The Graph Data Science Library and Path Finding Chevron down icon Chevron up icon
Spatial Data Chevron down icon Chevron up icon
Node Importance Chevron down icon Chevron up icon
Community Detection and Similarity Measures Chevron down icon Chevron up icon
Section 3: Machine Learning on Graphs Chevron down icon Chevron up icon
Using Graph-based Features in Machine Learning Chevron down icon Chevron up icon
Predicting Relationships Chevron down icon Chevron up icon
Graph Embedding - from Graphs to Matrices Chevron down icon Chevron up icon
Section 4: Neo4j for Production Chevron down icon Chevron up icon
Using Neo4j in Your Web Application Chevron down icon Chevron up icon
Neo4j at Scale Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.6
(9 Ratings)
5 star 66.7%
4 star 22.2%
3 star 11.1%
2 star 0%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by




mayanktripathi4u Nov 24, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Disclaimer: The Publisher asked me to review this book and sent me a review copy. I promise to be completely honest about my thoughts on this book.This book describes the concepts of Neo4j with good examples, specially I liked the example of comparing data with water, a container with database and it very nicely explained the way of choosing an appropriate container based on need.Further this book also gives us a clear understanding on Cypher Query Language with sample code. Covered almost all of the concepts starting from creating a node to managing it and modify / update them, including loading the data from CSV or JSON format or web APIs. From time-to-time the author has also taken care of comparing Neo4j with other database terminology and shared related terms considering that this book will also be read by various database users, and it made me more comfortable when doing comparison with the database I am used to. In this book the author has demonstrated the power of Neo4j by building a knowledge graph from unstructured data with detailed explanation of geographic coordinate systems. Which is very hard to do with various other databases. Author did not leave us with just the concepts and hands-on with Neo4j, she also taken us from novice to expert and included Data Science/Machine Learning portion as well which is booming now!Another important thing I liked is that the author has taken care of small concepts & tips and has explained them clearly, as they are the building blocks. Along with this, there are some exercises to test the understanding. "Tip" section in the book is awesome, as it contains very important tips which one has to keep them handy. This book is worth reading irrespective who you are, be it a Beginner (to gain fundamental knowledge on Neo4j); Full-stack developer (Where and when to use Neo4j with Python or any other language); database developer (how to create; manage and maintain database); or a data scientist & ML engineer (Data Analysis and Prediction with Python and Neo4j); etc. It covers all aspects of Neo4j.I loved reading this book.
Amazon Verified review Amazon
Joao FirJur Aug 12, 2022
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Atende ao que preciso, sobre a aplicação da Teoria dos Grafos na Computação. Não encontrei falhas, até o momento.
Amazon Verified review Amazon
Practical designer Nov 01, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
I received a copy of this book from the publisher so that I could review it.This book does a great job of introducing a range of graph-related topics along with code examples to illustrate the points. It is a practical guide written by someone who is well grounded in theory. The author's experience as a data scientist helps her bring together various tools in the Neo4j ecosystem that you might use on a real project. You aren't just getting a single vendor's sales pitch. Sometimes data scientists struggle with the ML Ops required to turn a great model into a production tool. The book takes the reader all the way through ideas for deploying graph analytics in an application and tuning the database in production. This sets the book apart from others that I have read on applied graph analytics.The section on spatial data is especially well done, as the author has been a major contributor to the Neo4j community in that area. The chapters on machine learning are also deep and valuable.In order to explain graph algorithms, the book provides examples in pure Python before showing the way to call Neo4j's implementation. I found myself skimming/skipping the pure Python explanations since Neo4j provides these algorithms out of the box.The book attempts to explain concepts from the ground up, but you will get more from this book if you have some experience with machine learning, Neo4j, and Python. For a data scientist or analyst looking to add to graph analytics to your tool kit, this will be a valuable resource and a source of inspiration.
Amazon Verified review Amazon
Michael Porter Oct 01, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Estelle's book is an incredibly clear breakdown of how to make the most out of Neo4j and all it's capabilities. She does a fantastic job of explaining how to conceptualize "graphy" problems and then use Neo4j to solve them. If you're new to graphs and Neo4j or if you're an experienced graphista, this books is going to have something for you.
Amazon Verified review Amazon
Yusuf Sep 21, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Estelle did a great job introducing Neo4j from the lenses of graph analytics.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is included in a Packt subscription? Chevron down icon Chevron up icon

A subscription provides you with full access to view all Packt and licnesed content online, this includes exclusive access to Early Access titles. Depending on the tier chosen you can also earn credits and discounts to use for owning content

How can I cancel my subscription? Chevron down icon Chevron up icon

To cancel your subscription with us simply go to the account page - found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription - From here you will see the ‘cancel subscription’ button in the grey box with your subscription information in.

What are credits? Chevron down icon Chevron up icon

Credits can be earned from reading 40 section of any title within the payment cycle - a month starting from the day of subscription payment. You also earn a Credit every month if you subscribe to our annual or 18 month plans. Credits can be used to buy books DRM free, the same way that you would pay for a book. Your credits can be found in the subscription homepage - subscription.packtpub.com - clicking on ‘the my’ library dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled? Chevron down icon Chevron up icon

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title? Chevron down icon Chevron up icon

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles? Chevron down icon Chevron up icon

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date? Chevron down icon Chevron up icon

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready? Chevron down icon Chevron up icon

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access? Chevron down icon Chevron up icon

Yes, all Early Access content is fully available through your subscription. You will need to have a paid for or active trial subscription in order to access all titles.

How is Early Access delivered? Chevron down icon Chevron up icon

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content? Chevron down icon Chevron up icon

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access? Chevron down icon Chevron up icon

Keeping up to date with the latest technology is difficult; new versions, new frameworks, new techniques. This feature gives you a head-start to our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls in to place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.