Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Graph Data Modeling in Python
Graph Data Modeling in Python

Graph Data Modeling in Python: A practical guide to curating, analyzing, and modeling data with graphs

Arrow left icon
Profile Icon Gary Hutson Profile Icon Matt Jackson
Arrow right icon
$44.99
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8 (6 Ratings)
Paperback Jun 2023 236 pages 1st Edition
eBook
$9.99 $35.99
Paperback
$44.99
Subscription
Free Trial
Renews at $19.99p/m
Arrow left icon
Profile Icon Gary Hutson Profile Icon Matt Jackson
Arrow right icon
$44.99
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8 (6 Ratings)
Paperback Jun 2023 236 pages 1st Edition
eBook
$9.99 $35.99
Paperback
$44.99
Subscription
Free Trial
Renews at $19.99p/m
eBook
$9.99 $35.99
Paperback
$44.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
OR
Modal Close icon
Payment Processing...
tick Completed

Shipping Address

Billing Address

Shipping Methods
Table of content icon View table of contents Preview book icon Preview Book

Graph Data Modeling in Python

Introducing Graphs in the Real World

Social network analysis, fraud detection, modeling the stability of systems (for example, rail and energy grids), and recommendation systems all rely on graphs as the lynchpin underpinning these types of networks. In each of these examples, the relationships between individual people, bank accounts, or other single units are fundamental to describe and model the data. As opposed to traditional data models, a graph is a perfect way to represent groups of interacting elements.

This chapter will serve as an introduction to why graphs are important and introduce you to the fundamentals of what makes up a graph network. Moreover, we will look at how to transition from traditional data storage strategies, such as relational databases (RDBs), to how you can use this knowledge to work with graph databases (GDBs). Throughout this book, we will be working with a popular graph database, namely Neo4j. This will be followed by an explanation of how graphs are utilized in the real world and then a gentle introduction to working with the main package workhorses, known as igraph and NetworkX, which are the best and most stable graph packages for graph data analysis and modeling.

In this chapter, we’re going to cover the following topics:

  • Why should you use graphs?
  • The fundamentals of nodes and edges and the properties of a graph
  • Comparing RDBs and GDBs
  • The use of graphs across various industries
  • Introduction to NetworkX and igraph

Technical requirements

We will be using the Jupyter Notebook to run our coding exercises. For this, you will require python>=3.8.0, along with the following packages, which will need to be installed in your environment with the pip install command:

  • networkx==2.8.8
  • igraph==0.9.8

All notebooks, along with the coding exercises, are available at the following GitHub link: https://github.com/PacktPublishing/Graph-Data-Modeling-in-Python/tree/main/CH01.

Why should you use graphs?

In modern, data-driven solutions and enterprises, graph data structures are becoming more and more common. This is because, in our modern, data-driven world, relationships between things are becoming as, if not more important, than the things themselves. In modern industries and enterprises, graphs are starting to become more common and powerful in understanding the relationships between entities. I would say that these relationships and how they are connected have become more important than the entities themselves. We will demonstrate examples of real-life graphs in our use cases in the following chapters with detailed instructions on how to build these networks and the core considerations you need to make for the graph design.

Graphs are fundamental to many systems we use every day. Each time you are online and receive a product recommendation, it is likely that a graph solution is powering this recommendation. This is why learning how to work with graph data and leveraging these types of networks is a fast-growing and key skill in data science.

Composite components of a graph

Networks are a tool to represent complex systems and the complex nature of the connections arising in today’s data. We have already referenced how graphs are powering some of the big powerhouse recommendation systems in action today.

Graph methods tend to fall into four different areas:

  • Movement: Movement is concerned with how things travel (move) through a network. These types of graphs are the drivers behind routing and GPS solutions and are utilized by the biggest players in finding the optimal path across a road network.
  • Influence: On social media, this area specifies who the known influencers are and how they propagate this influence across a network.
  • Groups and interactions: This area involves identifying groups and how actors in the network interact with each other. We will look at an example of how to apply community detection methods to find these communities through the node (the actor involved) and its connections (the edges). Don’t worry if you don’t know what these terms are for now; we will focus on these in the Fundamentals of nodes, edges, and the properties of a graph section.
  • Pattern detection: Pattern detection involves using a graph to find similarities in the network that can be explored. We must look at this from the actor’s (node’s) point of view and find similarities between that actor and other actors in the network. Here, actor is taken to mean person, author profile, and so on.

In this section, we have explained the core components of a graph by providing simple working definitions. In the following section, we will delve deeper into these fundamental elements, which make up every graph you will come across in the industry. We will look at nodes, edges, and the various properties of a graph.

The fundamentals of nodes and edges and the properties of a graph

Graphs, or networks, are particularly powerful data structures for modeling and describing relationships between things, whether these things are people, products, or machines. In a graph, those things that we coined earlier are represented by nodes (sometimes known as vertices). Nodes are connected by edges (sometimes referred to as relationships). In a network, an edge represents a relationship between two things, indicating that, somehow, they are linked.

The following sections will look at the structures and types of graphs. First, we will start with undirected graphs before moving on to directed graphs. After that, we will look at node properties, then delve into heterogeneous graphs, and end by looking at schema design considerations.

Undirected graphs

To illustrate, a simple example is that of a real-life social network. In the following example, Jeremy and Mark are each represented by a node. Jeremy is friends with Mark, and the friend relationship is represented by an edge connecting the two nodes. The following diagram shows an undirected graph:

Figure 1.1 – Two friend nodes are linked together with a single edge

Figure 1.1 – Two friend nodes are linked together with a single edge

However, not all social networks are the same, and in some online social media platforms, relationships between users of a social network may not be mutual.

For example, on Twitter, one user can follow another, but this doesn’t mean the inverse must be true. On Twitter, Jeremy may follow Mark, but Mark may not follow Jeremy.

Directed graphs

Here, a directional edge is used to show that Jeremy follows Mark, while the absence of an edge in the reverse direction shows that Mark does not follow Jeremy in return:

Figure 1.2 – Two friend nodes are linked together with a single edge

Figure 1.2 – Two friend nodes are linked together with a single edge

This type of graph is known as a directed graph. For reference, sometimes, undirected edges like those in Figure 1.1 are shown as bidirectional edges, pointing to both nodes. This bidirectional representation is equivalent to an undirected edge in most senses and represents a mutual relationship.

Importantly, when creating a data model with directional edges, naming relationships appropriately becomes important. In our Twitter example, if the edge representing the interaction between Mark and Jeremy is follows, then the edge goes from the Jeremy node to the Mark node.

On the other hand, if the edge represents a concept such as followed by, then this should be in the other direction – that is, from Mark to Jeremy. This has particularly strong implications for some more complex graph modeling and use cases, which we will cover in Chapter 2, Working with Graph Data Models.

Node properties

While nodes and edges (directional or not) are the basic building blocks of a graph, they are often not sufficient to fully describe a dataset. Nodes can have data associated with them that may not be relational, so it would not be expressed as a relationship with another node.

In these cases, to represent data associated with nodes, we can use node properties (sometimes known as node attributes). Similarly, where an edge has additional information associated with it, in addition to representing a relationship, edge properties can be used to hold that data.

The following diagram shows a black line, indicating that Jeremy follows Mark but that Mark does not follow Jeremy – therefore, the black line indicates directionality:

Figure 1.3 – Two friend nodes are linked together with a single edge

Figure 1.3 – Two friend nodes are linked together with a single edge

In the preceding model, node properties are used to describe the number of followers Mark and Jeremy have, as well as the locations listed in their Twitter bios. This kind of additional node information is particularly important for querying graph data when asking questions that involve filtering.

In our Twitter example, properties would need to be present in the graph if, for example, we wanted to know who followed users with above 1,000 followers. We will revisit answering graphical questions using nodes, edges, and properties in later chapters.

Depending on the dataset, there may be cases where different nodes have different sets of properties. In this case, it is common to have several distinct types of nodes in the same graph.

Heterogeneous graphs

Node types can also be referred to as layers, or nodes with different labels, though for this book, they will be known simply as types.

The following diagram shows the nodes representing Jeremy and Mark as people, where each node type has different properties, and there are multiple relationship types. Due to this, we can term these multiple relationships as heterogenous:

Figure 1.4 – Example of a heterogenous Twitter graph

Figure 1.4 – Example of a heterogenous Twitter graph

Now, we have added nodes representing Mark and Jeremy as people, relationships representing their relationship outside of Twitter, and their ownership of their respective accounts. Note that since we have increased the number of node types, we also need new edge types to refer to the different interactions between different types of nodes.

Graphs with multiple node types are known as heterogeneous, multilayer, or multilevel graphs, though going forward we will use the term heterogeneous to refer to graphs with multiple types of nodes. In contrast, graphs with only one node type, as in the previous examples, are referred to as homogeneous graphs.

Schema design

At this point, it is reasonable to ask the question: What features of a dataset should be nodes, edges, and properties?

For any given dataset, there are multiple ways to represent data as a graph, and each is more suited to different purposes. Herein lies the trick to good graph modeling: a question or use case-driven schema design.

If we were particularly interested in the locations of Twitter users in our network, then we could move the location node property on the Twitter user nodes to create the LOCATED_IN relationship type. This is shown in the following diagram:

Figure 1.5 – The same graph but with the location property moved from a node property to a node type

Figure 1.5 – The same graph but with the location property moved from a node property to a node type

If we were particularly interested in the locations of Twitter users in our network, then we could move the location node property on the Twitter user nodes to a separate node type and create the LOCATED_IN relationship type. We could even go one step further to represent the information we know about these locations, adding the country related to each location as a separate, abstracted node.

This graph structure models the same data in a different way, which may be more or less suitable or performant for particular use cases. We will explore the effects of schema design on the types of questions that can be asked, and performance, in later chapters.

In the next section, we will compare how graph data structures differ from traditional RDBs. This will expand on why GDBs can be more performant when modeled as a graph data problem.

Comparing RDBs and GDBs

RDBs have been a standard for data storage and data analysis across most industries for a very long time. Their strength lies in being able to hold multiple tables of different information, where some table fields are found across tables, enabling data linkage.

With this data linkage, complex questions can be asked of data in an RDB. However, there are drawbacks to this relational structure. While RDBs are useful for returning a large number of rows that meet particular criteria, they are not suited to questions involving many chained relationships.

To illustrate this, consider a standard database containing train services and their station stops, alongside a graph that might represent the same information:

Figure 1.6 – Relational data structure of trains and their stops

Figure 1.6 – Relational data structure of trains and their stops

In an RDB structure, it would not be difficult to retrieve all trains that service a particular stop. On the other hand, it may be a slow operation that returns a series of trains that can be taken between two chosen stations.

Consider the steps needed in a traditional RDB to find the route between Truro and Glasgow Central in the preceding table. Starting at Truro, we would know the GW1426 train service stops at Truro, Liskeard, and Plymouth. Knowing that these stations can be reached from Truro, we would then need to find what train services stop at each of these stations to find our route.

Upon finding that Plymouth station is reachable and that a separate service runs to many more stations, we would need to repeat this process over and over until Glasgow Central is reached.

These steps essentially result in a series of computationally costly join operations on the same table, where one resulting row would give us the path between our stations of interest.

GDBs to the rescue

Using a graph structure to represent this train network puts greater emphasis on relationships between our data points, as illustrated in the following diagram:

\

Figure 1.7 – Graph data structure of trains and their stops

Figure 1.7 – Graph data structure of trains and their stops

Using a graph structure to represent this train network puts greater emphasis on relationships between our data points. Starting from Truro station, as in the RDB example, we find the train that services that station. However, when traversing the graph to find a possible route between Truro and Glasgow Central, at each station or train node we are considering fewer data points, and therefore fewer options.

This is in contrast to the RDB example, where repeated table joins are necessary to return a path. In this case, the complexity of the operations required over the graph representation is lower, which equates to a faster, more efficient method. Among many other use cases, those that require some sort of pathfinding often benefit from a graph data model.

In addition to being more suitable for specific types of queries, graphs are typically useful where a flexible, evolving data model is needed. Again, using the example of the train network, imagine that, as the database administrator, you have received a request to add bus transport links to the data model.

With an RDB, a new table would be required, since several bus services would likely serve each train station. In this new table, the names of each station would need to be duplicated from the existing table, to list alongside their related bus services.

Not only does this duplication increase the size of data stored, but it also increases the complexity of the database schema:

Figure 1.8 – Adding a new data type (buses) to the train station graph

Figure 1.8 – Adding a new data type (buses) to the train station graph

Where the train station data is represented with a graph, the new information on buses can be added directly to the existing database as a new node type.

There is no requirement for a new table, and no need to duplicate each station node to represent the required information; the existing train nodes can be directly linked to new Bus nodes. This holds for any new data type that would require the addition of a new table in a traditional RDB.

In a graph, where new data could be represented in an equivalent RDB as a new column in an existing table, this may be a good candidate for a node property, as opposed to a new node type.

Here, an example suitable for being represented as a node property would be a code for each train station, where stations and their codes have a 1-to-1 relationship.

A comparison, in short, is captured in the following:

  • RDBs have a rigid data format and a new table must be added for a new type of data. GDBs are more flexible when it comes to the format of the data and can be extended with new node types.
  • RDBs can be queried via path-based queries – for example, how many steps between two people in a friend network, which involves multiple joins and can be extremely slow as the paths become longer. GDBs query paths directly, with no join operations, so information retrieval is more streamlined and quite frankly faster.

In summary, where the use case for a database concerns querying many relationships between objects – that is, paths – or when a flexible data schema is needed, a graph data model is likely to be a good fit to represent your data.

The use of graphs across various industries

Graph data science is prevalent across a wide array of industries.

The main areas where graphs are being used effectively are as follows:

  • Finance: To look at fraud detection and portfolio risk.
  • Government: To aid with intelligence profiling and supply chain analytics.
  • Life sciences: For looking at patient journeys through a hospital (the transition of a patient through various services), drug response rates, and the transition of infections through a population.
  • Network & IT: Security networks and user access management (nodes on a network represent each user logging into a network).
  • Telecoms: Through network optimization and churn prediction.
  • Marketing: Mainly for customer and market segmentation purposes.
  • Social media analysis: We work for a company that specializes in platform moderation, online harm protection, and brand defense. By creating graphs to defend against attacks on brands, we can find vulnerable people or moderate the most severe type of content.

In terms of graphs in industry, they are pervasive due to the reasons we have already explored in this chapter. The ability to quickly link nodes and edges, and create relationships between them, is the reason why more problems in data science are being modeled as graphs or network science problems. In addition, the underlying data can be queried at a rapid rate. This can be done instead of using traditional database solutions, which, as we have already identified, are slow to query compared to GDBs.

Following this, in the next section, we will introduce the main two driving packages for graph analytics and modeling. We will show you the basic usage of the packages. In the subsequent chapters, we will keep building on why these packages are powerful.

Introduction to NetworkX and igraph

In this chapter, we will introduce two Python packages for creating in-memory graphs: NetworkX and igraph.

NetworkX lets you create graphs, perform graph manipulation, study and visualize their structures, and perform several graph manipulation functions when working with graphs. Their website (https://networkx.org/) contains details of the major changes to the package and the intended usage of the tool.

igraph contains a suite of useful and practical analysis tools, with the aim being to make these efficient and easy to use, in a reproducible way. What is great about igraph is that it is open source and free, plus it supports networks to be built in R, Python, Mathematica, and C/C++. This is our recommended package for creating large networks that can load much more quickly than NetworkX. To read more about igraph, go to https://igraph.org/.

In the following subsections, we will look at the basics of both NetworkX and igraph, with easy-to-follow coding steps. This is the first time you are going to get your hands dirty with graph data modeling.

NetworkX basics

NetworkX is one of the originally available graph libraries for Python and is particularly focused on being user-friendly and Pythonic in its style. It also natively includes methods for calculating some classic network analysis measures:

  1. To import NetworkX into Python, use the following command:
    import networkx as nx
  2. And to create an empty graph, g, use the following command:
    g = nx.Graph()
  3. Now, we need to add nodes to our graph, which can be done using methods of the Graph object belonging to g. There are multiple ways to do this, with the simplest being adding one node at a time:
    g.add_node(Jeremy)
  4. Alternatively, multiple nodes can be added to the graph at once, like so:
    g.add_nodes_from([Mark, Jeremy])
  5. Properties can be added to nodes during creation by passing a node and dictionary tuple to Graph.add_nodes_from:
    g.add_nodes_from([(Mark, {followers: 2100}), (Jeremy, {followers: 130})])
  6. To add an edge to the graph, we can use the Graph.add_edge method, and reference the nodes already present in the graph:
    g.add_edge(Jeremy, Mark)

It is worth noting that, in NetworkX, when adding an edge, any nodes specified as part of that edge not already in the graph will be added implicitly.

  1. To confirm that our graph now contains nodes and edges, we may want to plot it, using matplotlib and networkx.draw(). The with_labels parameter adds the names of the nodes to the plot:
    import matplotlib.pyplot as plt
    nx.draw(g, with_labels=True)
    plt.show()

This section showed you how you can get up and running with NetworkX in a couple of lines of Python code. In the next section, we will turn our focus to the popular igraph package, which allows us to perform calculations over larger datasets much quicker than using the popular NetworkX.

igraph basics

NetworkX, while user-friendly, suffers from slow speeds when using larger graphs. This is due to its implementation behind the scenes and because it is written in Python, with some C, C++, and FORTRAN.

In contrast, igraph is implemented in pure C, giving the library an advantage when working with large graphs and complex network algorithms. While not as immediately accessible as NetworkX for beginners, igraph is a useful tool to have under your belt when code efficiency is paramount.

Initially, working with igraph is very similar to working with NetworkX. Let’s take a look:

  1. To import igraph into Python, use the following command:
    import igraph as ig
  2. And to create an empty graph, g, use the following command:
    g = ig.Graph()

In contrast to NetworkX, in igraph, all nodes have a prescribed internal integer ID. The first node that’s added has an ID of 0, with all subsequent nodes assigned increasing integer IDs.

  1. Similar to NetworkX, changes can be made to a graph by using the methods of a Graph object. Nodes can be added to the graph with the Graph.add_vertices method (note that a vertex is another way to refer to a node). Two nodes can be added to the graph with the following code:
    g.add_vertices(2)
  2. This will add nodes 0 and 1 to the graph. To name them, we have to assign properties to the nodes. We can do this by accessing the vertices of the Graph object. Similar to how you would access elements of a list, each node’s properties can be accessed by using the following notation. Here, we are setting the name and followers attributes of nodes 0 and 1:
    g.vs[0][name] = Jeremy
    g.vs[1][name] = Mark
    g.vs[0][followers] = 130
    g.vs[1][followers] = 2100
  3. Node properties can also be added listwise, where the first list element corresponds to node ID 0, the second to node ID 1, and so on. The following two lines are equivalent to the four lines shown in step 4:
    g.vs["name"] = [Jeremy, Mark]
    g.vs[followers] = [130, 2100]
  4. To add an edge, we can use the Graph.add_edges() method:
    g.add_edges([(0, 1)])

Here, we are only adding one edge, but additional edges can be added to the list parameter required by add_edges. As with NetworkX, if edges are added for nodes that are not currently in the graph, nodes will be created implicitly. However, since igraph requires nodes to have sequential IDs, attempting to add the edge pair (1, 3) to a graph with two vertices will fail.

Summary

In this chapter, we looked at why you should start to think graph, from the benefits of why these methods are becoming the most widely utilized and discussed approaches in various industries. We looked at what a graph is and explained the various types of graphs, such as graphs that are concerned with how things move through a network, to influence graphs (who is influencing who on social media), to graph methods to identify groups and interactions, and how graphs can be utilized to detect patterns in a network.

Moving on from that, we examined the fundamentals of what makes up a graph. Here, we looked at the fundamental elements of nodes, edges, and properties and delved into the difference between an undirected and directed graph. Additionally, we examined the properties of nodes, looked at heterogeneous graphs, and examined the types of schema contained within a graph.

This led to how GDBs compare to legacy RDBs and why, in many cases, graphs are much easier and faster to transverse and query, with examples of how graphs can be utilized to optimize the stops on a train journey and how this can be extended, with ease, to add bus stops as well, as a new data source.

Following this, we looked at how graphs are being deployed across various industries and some use cases for why graphs are important in those industries, such as fraud detection in the finance sector, intelligence profiling in the government sector, patient journeys in hospitals, churn across networks, and customer segmentation in marketing. Graphs truly are becoming ubiquitous across various industries.

We wrapped up this chapter by providing an introduction to the powerhouses of graph analytics and network analysis – igraph and NetworkX. We showed you how, in a few lines of Python code, you can easily start to populate a graph.

In the next chapter, we will look at how to work with and create graph data models. The next chapter will contain many more hands-on examples of how to structure your data using graph data models in Python.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Transform relational data models into graph data model while learning key applications along the way
  • Discover common challenges in graph modeling and analysis, and learn how to overcome them
  • Practice real-world use cases of community detection, knowledge graph, and recommendation network

Description

Graphs have become increasingly integral to powering the products and services we use in our daily lives, driving social media, online shopping recommendations, and even fraud detection. With this book, you’ll see how a good graph data model can help enhance efficiency and unlock hidden insights through complex network analysis. Graph Data Modeling in Python will guide you through designing, implementing, and harnessing a variety of graph data models using the popular open source Python libraries NetworkX and igraph. Following practical use cases and examples, you’ll find out how to design optimal graph models capable of supporting a wide range of queries and features. Moreover, you’ll seamlessly transition from traditional relational databases and tabular data to the dynamic world of graph data structures that allow powerful, path-based analyses. As well as learning how to manage a persistent graph database using Neo4j, you’ll also get to grips with adapting your network model to evolving data requirements. By the end of this book, you’ll be able to transform tabular data into powerful graph data models. In essence, you’ll build your knowledge from beginner to advanced-level practitioner in no time.

Who is this book for?

If you are a data analyst or database developer interested in learning graph databases and how to curate and extract data from them, this is the book for you. It is also beneficial for data scientists and Python developers looking to get started with graph data modeling. Although knowledge of Python is assumed, no prior experience in graph data modeling theory and techniques is required.

What you will learn

  • Design graph data models and master schema design best practices
  • Work with the NetworkX and igraph frameworks in Python Store, query, ingest, and refactor graph data
  • Store your graphs in memory with Neo4j
  • Build and work with projections and put them into practice
  • Refactor schemas and learn tactics for managing an evolved graph data model
Estimated delivery fee Deliver to United States

Economy delivery 10 - 13 business days

Free $6.95

Premium delivery 6 - 9 business days

$21.95
(Includes tracking information)

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Jun 30, 2023
Length: 236 pages
Edition : 1st
Language : English
ISBN-13 : 9781804618035
Category :
Languages :
Concepts :
Tools :

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
OR
Modal Close icon
Payment Processing...
tick Completed

Shipping Address

Billing Address

Shipping Methods
Estimated delivery fee Deliver to United States

Economy delivery 10 - 13 business days

Free $6.95

Premium delivery 6 - 9 business days

$21.95
(Includes tracking information)

Product Details

Publication date : Jun 30, 2023
Length: 236 pages
Edition : 1st
Language : English
ISBN-13 : 9781804618035
Category :
Languages :
Concepts :
Tools :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 134.97
Hands-On Graph Neural Networks Using Python
$49.99
Graph Data Modeling in Python
$44.99
Causal Inference and Discovery in Python
$39.99
Total $ 134.97 Stars icon
Banner background image

Table of Contents

15 Chapters
Part 1: Getting Started with Graph Data Modeling Chevron down icon Chevron up icon
Chapter 1: Introducing Graphs in the Real World Chevron down icon Chevron up icon
Chapter 2: Working with Graph Data Models Chevron down icon Chevron up icon
Part 2: Making the Graph Transition Chevron down icon Chevron up icon
Chapter 3: Data Model Transformation – Relational to Graph Databases Chevron down icon Chevron up icon
Chapter 4: Building a Knowledge Graph Chevron down icon Chevron up icon
Part 3: Storing and Productionizing Graphs Chevron down icon Chevron up icon
Chapter 5: Working with Graph Databases Chevron down icon Chevron up icon
Chapter 6: Pipeline Development Chevron down icon Chevron up icon
Chapter 7: Refactoring and Evolving Schemas Chevron down icon Chevron up icon
Part 4: Graphing Like a Pro Chevron down icon Chevron up icon
Chapter 8: Perfect Projections Chevron down icon Chevron up icon
Chapter 9: Common Errors and Debugging Chevron down icon Chevron up icon
Index Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8
(6 Ratings)
5 star 83.3%
4 star 16.7%
3 star 0%
2 star 0%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by




Jacob Silva Mar 07, 2024
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Great book! It quickly teaches a beginner about graph data with easy-to-understand content, hands-on exercises, and provided source code. Then, it touches on advanced topics like projections on graph data. I am happy we picked this book for our Tech Book Club
Amazon Verified review Amazon
Amanda Fetch Sep 26, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
As someone who is currently working with a knowledge graph based solution within an organization, it is imperative that I find solid resources to cover both the basics and more advanced concepts of graph data modeling. "Graph Data Modeling in Python" starts with the basics of graph structures and their real-world applications, then the book systematically covers practical use-cases from making recommendations using Python to designing complex pipelines with Neo4j and Cypher. Its in-depth coverage on transforming relational databases to graph databases, building knowledge graphs, and addressing common errors I found very useful. A perfect blend of theory and hands-on examples makes it an indispensable resource for anyone venturing into the world of graph databases. Highly recommended for both beginners and seasoned professionals!
Amazon Verified review Amazon
Jacob Vandergriff Sep 28, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Graph Data Modeling in Python provides a great introduction for those familiar with python and relational databases.The flow from relational databases, to python graph modeling, to scaling up to Neo4j databases was exceptional and can get any beginner to a graph developer.
Amazon Verified review Amazon
Om S Jul 05, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
The book begins by highlighting the growing importance of graphs in our daily lives, driving various applications such as social media, recommendation systems, and fraud detection. Readers are introduced to the benefits of graph data models, which enhance efficiency and uncover hidden insights through complex network analysis.With a focus on practical use cases, readers learn how to design optimal graph models capable of supporting a wide range of queries and features. The transition from traditional relational databases to dynamic graph data structures is seamlessly addressed, empowering readers to unlock the power of path-based analyses. The book also covers working with Neo4j for persistent graph database management.By the end of the book, readers will possess the skills to transform tabular data into powerful graph data models, ranging from beginner to advanced-level proficiency. They will have a deep understanding of schema design best practices, proficiency in NetworkX and igraph frameworks, and the ability to store, query, ingest, and refactor graph data."Graph Data Modeling in Python: A Practical Guide to Curating, Analyzing, and Modeling Data with Graphs" is a must-read for those interested in graph databases and data modeling. The book caters to individuals with prior knowledge of Python, while also being accessible to beginners in graph data modeling. Its practical approach, real-world examples, and coverage of common errors and debugging make it an essential companion for those looking to extract valuable insights from graph data.
Amazon Verified review Amazon
SEAN W. GRANT Aug 18, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This was a good book to read that helped in understanding how to build a graph from various sources (rdbms tables, NLP outputs). It showed how you can use some common python libraries (NLTK, Spacy) to process some data and create a graph from it. There were a couple chapters that I especially liked: Chapter 4 and 7.Chapter 4- Building Knowledge Graphs: Author does a great job of explaining what a knowledge graph is, how to design one and then walk through how to build one. Knowledge graphs are becoming a popular topic and structure that businesses are using to provide better understanding and this chapter will help you understand how you can start looking at your data to build a knowledge graph.Chapter 7-Refactoring and Evolving Schemas: I found this to be a great explanation of power of the graph database and how it can adjust fast to your needs without causing a problem with live data. The hands-on examples were great at showing how easy it is to change the schema in the Neo4j database.Overall this was a great book for getting your hands dirty on designing, understand and building graphs that will help you think about how connected data really is.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is the delivery time and cost of print book? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela
What is custom duty/charge? Chevron down icon Chevron up icon

Customs duty are charges levied on goods when they cross international borders. It is a tax that is imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order? Chevron down icon Chevron up icon

The orders shipped to the countries that are listed under EU27 will not bear custom charges. They are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea:

A custom duty or localized taxes may be applicable on the shipment and would be charged by the recipient country outside of the EU27 which should be paid by the customer and these duties are not included in the shipping charges been charged on the order.

How do I know my custom duty charges? Chevron down icon Chevron up icon

The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.

For example:

  • If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
  • Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.
How can I cancel my order? Chevron down icon Chevron up icon

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction id. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you then when you receive it, you can contact us at customercare@packt.com using the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except for the cases described in our Return Policy (i.e. Packt Publishing agrees to replace your printed book because it arrives damaged or material defect in book), Packt Publishing will not accept returns.

What is your returns and refunds policy? Chevron down icon Chevron up icon

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact Customer Relations Team on customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
  2. Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
  3. You will have a choice of replacement or refund of the problem items.(damaged, defective or incorrect)
  4. Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged, with book material defect, contact our Customer Relation Team on customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner which is on a print-on-demand basis.

What tax is charged? Chevron down icon Chevron up icon

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use? Chevron down icon Chevron up icon

You can pay with the following card types:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal
What is the delivery time and cost of print books? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela