Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Graph Data Modeling in Python

You're reading from   Graph Data Modeling in Python A practical guide to curating, analyzing, and modeling data with graphs

Arrow left icon
Product type Paperback
Published in Jun 2023
Publisher Packt
ISBN-13 9781804618035
Length 236 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Authors (2):
Arrow left icon
Gary Hutson Gary Hutson
Author Profile Icon Gary Hutson
Gary Hutson
Matt Jackson Matt Jackson
Author Profile Icon Matt Jackson
Matt Jackson
Arrow right icon
View More author details
Toc

Table of Contents (16) Chapters Close

Preface 1. Part 1: Getting Started with Graph Data Modeling
2. Chapter 1: Introducing Graphs in the Real World FREE CHAPTER 3. Chapter 2: Working with Graph Data Models 4. Part 2: Making the Graph Transition
5. Chapter 3: Data Model Transformation – Relational to Graph Databases 6. Chapter 4: Building a Knowledge Graph 7. Part 3: Storing and Productionizing Graphs
8. Chapter 5: Working with Graph Databases 9. Chapter 6: Pipeline Development 10. Chapter 7: Refactoring and Evolving Schemas 11. Part 4: Graphing Like a Pro
12. Chapter 8: Perfect Projections 13. Chapter 9: Common Errors and Debugging 14. Index 15. Other Books You May Enjoy

Introduction to NetworkX and igraph

In this chapter, we will introduce two Python packages for creating in-memory graphs: NetworkX and igraph.

NetworkX lets you create graphs, perform graph manipulation, study and visualize their structures, and perform several graph manipulation functions when working with graphs. Their website (https://networkx.org/) contains details of the major changes to the package and the intended usage of the tool.

igraph contains a suite of useful and practical analysis tools, with the aim being to make these efficient and easy to use, in a reproducible way. What is great about igraph is that it is open source and free, plus it supports networks to be built in R, Python, Mathematica, and C/C++. This is our recommended package for creating large networks that can load much more quickly than NetworkX. To read more about igraph, go to https://igraph.org/.

In the following subsections, we will look at the basics of both NetworkX and igraph, with easy-to-follow coding steps. This is the first time you are going to get your hands dirty with graph data modeling.

NetworkX basics

NetworkX is one of the originally available graph libraries for Python and is particularly focused on being user-friendly and Pythonic in its style. It also natively includes methods for calculating some classic network analysis measures:

  1. To import NetworkX into Python, use the following command:
    import networkx as nx
  2. And to create an empty graph, g, use the following command:
    g = nx.Graph()
  3. Now, we need to add nodes to our graph, which can be done using methods of the Graph object belonging to g. There are multiple ways to do this, with the simplest being adding one node at a time:
    g.add_node(Jeremy)
  4. Alternatively, multiple nodes can be added to the graph at once, like so:
    g.add_nodes_from([Mark, Jeremy])
  5. Properties can be added to nodes during creation by passing a node and dictionary tuple to Graph.add_nodes_from:
    g.add_nodes_from([(Mark, {followers: 2100}), (Jeremy, {followers: 130})])
  6. To add an edge to the graph, we can use the Graph.add_edge method, and reference the nodes already present in the graph:
    g.add_edge(Jeremy, Mark)

It is worth noting that, in NetworkX, when adding an edge, any nodes specified as part of that edge not already in the graph will be added implicitly.

  1. To confirm that our graph now contains nodes and edges, we may want to plot it, using matplotlib and networkx.draw(). The with_labels parameter adds the names of the nodes to the plot:
    import matplotlib.pyplot as plt
    nx.draw(g, with_labels=True)
    plt.show()

This section showed you how you can get up and running with NetworkX in a couple of lines of Python code. In the next section, we will turn our focus to the popular igraph package, which allows us to perform calculations over larger datasets much quicker than using the popular NetworkX.

igraph basics

NetworkX, while user-friendly, suffers from slow speeds when using larger graphs. This is due to its implementation behind the scenes and because it is written in Python, with some C, C++, and FORTRAN.

In contrast, igraph is implemented in pure C, giving the library an advantage when working with large graphs and complex network algorithms. While not as immediately accessible as NetworkX for beginners, igraph is a useful tool to have under your belt when code efficiency is paramount.

Initially, working with igraph is very similar to working with NetworkX. Let’s take a look:

  1. To import igraph into Python, use the following command:
    import igraph as ig
  2. And to create an empty graph, g, use the following command:
    g = ig.Graph()

In contrast to NetworkX, in igraph, all nodes have a prescribed internal integer ID. The first node that’s added has an ID of 0, with all subsequent nodes assigned increasing integer IDs.

  1. Similar to NetworkX, changes can be made to a graph by using the methods of a Graph object. Nodes can be added to the graph with the Graph.add_vertices method (note that a vertex is another way to refer to a node). Two nodes can be added to the graph with the following code:
    g.add_vertices(2)
  2. This will add nodes 0 and 1 to the graph. To name them, we have to assign properties to the nodes. We can do this by accessing the vertices of the Graph object. Similar to how you would access elements of a list, each node’s properties can be accessed by using the following notation. Here, we are setting the name and followers attributes of nodes 0 and 1:
    g.vs[0][name] = Jeremy
    g.vs[1][name] = Mark
    g.vs[0][followers] = 130
    g.vs[1][followers] = 2100
  3. Node properties can also be added listwise, where the first list element corresponds to node ID 0, the second to node ID 1, and so on. The following two lines are equivalent to the four lines shown in step 4:
    g.vs["name"] = [Jeremy, Mark]
    g.vs[followers] = [130, 2100]
  4. To add an edge, we can use the Graph.add_edges() method:
    g.add_edges([(0, 1)])

Here, we are only adding one edge, but additional edges can be added to the list parameter required by add_edges. As with NetworkX, if edges are added for nodes that are not currently in the graph, nodes will be created implicitly. However, since igraph requires nodes to have sequential IDs, attempting to add the edge pair (1, 3) to a graph with two vertices will fail.

You have been reading a chapter from
Graph Data Modeling in Python
Published in: Jun 2023
Publisher: Packt
ISBN-13: 9781804618035
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime