Predicting Future Edges
Link prediction (LP) is a key topic in Graph Data Science (GDS) because it is a problem specific to graphs. While classification can be performed on many kinds of datasets, not only graphs, LP requires links, which means the data must be a graph. Its applications are nonetheless wide-ranging: from understanding the dynamics of social networks to product recommendations to criminal network analysis.
This chapter gives you a short introduction to the LP problem. We will define what observations are and how to build the initial dataset. We will also discuss the metrics that can be used to infer the presence of a hidden or future link, and compute them using the GDS library. Finally, we will use a GDS pipeline to build a simple link prediction model, fit it on data stored in Neo4j, and make predictions.
In this chapter, we’re going to cover the following main topics:
- Introducing the LP problem
- LP features...