Graph algorithms in GraphX
GraphX supports sophisticated graph processing. You can build your own graph algorithms, but GraphX also ships with a number of them, available directly as methods of Graph or GraphOps objects. The three major algorithms that GraphX provides are:
- PageRank
- Connected components
- Triangle counting
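As a minimal sketch of how these three algorithms are invoked as methods on a graph, the following runs each of them on a tiny made-up graph. The object name, the toyGraph helper, and the edge data are hypothetical and exist only for illustration; the local-mode SparkSession setup is likewise an assumption about how you run it.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}
import org.apache.spark.sql.SparkSession

object BuiltInAlgorithmsSketch {
  // Hypothetical toy graph: a triangle (1, 2, 3) plus a separate pair (4, 5).
  def toyGraph(sc: SparkContext): Graph[Int, Int] = {
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(1L, 3L, 1),
      Edge(4L, 5L, 1)))
    // Edges are written with srcId < dstId and the graph is partitioned,
    // which older GraphX releases require before calling triangleCount.
    Graph.fromEdges(edges, 0).partitionBy(PartitionStrategy.RandomVertexCut)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("builtin-algorithms-sketch")
      .master("local[*]") // assumption: local run for experimentation
      .getOrCreate()
    val graph = toyGraph(spark.sparkContext)

    // PageRank: one Double score per vertex (tolerance 0.001).
    val ranks = graph.pageRank(0.001).vertices
    // Connected components: each vertex is labeled with the lowest
    // vertex ID found in its component.
    val components = graph.connectedComponents().vertices
    // Triangle counting: number of triangles passing through each vertex.
    val triangles = graph.triangleCount().vertices

    ranks.collect().sortBy(_._1).foreach(r => println(s"rank      $r"))
    components.collect().sortBy(_._1).foreach(c => println(s"component $c"))
    triangles.collect().sortBy(_._1).foreach(t => println(s"triangles $t"))

    spark.stop()
  }
}
```

Each call returns a new graph whose vertex attribute holds the result, so the per-vertex values are read from its vertices collection.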
PageRank
PageRank measures the importance of a vertex in a graph. For example, a Twitter user with many followers, or a LinkedIn user with many connections and followers, is often considered influential and ranked highly.
GraphX supports both static and dynamic versions of PageRank: static PageRank runs for a fixed number of iterations, while dynamic PageRank runs until convergence (that is, until the ranks change by less than a specified tolerance from one iteration to the next).
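The two variants can be sketched side by side as follows. The follower graph, the object name, and the helper functions are hypothetical; the iteration count of 10 and the tolerance of 0.0001 are arbitrary values picked for illustration, not recommendations.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.sql.SparkSession

object PageRankModesSketch {
  // Hypothetical "follows" graph: users 2, 3 and 4 follow user 1,
  // and user 4 also follows user 2.
  def followerGraph(sc: SparkContext): Graph[Int, Int] = {
    val edges = sc.parallelize(Seq(
      Edge(2L, 1L, 1), Edge(3L, 1L, 1), Edge(4L, 1L, 1), Edge(4L, 2L, 1)))
    Graph.fromEdges(edges, 0)
  }

  // Returns the ID of the vertex with the highest rank.
  def topVertex(ranks: Graph[Double, Double]): VertexId =
    ranks.vertices.collect().maxBy(_._2)._1

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pagerank-modes-sketch")
      .master("local[*]") // assumption: local run for experimentation
      .getOrCreate()
    val graph = followerGraph(spark.sparkContext)

    // Static PageRank: run exactly 10 iterations, converged or not.
    val staticRanks = graph.staticPageRank(10)

    // Dynamic PageRank: keep iterating until the ranks change by
    // less than the tolerance (0.0001 here).
    val dynamicRanks = graph.pageRank(0.0001)

    println(s"Top vertex (static):  ${topVertex(staticRanks)}")
    println(s"Top vertex (dynamic): ${topVertex(dynamicRanks)}")

    spark.stop()
  }
}
```

Both methods also accept an optional reset probability, which defaults to 0.15. The static form gives a predictable running time, while the dynamic form trades that for a result that is known to have settled within the chosen tolerance.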
Let's look at a very simple example. We have a set of call detail records (dummy data), and we are trying to identify the most important person in the community. There are other metrics...