Graph algorithms
The Spark GraphX library provides built-in implementations of several popular graph algorithms, which simplify common graph-based analytics. These operations are exposed on a graph through the org.apache.spark.graphx.GraphOps API. In this section, we will run graph-based analytics using the following implementations:
PageRank
PageRank is one of the most popular graph algorithms. It ranks vertices by importance, where the importance of a vertex depends on the edges directed to it: both how many there are and how important their source vertices are. For example, a Twitter user ranks highly when many users, particularly highly ranked ones, follow them, that is, when many directed edges point to that user's vertex.
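The follower example can be sketched with GraphX as follows. This is a minimal illustration rather than a definitive implementation: the four user IDs, the follow edges, the object name PageRankSketch, the helper rankFollowers, the local Spark session, and the tolerance of 0.0001 are all assumptions made for the example. It relies on Graph.fromEdgeTuples to build the graph and on the pageRank method that GraphOps adds to every graph.

```scala
import org.apache.spark.graphx.{Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical example object; names and data are illustrative only.
object PageRankSketch {

  /** Runs PageRank on a toy follower graph and returns
    * (vertexId, rank) pairs sorted by descending rank. */
  def rankFollowers(spark: SparkSession): Array[(VertexId, Double)] = {
    // Each tuple (src, dst) means "user src follows user dst".
    // Users 2, 3 and 4 follow user 1; user 1 follows user 2.
    val follows: RDD[(VertexId, VertexId)] =
      spark.sparkContext.parallelize(Seq(
        (2L, 1L), (3L, 1L), (4L, 1L), (1L, 2L)
      ))

    // Build a graph from the edge list; every vertex gets the attribute 1.
    val graph = Graph.fromEdgeTuples(follows, defaultValue = 1)

    // Iterate until no rank changes by more than the tolerance.
    val ranks = graph.pageRank(tol = 0.0001).vertices

    ranks.collect().sortBy { case (_, rank) => -rank }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PageRankSketch")
      .master("local[*]") // assumption: run locally for the demo
      .getOrCreate()
    try {
      rankFollowers(spark).foreach { case (id, rank) =>
        println(f"user $id%d: $rank%.3f")
      }
    } finally {
      spark.stop()
    }
  }
}
```

In this toy graph, user 1 receives the most incoming edges and should come out on top. User 2 has only one follower but should still rank well above users 3 and 4, because that follower is the highly ranked user 1.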
The PageRank algorithm was developed by Google founders Larry Page and Sergey Brin to measure the importance of web pages. The best-known application of PageRank is therefore the Google search engine, which ranks pages based on their importance. For example, if...