In this chapter, we have seen how to put large-scale graph analytics into practice using Spark GraphX. Modeling entity relationships as graphs of vertices and edges is a powerful way to approach many interesting problems.
In GraphX, graphs are finite, directed property graphs that may contain multiple edges between the same pair of vertices as well as loops. GraphX performs graph analytics on highly optimized versions of vertex and edge RDDs, which allows you to build both data-parallel and graph-parallel applications. We have seen how such graphs can be created either by loading them from a file with edgeListFile or by constructing them from other RDDs. On top of that, we have seen how easy it is to generate both random and deterministic graph data for quick experiments. Using just the rich built-in functionality of the Graph model, we have shown how to investigate a graph's core properties. To visualize more complex graphs...