Massive graphs on big data
Big data is a huge volume of data distributed across a cluster of thousands of machines, if not more. Building a graph on data of this scale brings its own challenges. Because the underlying data is spread across the cluster, the graph is not a single-machine graph: its vertices and edges live on different machines, and the graph as a whole will not fit into the memory of any one machine. Consider your friends list on Facebook; the data for some of your friends might lie on different machines, and the total amount of data can be tremendous. Look at the following example diagram of a graph of 10 Facebook friends and their network:
As you can see in the preceding diagram, for just 10 friends the data can be huge...
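To make the idea of a graph spread across machines more concrete, here is a minimal Python sketch. It assumes a toy friendship graph with integer user IDs and a hypothetical placement rule (user ID modulo the number of machines); the names `owner`, `partition_edges`, and `count_cut_edges` are made up for this illustration and do not come from any particular framework. Real distributed graph systems use more sophisticated partitioning strategies, so treat this only as an illustration of the problem.

```python
from collections import defaultdict


def owner(vertex_id, num_machines):
    # Hypothetical placement rule (an assumption for this sketch):
    # a vertex lives on machine (vertex_id mod num_machines).
    return vertex_id % num_machines


def partition_edges(edges, num_machines):
    """Store each edge on the machine that owns its source vertex."""
    partitions = defaultdict(list)
    for src, dst in edges:
        partitions[owner(src, num_machines)].append((src, dst))
    return dict(partitions)


def count_cut_edges(edges, num_machines):
    """Count edges whose two endpoints live on different machines."""
    return sum(
        1
        for src, dst in edges
        if owner(src, num_machines) != owner(dst, num_machines)
    )


# Toy friendship graph: six users (0 to 5), each pair is one friendship.
friendships = [(0, 1), (0, 2), (1, 3), (2, 4), (3, 5), (4, 5)]

partitions = partition_edges(friendships, num_machines=3)
cut = count_cut_edges(friendships, num_machines=3)

for machine in sorted(partitions):
    print(f"machine {machine}: {partitions[machine]}")
print(f"{cut} of {len(friendships)} friendships cross machine boundaries")
```

In this tiny example every friendship connects users stored on different machines, so following even a single edge means fetching data from another machine. That is the core difficulty of a graph that spans a cluster, and it only grows as the number of friends and machines increases.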