Using the Girvan-Newman algorithm
At the beginning of this chapter, we noticed that the Les Miserables network consisted of a single large connected component and that there were no isolates or smaller “islands” of communities apart from the large connected component. To show how connected components could be useful for identifying communities, we shattered the network by removing a few key nodes.
That approach is not typically ideal. While there is information in both nodes (people, places, things) and edges (relationships), in my experience, it is typically preferable to throw away edges than to throw away nodes.
A better approach than what we did previously would be to identify the least number of edges that could be cut that would result in a split network. We could do this by looking for the edges that the greatest number of shortest paths pass through – that is, the edges with the highest edge_betweenness_centrality
.
That is precisely what the Girvan...