Determining airport ranking using PageRank
Because GraphFrames is built on top of GraphX, there are several algorithms that we can immediately leverage. One of them is PageRank, which was created by Larry Page and popularized by the Google search engine. To quote Wikipedia:
"PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites."
While the preceding quote refers to web pages, the concept applies just as readily to any graph structure, whether it is built from web pages, bike stations, or airports. With GraphFrames, running it is as simple as calling the pageRank method on the graph. The method returns a new GraphFrame whose vertices DataFrame has the PageRank results appended as a new column (named pagerank), which simplifies our downstream analysis.
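To make the idea behind the quote concrete, the following is a minimal pure-Python sketch of PageRank by power iteration. It is not how GraphFrames or GraphX implements the algorithm, and the scale of its scores may differ from what GraphFrames reports. The function name pagerank, its parameters, and the tiny flights edge list are hypothetical and chosen only for illustration; the reset probability of 0.15 is a commonly used value.

```python
def pagerank(edges, reset_probability=0.15, max_iter=20):
    """Illustrative power-iteration PageRank over (src, dst) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    # Start with the rank spread evenly across all nodes.
    ranks = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Every node receives a baseline share from the random "reset" jump.
        new_ranks = {node: reset_probability / n for node in nodes}
        for src in nodes:
            # A node without outgoing edges spreads its rank over all nodes.
            targets = out_links[src] or nodes
            share = (1.0 - reset_probability) * ranks[src] / len(targets)
            for dst in targets:
                new_ranks[dst] += share
        ranks = new_ranks
    return ranks


# Hypothetical flights (origin, destination); not taken from the dataset.
flights = [
    ("SEA", "SFO"), ("SFO", "SEA"),
    ("SEA", "ATL"), ("ATL", "SEA"),
    ("SFO", "ATL"), ("ATL", "SFO"),
    ("BOI", "SEA"), ("BOI", "ATL"),
]

ranks = pagerank(flights)
# Print airports from most to least important.
for airport, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(airport, round(score, 4))
```

In this toy graph, BOI has no inbound flights and so ends up with the lowest score, while SEA and ATL, which receive flights from every other airport, rank above SFO.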
As there are many flights and connections through the various airports included in this dataset, we can use the PageRank algorithm...