Summary
As you have seen in this chapter, executing queries against graph structures lets you perform a wide range of powerful data analysis with little effort. With GraphFrames, you can apply the power, simplicity, and performance of the DataFrame API to your graph problems.
For more information on GraphFrames, please refer to the following resources:
- Introducing GraphFrames (http://bit.ly/2dBPhKn)
- On-Time Flight Performance with GraphFrames for Apache Spark (http://bit.ly/2c804ZD)
- On-Time Flight Performance with GraphFrames for Apache Spark (Spark 2.0) Notebook (http://bit.ly/2kPkXkc)
- GraphFrames Overview (http://graphframes.github.io/)
- GraphFrames Python API documentation (http://graphframes.github.io/api/python/graphframes.html)
- GraphX Programming Guide (http://spark.apache.org/docs/latest/graphx-programming-guide.html)
In the next chapter, we will expand our PySpark horizons into the area of deep learning, with a focus on TensorFlow and TensorFrames.