In this chapter, we introduced Apache Spark and its architecture. We discussed the concepts of the driver program and the executors, which are the core components of Spark.
We then briefly discussed the different programming APIs for Spark and its major components, including Spark Core, Spark SQL, Spark Streaming, and GraphX.
Finally, we discussed some major differences between Spark and Hadoop and how the two complement each other. In the next chapter, we will install Spark on an AWS EC2 instance and walk through the different clients for interacting with Spark.