Chapter 1, Getting Started with Apache Spark, explains how to install Spark on various environments and cluster managers.
Chapter 2, Developing Applications with Spark, talks about developing Spark applications on different IDEs and using different build tools.
Chapter 3, Spark SQL, takes you through the Spark SQL module that helps you access the Spark functionality using the SQL interface.
Chapter 4, Working with External Data Sources, covers how to read from and write to various data sources.
Chapter 5, Spark Streaming, explores the Spark Streaming library to analyze data from real-time data sources, such as Kafka.
Chapter 6, Getting Started with Machine Learning, introduces machine learning and its basic artifacts, such as vectors and matrices.
Chapter 7, Supervised Learning with MLlib – Regression, walks through supervised learning when the outcome variable is continuous.
Chapter 8, Supervised Learning with MLlib – Classification, discusses supervised learning when the outcome variable is discrete.
Chapter 9, Unsupervised Learning, covers unsupervised learning algorithms, such as k-means.
Chapter 10, Recommendations Using Collaborative Filtering, introduces building recommender systems using various techniques, such as ALS.
Chapter 11, Graph Processing Using GraphX and GraphFrames, talks about various graph processing algorithms using GraphX.
Chapter 12, Optimizations and Performance Tuning, covers various optimizations on Apache Spark and performance tuning techniques.