What this book covers
Chapter 1, Getting Started with Apache Spark, explains how to install Spark in various environments and on various cluster managers.
Chapter 2, Developing Applications with Spark, covers developing Spark applications in different IDEs and with different build tools.
Chapter 3, External Data Sources, covers how to read from and write to various data sources.
Chapter 4, Spark SQL, takes you through the Spark SQL module, which lets you access Spark functionality through a SQL interface.
Chapter 5, Spark Streaming, explores the Spark Streaming library for analyzing data from real-time data sources, such as Kafka.
Chapter 6, Getting Started with Machine Learning Using MLlib, introduces machine learning and basic building blocks such as vectors and matrices.
Chapter 7, Supervised Learning with MLlib – Regression, walks through supervised learning when the outcome variable is continuous.
Chapter 8, Supervised Learning with MLlib – Classification, discusses supervised learning when the outcome variable is discrete.
Chapter 9, Unsupervised Learning with MLlib, covers unsupervised learning algorithms such as k-means.
Chapter 10, Recommender Systems, introduces building recommender systems using various techniques, such as ALS.
Chapter 11, Graph Processing Using GraphX, covers various graph processing algorithms using GraphX.
Chapter 12, Optimizations and Performance Tuning, covers various optimizations and performance tuning techniques for Apache Spark.