In this section, you will analyze distributed data and be introduced to Spark, a Scala-based framework for distributed computing. The section also covers several machine learning (ML) concepts, including decision trees, random forests, lasso regression, and k-means clustering.
This section contains the following chapters:
- Chapter 6, Introduction to Spark for Distributed Data Analysis
- Chapter 7, Traditional Machine Learning for Data Analysis