In this chapter, we will cover:
- Building a KMeans classification system in Spark 2.0
- Bisecting KMeans, the new kid on the block in Spark 2.0
- Using Gaussian Mixture and Expectation Maximization (EM) in Spark 2.0 to classify data
- Classifying the vertices of a graph using Power Iteration Clustering (PIC) in Spark 2.0
- Using Latent Dirichlet Allocation (LDA) to classify documents and text into topics
- Streaming KMeans to classify data in near real time