Summary
In this chapter, you learned about the concept of ML and the different types of ML algorithms. You also learned about some real-world applications of ML that help businesses minimize losses, maximize revenues, and accelerate their time to market. You were introduced to the need for scalable ML and to two different techniques for scaling out ML algorithms. Apache Spark's native ML library, MLlib, was also introduced, along with its major components.
Finally, you learned a few data wrangling techniques to clean, manipulate, and transform data to make it more suitable for the data science process. In the following chapter, you will learn about the second phase of the ML process, feature extraction and feature engineering, where you will apply various scalable algorithms to transform individual data fields to make them even more suitable for data science applications.