In the previous chapters, we used short recipes and deliberately simplified code to demonstrate the basic building blocks and concepts of the Spark machine learning library. In this chapter, we present a more developed application that addresses a specific machine learning domain using Spark's API and facilities. There are fewer recipes in this chapter; however, the setting is closer to a real ML application.
In this chapter, we explore the recommendation system and its implementation using a matrix factorization technique called alternating least squares (ALS), which draws on latent factor models. In a nutshell, when we try to factorize a large matrix of user-item ratings into two lower-rank, skinnier matrices, we often face a non-linear or non-convex optimization problem that is very difficult to solve. It happens that we are very good at solving convex optimization problems, and the problem becomes convex once we fix one leg (one of the two factor matrices). We therefore solve for the other leg, then swap roles and go back and forth (hence alternating); we...
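To make the alternating idea concrete, here is a minimal sketch in plain Python rather than Spark. It assumes a small, fully observed, non-zero ratings matrix and a rank of 1, so each leg is a single vector and each half-step has a closed-form least squares solution. The names `als_rank1` and `squared_error` are hypothetical helpers introduced for illustration only; they are not part of Spark's API, and a production recommender would also handle missing ratings, higher ranks, and regularization.

```python
def als_rank1(ratings, iterations=20):
    """Approximate ratings[i][j] by u[i] * v[j] using alternating least squares.

    Assumes a fully observed, non-zero matrix given as a list of rows.
    """
    n_users, n_items = len(ratings), len(ratings[0])
    u = [1.0] * n_users  # user leg (arbitrary starting point)
    v = [1.0] * n_items  # item leg (arbitrary starting point)
    for _ in range(iterations):
        # Fix v and solve for u: a convex least squares problem per user.
        vv = sum(x * x for x in v)
        u = [sum(ratings[i][j] * v[j] for j in range(n_items)) / vv
             for i in range(n_users)]
        # Fix u and solve for v: a convex least squares problem per item.
        uu = sum(x * x for x in u)
        v = [sum(ratings[i][j] * u[i] for i in range(n_users)) / uu
             for j in range(n_items)]
    return u, v


def squared_error(ratings, u, v):
    """Sum of squared differences between ratings and the rank-1 product."""
    return sum((ratings[i][j] - u[i] * v[j]) ** 2
               for i in range(len(u)) for j in range(len(v)))


if __name__ == "__main__":
    # Illustrative data only: rows are users, columns are items.
    ratings = [[5.0, 3.0, 1.0], [4.0, 1.0, 1.0], [1.0, 1.0, 5.0]]
    u, v = als_rank1(ratings)
    print("user factors:", u)
    print("item factors:", v)
    print("squared error:", squared_error(ratings, u, v))
```

Because each half-step minimizes the error exactly with the other leg held fixed, the squared error can never increase from one iteration to the next, which is what makes the back-and-forth scheme attractive despite the overall problem being non-convex.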