Introduction
Linear algebra is the cornerstone of machine learning (ML) and mathematical programming (MP). When working with Spark's machine learning library, you must understand that the Vector/Matrix structures provided by Scala (imported by default) are different from the Vector and Matrix facilities provided by Spark ML and MLlib. The latter, powered by RDDs, are the data structures to use if you want Spark's parallelism out of the box for large-scale matrix/vector computation (for example, alternative SVD implementations with greater numerical accuracy, which are desirable in some cases for derivatives pricing and risk analytics). The Scala Vector/Matrix libraries provide a rich set of linear algebra operations, such as dot product and addition, and still have their own place in an ML pipeline. In summary, the key difference between using Scala Breeze and Spark or Spark ML is that the Spark facility is backed by RDDs, which allow for distributed, concurrent computation and resiliency...
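To make the distinction concrete, the following is a minimal sketch rather than a definitive recipe. It assumes Spark 2.x with spark-mllib on the classpath (which brings Breeze in as a transitive dependency) and a local master. The object name VectorKinds, the helper names, and the sample values are hypothetical and chosen only for illustration.

```scala
import breeze.linalg.{DenseVector => BDV}
import org.apache.spark.mllib.linalg.{Vector => SparkVector, Vectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.sql.SparkSession

// Hypothetical illustration object; not part of any Spark or Breeze API.
object VectorKinds {

  // Plain Scala collection: scala.collection.immutable.Vector, in scope by default.
  // It is a general-purpose sequence, not a linear algebra type.
  val scalaVec: Vector[Double] = Vector(1.0, 2.0, 3.0)

  // Breeze: local (single-JVM) linear algebra with operators such as dot and +.
  def localDot(): Double = {
    val a = BDV(1.0, 2.0, 3.0)
    val b = BDV(4.0, 5.0, 6.0)
    a dot b // 1*4 + 2*5 + 3*6 = 32.0
  }

  // Spark MLlib local vector: the element type that distributed matrices are built from.
  val sparkVec: SparkVector = Vectors.dense(1.0, 2.0, 3.0)

  // RDD-backed matrix: rows are partitioned across the cluster (here, local threads).
  def distributedShape(spark: SparkSession): (Long, Long) = {
    val rows = spark.sparkContext.parallelize(Seq(
      Vectors.dense(1.0, 2.0, 3.0),
      Vectors.dense(4.0, 5.0, 6.0)
    ))
    val mat = new RowMatrix(rows)
    (mat.numRows(), mat.numCols())
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode is enough for the demonstration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("VectorKinds")
      .getOrCreate()

    println(s"Breeze dot product: ${localDot()}")
    println(s"Spark local vector size: ${sparkVec.size}")
    println(s"RowMatrix shape: ${distributedShape(spark)}")

    spark.stop()
  }
}
```

Note the renamed import (Vector => SparkVector): without it, Spark's Vector would shadow the Scala Vector that is in scope by default, which is a common source of confusing compile errors when the two are mixed in one file.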