Creating dense matrix and setup with Spark 2.0
In this recipe, we explore matrix creation examples that you will most likely need in your Scala programming and when reading the source code of many open source machine learning libraries.
Spark provides two distinct types of local matrix facilities (dense and sparse) for storage and manipulation of data at a local level. For simplicity, one way to think of a matrix is to visualize it as columns of Vectors.
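To make the "columns of Vectors" picture concrete, here is a minimal sketch that builds one dense and one sparse local matrix with the `Matrices` factory in `org.apache.spark.mllib.linalg`. The object name `LocalMatrixSketch` and the sample values are illustrative assumptions rather than part of the recipe, and the sketch assumes the Spark MLlib JAR is on the classpath.

```scala
import org.apache.spark.mllib.linalg.{Matrices, Matrix}

// Hypothetical object name, used only for this illustration.
object LocalMatrixSketch {
  // Dense 2 x 3 matrix. Values are supplied in column-major order:
  // column 0 = (1.0, 2.0), column 1 = (3.0, 4.0), column 2 = (5.0, 6.0)
  val dense: Matrix = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))

  // Sparse 2 x 3 matrix in compressed sparse column (CSC) form.
  // colPtrs = (0, 1, 1, 2): column 0 holds one entry, column 1 is empty,
  // column 2 holds one entry. rowIndices and values list the non-zeros.
  val sparse: Matrix =
    Matrices.sparse(2, 3, Array(0, 1, 1, 2), Array(0, 1), Array(9.0, 8.0))

  def main(args: Array[String]): Unit = {
    println(dense(0, 1))  // element at row 0, column 1 of the dense matrix
    println(sparse(1, 2)) // element at row 1, column 2 of the sparse matrix
    println(s"${dense.numRows} x ${dense.numCols}")
  }
}
```

Because the values array is read column by column, the third value (3.0) lands at row 0, column 1 rather than at row 1, column 0. This is a common source of confusion when coming from row-major array layouts.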
Getting ready
The key to remember here is that the recipe covers local matrices stored on one machine. We will use another recipe, Distributed matrices in the Spark2.0 ML library, covered in this chapter, for storing and manipulating distributed matrices.
How to do it...
- Start a new project in IntelliJ or in an IDE of your choice. Make sure that the necessary JAR files are included.
- Import the necessary packages for vector and matrix manipulation:
import org.apache.spark.sql.{SparkSession}
import org.apache.spark.mllib.linalg._
import breeze...