Exploring distributed BlockMatrix in Spark 2.0
In this recipe, we explore BlockMatrix, which is a nice abstraction and a placeholder for blocks of other matrices. In short, it is a matrix of other matrices (matrix blocks), each of which can be accessed as a cell.
We take a look at a simplified code snippet that converts a CoordinateMatrix to a BlockMatrix, runs a quick check of its validity, and accesses one of its properties to show that it was set up properly. BlockMatrix code takes longer to set up, and showing its properties in action requires a real-life application, which we do not have enough space for here.
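As a rough preview of the conversion described above, the following minimal sketch builds a small CoordinateMatrix, converts it to a BlockMatrix, validates it, and reads a few of its properties. The object name BlockMatrixSketch, the helper buildBlockMatrix, the 4 x 4 diagonal data, and the 2 x 2 block size are illustrative assumptions and are not part of the recipe's own code.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.mllib.linalg.distributed.{BlockMatrix, CoordinateMatrix, MatrixEntry}

// Hypothetical object and helper names, used only for this sketch.
object BlockMatrixSketch {

  // Builds a small BlockMatrix from made-up data (assumed 4 x 4 diagonal matrix).
  def buildBlockMatrix(spark: SparkSession): BlockMatrix = {
    val sc = spark.sparkContext

    // Each MatrixEntry is (row index, column index, value).
    val entries = sc.parallelize(Seq(
      MatrixEntry(0L, 0L, 1.0),
      MatrixEntry(1L, 1L, 2.0),
      MatrixEntry(2L, 2L, 3.0),
      MatrixEntry(3L, 3L, 4.0)
    ))
    val coordMat = new CoordinateMatrix(entries)

    // Convert to a BlockMatrix made of 2 x 2 blocks (block size chosen arbitrarily).
    coordMat.toBlockMatrix(2, 2).cache()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("BlockMatrixSketch")
      .getOrCreate()

    val blockMat = buildBlockMatrix(spark)

    // validate() throws an exception if the blocks are not set up consistently.
    blockMat.validate()

    // Access a few properties to confirm the matrix was set up as intended.
    println(s"Dimensions: ${blockMat.numRows()} x ${blockMat.numCols()}")
    println(s"Block size: ${blockMat.rowsPerBlock} x ${blockMat.colsPerBlock}")
    println(s"Block grid: ${blockMat.numRowBlocks} x ${blockMat.numColBlocks}")

    spark.stop()
  }
}
```

The sketch runs in local mode so that it can be tried without a cluster; in a real application the block size would be chosen to suit the data and the cluster rather than fixed at 2 x 2.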
How to do it...
- Start a new project in IntelliJ or in an editor of your choice and make sure all the necessary JAR files (Scala and Spark) are available to your application.
- Import the necessary packages for vector and matrix manipulation:
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}
import org.apache.spark...