Using sparse local matrices with Spark 2.0
In this recipe, we concentrate on sparse matrix creation. In the previous recipe, we saw how a local dense matrix is declared and stored. Many machine learning problem domains can be represented as a set of features and labels within a matrix. In large-scale machine learning problems (for example, progression of a disease through large population centers, security fraud, political movement modeling, and so on), a good portion of the cells will be 0 or null (for example, the current number of people with a given disease versus the healthy population).
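To help with storage and efficient real-time operation, a sparse local matrix stores only the non-zero cells, as a list of values plus an index that records where each value belongs. Because the zero cells are never stored, the matrix takes less memory, which leads to faster loading and faster real-time operations.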
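The "list plus an index" idea can be sketched in plain Scala. The following is only an illustration under stated assumptions: `CscMatrix` and `CscDemo` are hypothetical names invented here, not Spark classes, and the layout shown is the compressed sparse column (CSC) scheme. Spark's `Matrices.sparse` factory takes the same five arguments (`numRows`, `numCols`, `colPtrs`, `rowIndices`, `values`), so the sketch may help when reading the array arguments in the steps that follow.

```scala
// Hypothetical illustration type, NOT part of Spark: a minimal CSC container.
final case class CscMatrix(
    numRows: Int,
    numCols: Int,
    colPtrs: Array[Int],    // column j owns entries colPtrs(j) until colPtrs(j + 1)
    rowIndices: Array[Int], // row number of each stored entry
    values: Array[Double]   // the stored (non-zero) entries, column by column
) {
  // Look up cell (i, j) by scanning only the stored entries of column j.
  def apply(i: Int, j: Int): Double =
    (colPtrs(j) until colPtrs(j + 1))
      .find(k => rowIndices(k) == i)
      .map(k => values(k))
      .getOrElse(0.0) // anything not stored is an implicit zero
}

object CscDemo {
  // The 3 x 2 matrix
  //   9.0  0.0
  //   0.0  8.0
  //   0.0  6.0
  // Column 0 stores 9.0 (row 0); column 1 stores 8.0 (row 1) and 6.0 (row 2).
  val m: CscMatrix =
    CscMatrix(3, 2, Array(0, 1, 3), Array(0, 1, 2), Array(9.0, 8.0, 6.0))

  def main(args: Array[String]): Unit =
    for (i <- 0 until m.numRows)
      println((0 until m.numCols).map(j => m(i, j)).mkString(" "))
}
```

Only three values and six integers are stored here instead of all six cells. For such a tiny matrix the saving is nil, but the index grows with the number of non-zero cells rather than with the full matrix size, which is where the benefit appears on large, mostly empty data.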
How to do it...
- Start a new project in IntelliJ or in an IDE of your choice. Make sure that the necessary JAR files are included.
- Import the necessary packages for vector and matrix manipulation:
import org...