Getting Breeze – the linear algebra library
In simple terms, Breeze (http://www.scalanlp.org) is a Scala library that extends the Scala collection library with support for vectors and matrices, along with a large set of functions for manipulating them. We could safely compare Breeze to NumPy (http://www.numpy.org/) in the Python world. Breeze forms the foundation of MLlib, the Machine Learning library in Spark, which we will explore in later chapters.
In this first recipe, we will see how to pull the Breeze libraries into our project using the Simple Build Tool (SBT). We will also look at a brief history of Breeze to better appreciate why it could be considered the "go to" linear algebra library in Scala.
Note
For all our recipes, we will be using Scala 2.10.4 along with Java 1.7. I wrote the examples using the Scala IDE, but please feel free to use your favorite IDE.
How to do it...
Let's add the Breeze dependencies to our build.sbt so that we can start playing with them in the subsequent recipes. There are just two Breeze dependencies: breeze (the core) and breeze-natives.
- Under a brand new folder (which will be our project root), create a new file called build.sbt.
- Next, add the breeze libraries to the project dependencies:

    organization := "com.packt"

    name := "chapter1-breeze"

    scalaVersion := "2.10.4"

    libraryDependencies ++= Seq(
      "org.scalanlp" %% "breeze" % "0.11.2",
      // Optional - the 'why' is explained in the There's more... section
      "org.scalanlp" %% "breeze-natives" % "0.11.2"
    )

- From that folder, issue an sbt compile command in order to fetch all your dependencies.

Note
You could import the project into Eclipse using sbt eclipse after installing the sbteclipse plugin (https://github.com/typesafehub/sbteclipse/). For IntelliJ IDEA, you just need to import the project by pointing to the root folder where your build.sbt file is.
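Once the dependencies resolve, a quick smoke test can confirm that Breeze is actually on the classpath. The following is only a minimal sketch: the object name BreezeSmokeTest and the file location (for example, src/main/scala/BreezeSmokeTest.scala) are assumptions made for illustration and are not part of the recipe. It computes a dot product and multiplies a vector by an identity matrix.

```scala
import breeze.linalg._

// Hypothetical object name, chosen for this illustration only
object BreezeSmokeTest {

  // Dot product of two small dense vectors: 1*4 + 2*5 + 3*6
  def dotProduct(): Double = {
    val a = DenseVector(1.0, 2.0, 3.0)
    val b = DenseVector(4.0, 5.0, 6.0)
    a dot b
  }

  // Multiplying by the identity matrix should give back the same vector
  def identityTimes(v: DenseVector[Double]): DenseVector[Double] =
    DenseMatrix.eye[Double](v.length) * v

  def main(args: Array[String]): Unit = {
    println("a dot b = " + dotProduct())
    println("I * v   = " + identityTimes(DenseVector(1.0, 2.0, 3.0)))
  }
}
```

If the file compiles and sbt run prints both lines, the core breeze dependency is wired up correctly. This check says nothing about whether a native BLAS is being used.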
There's more...
Let's look at what the breeze and breeze-natives library dependencies that we added bring to us.
The org.scalanlp.breeze dependency
Breeze has a long history in that it isn't written from scratch in Scala. Without the native dependency, Breeze relies on netlib-java, which bundles a Java-compiled version of the FORTRAN reference implementation of BLAS/LAPACK, along with thin wrappers over that Java-compiled library. This means that we can still work without the native dependency, but performance won't be great: the best we can get out of this FORTRAN-translated library is the performance of the FORTRAN reference implementation itself. For serious number crunching with the best performance, we should add the breeze-natives dependency too.
The org.scalanlp.breeze-natives package
With the native add-on in place, Breeze looks for machine-specific implementations of the BLAS/LAPACK libraries. The good news is that there are open source and (vendor-provided) commercial implementations for most popular processors and GPUs. The most popular open source implementations include ATLAS (http://math-atlas.sourceforge.net) and OpenBLAS (http://www.openblas.net/).
If you are running a Mac, you are in luck: native BLAS libraries come out of the box on Macs. Installing a native BLAS on Ubuntu / Debian involves just running the following commands:

    sudo apt-get install libatlas3-base libopenblas-base
    sudo update-alternatives --config libblas.so.3
    sudo update-alternatives --config liblapack.so.3
For Windows, please refer to the installation instructions on https://github.com/xianyi/OpenBLAS/wiki/Installation-Guide.
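To see whether a native implementation was actually picked up, one option is to ask netlib-java which BLAS and LAPACK classes it loaded. This is a sketch under the assumption that netlib-java is on the classpath as a transitive dependency of breeze; the object name WhichBlas is hypothetical and used only for illustration.

```scala
import com.github.fommil.netlib.{BLAS, LAPACK}

// Hypothetical object name, chosen for this illustration only
object WhichBlas {

  // Fully qualified class name of the BLAS implementation netlib-java selected
  def blasImpl: String = BLAS.getInstance().getClass.getName

  // Same check for LAPACK
  def lapackImpl: String = LAPACK.getInstance().getClass.getName

  def main(args: Array[String]): Unit = {
    println("BLAS:   " + blasImpl)
    println("LAPACK: " + lapackImpl)
  }
}
```

A class name containing F2j would suggest that the pure Java fallback is in use, while a name starting with Native would suggest that a native library was found. Treat this as a rough diagnostic, since the exact class names depend on the netlib-java version.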