You're reading from Apache Spark 2.x Cookbook Over 70 cloud-ready recipes for distributed Big Data processing and analytics

Product type Paperback

Published in May 2017

ISBN-13 9781787127265

Length 294 pages

Edition 1st Edition

Languages Scala

Tools Apache Spark

Concepts Big Data

Author (1):

Rishi Yadav

Mesosphere provides a binary distribution of Mesos. The most recent package of the Mesos distribution can be installed from the Mesosphere repositories by performing the following steps:

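Add the Mesosphere repository key and package source on Ubuntu (this example uses the trusty release). DISTRO and CODENAME hold the values that appear in the deb line, ubuntu and trusty here; the last line is the content to put in mesosphere.list:

        $ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
        $ DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
        $ CODENAME=$(lsb_release -cs)
        $ sudo vi /etc/apt/sources.list.d/mesosphere.list
        deb http://repos.mesosphere.io/ubuntu trusty main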
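
The repository line can also be generated from the distribution ID and codename rather than typed by hand. The following is a minimal sketch, not part of the recipe; the function name mesosphere_repo_line is hypothetical:

```shell
# Hypothetical helper (not part of Mesos): print the apt source line for the
# Mesosphere repository, given a distribution ID and a release codename.
mesosphere_repo_line() {
  local distro codename
  # Lowercase the distribution ID, mirroring the tr call in the recipe.
  distro=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  codename=$2
  printf 'deb http://repos.mesosphere.io/%s %s main\n' "$distro" "$codename"
}

# Example with fixed arguments:
mesosphere_repo_line Ubuntu trusty
# prints: deb http://repos.mesosphere.io/ubuntu trusty main
```

On a real machine, the arguments could come from "$(lsb_release -is)" and "$(lsb_release -cs)", and the output could be piped to sudo tee /etc/apt/sources.list.d/mesosphere.list instead of opening an editor.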

Update the repositories:

        $ sudo apt-get -y update

Install Mesos:

        $ sudo apt-get -y install mesos

To integrate Spark with Mesos, make the Spark binaries available to Mesos and configure the Spark driver to connect to Mesos.
Use the Spark binaries from the first recipe and upload them to HDFS:

        $ hdfs dfs -put spark-2.1.0-bin-hadoop2.7.tgz spark-2.1.0-bin-hadoop2.7.tgz

For a single-master Mesos cluster, the master URL is mesos://host:5050; for a ZooKeeper-managed Mesos cluster, it is mesos://zk://host:2181. Here, host stands for the hostname of the Mesos master or the ZooKeeper server.
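
The two URL forms differ only in the prefix and the default port. As a rough illustration, the hypothetical helper below (mesos_master_url is not a Spark or Mesos command) assembles either form from a mode and a host:

```shell
# Hypothetical helper: build a Spark master URL for Mesos.
# Usage: mesos_master_url single HOST [PORT]   (PORT defaults to 5050)
#        mesos_master_url zk HOST [PORT]       (PORT defaults to 2181)
mesos_master_url() {
  local mode=$1 host=$2 port=${3:-}
  case "$mode" in
    single) printf 'mesos://%s:%s\n' "$host" "${port:-5050}" ;;
    zk)     printf 'mesos://zk://%s:%s\n' "$host" "${port:-2181}" ;;
    *)      echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

mesos_master_url single host   # prints: mesos://host:5050
mesos_master_url zk host       # prints: mesos://zk://host:2181
```

The result can be passed to setMaster in a program or to the --master option of spark-shell.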
Set the following variables in spark-env.sh:

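        $ sudo vi spark-env.sh
        # The Spark 2.x documentation uses MESOS_NATIVE_JAVA_LIBRARY; MESOS_NATIVE_LIBRARY is the older name
        export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
        export SPARK_EXECUTOR_URI=hdfs://localhost:9000/user/hduser/spark-2.1.0-bin-hadoop2.7.tgz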
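
The relative destination used in the upload step resolves against the user's HDFS home directory, /user/<username>, which is where the path in SPARK_EXECUTOR_URI comes from. The sketch below shows that mapping, assuming the NameNode address localhost:9000 and the user hduser from the export above; the function name hdfs_home_uri is hypothetical:

```shell
# Hypothetical helper: build the full HDFS URI of a file uploaded with a
# relative destination, which HDFS resolves against /user/<username>.
hdfs_home_uri() {
  local namenode=$1 user=$2 file=$3
  printf 'hdfs://%s/user/%s/%s\n' "$namenode" "$user" "$file"
}

hdfs_home_uri localhost:9000 hduser spark-2.1.0-bin-hadoop2.7.tgz
# prints: hdfs://localhost:9000/user/hduser/spark-2.1.0-bin-hadoop2.7.tgz
```

If the NameNode address or the user differs on your cluster, SPARK_EXECUTOR_URI has to change accordingly.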

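In a Scala program, pass the master URL to SparkConf:

        import org.apache.spark.{SparkConf, SparkContext}
        // "MesosExample" is a placeholder application name; SparkContext requires one to be set
        val conf = new SparkConf().setMaster("mesos://host:5050").setAppName("MesosExample")
        val sparkContext = new SparkContext(conf)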

To use the Spark shell instead, start it with the Mesos master URL:

        $ spark-shell --master mesos://host:5050

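Spark on Mesos has two run modes:

Fine-grained: In the fine-grained mode, every Spark task runs as a separate Mesos task. This was the default in Spark 1.x; it is deprecated as of Spark 2.0.

Coarse-grained: This mode launches only one long-running Spark task on each Mesos machine and schedules Spark's own tasks within it. This is the default in Spark 2.x.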

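To select the coarse-grained mode explicitly, set the spark.mesos.coarse property to true on the SparkConf before creating the SparkContext (false selects the fine-grained mode):

        conf.set("spark.mesos.coarse", "true")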
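
The same property can be passed on the command line with the --conf option of spark-shell, without changing program code. The following is a minimal sketch; the helper name spark_shell_cmd is hypothetical, and it only prints the command rather than running it:

```shell
# Hypothetical helper: print the spark-shell command for a Mesos master with
# the run mode chosen explicitly. It does not start Spark.
# Usage: spark_shell_cmd MASTER_URL [true|false]   (defaults to true)
spark_shell_cmd() {
  local master=$1 coarse=${2:-true}
  printf 'spark-shell --master %s --conf spark.mesos.coarse=%s\n' "$master" "$coarse"
}

spark_shell_cmd mesos://host:5050 true
# prints: spark-shell --master mesos://host:5050 --conf spark.mesos.coarse=true
```

Printing the command first makes it easy to review the master URL and mode before launching the shell against a shared cluster.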
