Apache Spark 2.x Machine Learning Cookbook

Over 100 recipes to simplify machine learning model implementations with Spark

Product type Paperback
Published in Sep 2017
Publisher Packt
ISBN-13 9781783551606
Length 666 pages
Edition 1st Edition
Authors (5):
Broderick Hall
Meenakshi Rajendran
Shuen Mei
Mohammed Guller
Siamak Amirghodsi

Table of Contents (14 chapters)

Preface
1. Practical Machine Learning with Spark Using Scala
2. Just Enough Linear Algebra for Machine Learning with Spark
3. Spark's Three Data Musketeers for Machine Learning - Perfect Together
4. Common Recipes for Implementing a Robust Machine Learning System
5. Practical Machine Learning with Regression and Classification in Spark 2.0 - Part I
6. Practical Machine Learning with Regression and Classification in Spark 2.0 - Part II
7. Recommendation Engine that Scales with Spark
8. Unsupervised Clustering with Apache Spark 2.0
9. Optimization - Going Down the Hill with Gradient Descent
10. Building Machine Learning Systems with Decision Tree and Ensemble Models
11. Curse of High-Dimensionality in Big Data
12. Implementing Text Analytics with Spark 2.0 ML Library
13. Spark Streaming and Machine Learning Library

Preface

 

Education is not the learning of facts,
but the training of the mind to think.
- Albert Einstein

Data is the new silicon of our age, and machine learning, coupled with biologically inspired cognitive systems, serves as the core foundation to not only enable but also accelerate the birth of the fourth industrial revolution. This book is dedicated to our parents, who, through extreme hardship and sacrifice, made our education possible and taught us to always practice kindness.

The Apache Spark 2.x Machine Learning Cookbook is crafted by four friends with diverse backgrounds, who bring vast experience across multiple industries and academic disciplines. The team has immense experience in the subject matter at hand. The book is as much about friendship as it is about the science underpinning Spark and machine learning. We wanted to put our thoughts together and write a book for the community that not only combines Spark’s ML code with real-world data sets but also provides context-relevant explanations, references, and readings to deepen understanding and promote further research. This book is a reflection of what our team wished we had when we got started with Apache Spark.

My own interest in machine learning and artificial intelligence started in the mid-eighties, when I had the opportunity to read two significant artifacts that happened to be listed back to back in Artificial Intelligence, An International Journal, Volume 28, Number 1, February 1986. While it has been a long journey for engineers and scientists of my generation, fortunately, the advancements in resilient distributed computing, cloud computing, GPUs, cognitive computing, optimization, and advanced machine learning have made a decades-long dream come true. All these advancements have become accessible to the current generation of ML enthusiasts and data scientists alike.

We live in one of the rarest periods in history: a time when multiple technological and sociological trends have converged. The elasticity of cloud computing, with built-in access to ML and deep learning nets, will provide a whole new set of opportunities to create and capture new markets. The emergence of Apache Spark as the lingua franca, or common language, of near real-time resilient distributed computing and data virtualization has given smart companies the opportunity to employ ML techniques at scale without a heavy investment in specialized data centers or hardware.

The Apache Spark 2.x Machine Learning Cookbook is one of the most comprehensive treatments of the Apache Spark machine learning API, with selected subcomponents of Spark to give you the foundation you need to pursue a high-end career in machine learning and Apache Spark. The book is written with the goal of providing clarity and accessibility, and it reflects our own experience (including reading the source code) and learning curve with Apache Spark, which started with Spark 1.0.

The Apache Spark 2.x Machine Learning Cookbook lives at the intersection of Apache Spark, machine learning, and Scala. It is written for developers and data scientists, through the lens of a practitioner who must understand not only the code but also the details, theory, and inner workings of a given Spark ML algorithm or API to establish a successful career in the new economy.

The book takes the cookbook format to a whole new level by blending downloadable, ready-to-run Apache Spark ML code recipes with background, actionable theory, references, research, and real-life data sets to help the reader understand the what, the how, and the why behind the extensive facilities offered by Spark's machine learning library. The book starts by laying the foundations needed to succeed and then rapidly evolves to cover all the meaningful ML algorithms available in Apache Spark.
