Spark for Data Science

Analyze your data and delve deep into the world of machine learning with the latest Spark version, 2.0

Product type: Paperback
Published: Sep 2016
Publisher: Packt
ISBN-13: 9781785885655
Length: 344 pages
Edition: 1st Edition
Authors (2):
Bikramaditya Singhal
Srinivas Duvvuri

Table of Contents (12 chapters)

Preface
1. Big Data and Data Science – An Introduction
2. The Spark Programming Model
3. Introduction to DataFrames
4. Unified Data Access
5. Data Analysis on Spark
6. Machine Learning
7. Extending Spark with SparkR
8. Analyzing Unstructured Data
9. Visualizing Big Data
10. Putting It All Together
11. Building Data Science Applications

Preface

In this smart age, data analytics is the key to sustaining and promoting business growth. Every business is trying to leverage its data as much as possible, using all sorts of data science tools and techniques, to progress along the analytics maturity curve. This sudden rise in data science requirements is the obvious reason for the scarcity of data scientists. It is very difficult to meet the market demand with unicorn data scientists who are experts in statistics, machine learning, and mathematical modelling as well as programming.

The availability of unicorn data scientists is only going to decrease as market demand increases. So, a solution was needed that not only empowers the unicorn data scientists to do more, but also creates what Gartner calls “Citizen Data Scientists”. Citizen data scientists are developers, analysts, BI professionals, or other technologists whose primary job function is outside statistics or analytics but who are passionate enough to learn data science. They are becoming the key enablers in democratizing data analytics across organizations and industries as a whole.

There is an ever-growing plethora of tools and techniques designed to facilitate big data analytics at scale. This book is an attempt to create citizen data scientists who can leverage Apache Spark’s distributed computing platform for data analytics.

This book is a practical guide to learning statistical analysis and machine learning in order to build scalable data products. It helps you master the core concepts of data science and Apache Spark so that you can jump-start any real-life data analytics project. Every chapter is supported by sufficient examples, which can be executed on a home computer, so that readers can easily follow and absorb the concepts. Every chapter attempts to be self-contained, so the reader can start from any chapter, with pointers to relevant chapters for details. While the chapters start from the basics for a beginner to learn and comprehend, the book is comprehensive enough for senior architects at the same time.
