Preface
In this smart age, data analytics is the key to sustaining and promoting business growth. Every business is trying to leverage its data as much as possible, using all sorts of data science tools and techniques to progress along the analytics maturity curve. This sudden rise in data science requirements is the obvious reason for the scarcity of data scientists. It is very difficult to meet the market demand with unicorn data scientists who are experts in statistics, machine learning, and mathematical modelling as well as programming.
The availability of unicorn data scientists is only going to decrease as market demand increases. A solution was therefore needed that not only empowers unicorn data scientists to do more, but also creates what Gartner calls “Citizen Data Scientists”. Citizen data scientists are developers, analysts, BI professionals, or other technologists whose primary job function lies outside statistics or analytics but who are passionate enough to learn data science. They are becoming the key enablers in democratizing data analytics across organizations and industries as a whole.
There is an ever-growing plethora of tools and techniques designed to facilitate big data analytics at scale. This book is an attempt to create citizen data scientists who can leverage Apache Spark’s distributed computing platform for data analytics.
This book is a practical guide to learning statistical analysis and machine learning in order to build scalable data products. It helps you master the core concepts of both data science and Apache Spark so that you can get a jump start on any real-life data analytics project. Every chapter is supported by examples that can be executed on a home computer, so that readers can easily follow and absorb the concepts. Each chapter attempts to be self-contained, with pointers to relevant chapters for details, so that the reader can start from any chapter. While the chapters start from the basics so that a beginner can learn and comprehend them, they are comprehensive enough for senior architects as well.