Summary
In this chapter, I gave my perspective on data science as a developer and explained why I think that data science, along with AI and Cloud, has the potential to define the next era of computing. I also discussed the many problems that must be addressed before it can fully realize that potential. While this book doesn't pretend to provide a magic recipe that solves all these problems, it does try to answer the difficult but critical question of how to democratize data science and, more specifically, how to bridge the gap between data scientists and developers.
In the next few chapters, we'll dive into the PixieDust open source library and learn how it can help Jupyter Notebook users be more efficient when working with data. We'll also take a deep dive into the PixieApp application development framework, which enables developers to leverage the analytics implemented in the Notebook to build applications and dashboards.
In the remaining chapters, we will take a deep dive into many examples that show how data scientists and developers can collaborate effectively to build end-to-end data pipelines, iterate on the analytics, and deploy them to end users in a fraction of the time. The sample applications will cover many industry use cases, such as image recognition, social media, and financial data analysis, which in turn include data science use cases like descriptive analytics, machine learning, natural language processing, and streaming data.
We will not discuss in depth the theory behind all the algorithms covered in the sample applications (which is beyond the scope of this book and would take more than one book to cover). Instead, we will emphasize how to leverage the open source ecosystem to rapidly complete the task at hand (model building, visualization, and so on) and operationalize the results into applications and dashboards.
Note
The provided sample applications are written mostly in Python and come with complete source code. The code has been extensively tested and is ready to be reused and customized in your own projects.