Introducing Data Science
Data science is not a new term. It was coined in the 1960s by Peter Naur, a Danish computer science pioneer, who used it to describe the process of working with data in various fields, including mathematics, statistics, and computer science.
The modern use of the term began to take shape in the 1990s and early 2000s, and data scientist, as a profession, has since become increasingly common across industries.
With the rapid rise of artificial intelligence, one might think that data science has become less relevant.
However, the scientific approach to understanding data, which defines data science, is the bedrock upon which successful machine learning and artificial intelligence-based solutions can be built.
In this book, we will explore these different terms, provide a solid foundation in core statistical and machine learning theory and concepts that apply to statistical, machine learning, and artificial intelligence-based models alike, and walk through how to lead data science teams and projects to successful outcomes.
This first chapter introduces how statistics and data science are intertwined, along with some fundamental concepts in statistics that can help you when working with data.
We will explore the differences between data science, artificial intelligence, and machine learning; explain the relationship between statistics and data science; introduce descriptive and inferential statistics, as well as probability; and cover basic methods for understanding the shape (distribution) of data.
While some readers may find that this chapter covers basic, foundational knowledge, the aim is to give all readers, especially those from less technical backgrounds, a solid understanding of these concepts before diving deeper into the world of data science. For more experienced readers, this chapter serves as a quick refresher and helps establish a common language that will be used throughout the book.
This chapter covers the following topics:
- Data science, AI, and ML – what’s the difference?
- Statistics and data science
- Descriptive and inferential statistics
- Probability
- Describing our samples
- Probability distributions
In the next section, let's look at the terms data science, artificial intelligence, and machine learning: how they are related and how they differ.