You're reading from 15 Math Concepts Every Data Scientist Should Know: Understand and learn how to apply the math behind data science algorithms

Product type Paperback

Published in Aug 2024

Publisher Packt

ISBN-13 9781837634187

Length 510 pages

Edition 1st Edition

Languages

Python

Tools

NumPy

Concepts

Data Science

Author (1):

David Hoyle

Table of Contents (21) Chapters

Preface

1. Part 1: Essential Concepts

2. Chapter 1: Recap of Mathematical Notation and Terminology

3. Chapter 2: Random Variables and Probability Distributions

4. Chapter 3: Matrices and Linear Algebra

5. Chapter 4: Loss Functions and Optimization

6. Chapter 5: Probabilistic Modeling

7. Part 2: Intermediate Concepts

8. Chapter 6: Time Series and Forecasting

9. Chapter 7: Hypothesis Testing

10. Chapter 8: Model Complexity

11. Chapter 9: Function Decomposition

12. Chapter 10: Network Analysis

13. Part 3: Selected Advanced Concepts

14. Chapter 11: Dynamical Systems

15. Chapter 12: Kernel Methods

16. Chapter 13: Information Theory

17. Chapter 14: Non-Parametric Bayesian Methods

18. Chapter 15: Random Matrices

19. Index

20. Other Books You May Enjoy

Sampling from distributions

So far, we’ve learned a lot about random variables and probability distributions, including how to calculate key characteristics of a distribution, such as its mean and variance, and we’ve met some commonly occurring distributions. What we haven’t done yet is say much about data. We’ll now change that.

How datasets relate to random variables and probability distributions

We said at the beginning of this chapter that all data is random. This means when data is captured or generated, we are drawing or sampling values from some underlying probability distribution. This is illustrated schematically in Figure 2.10:

Figure 2.10: Diagram illustrating how real data is generated as samples from a population
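
The idea that a dataset is a set of draws from an underlying distribution can be sketched in NumPy. This is only an illustration under stated assumptions: the population is taken to be a normal distribution, and the names `POPULATION_MEAN`, `POPULATION_STD`, and `draw_sample` are hypothetical choices made for this example, not something taken from the book.

```python
import numpy as np

# Assumed (hypothetical) population: a normal distribution.
# In practice the underlying distribution is unknown.
POPULATION_MEAN = 5.0
POPULATION_STD = 2.0


def draw_sample(n, seed=0):
    """Draw n values from the assumed underlying distribution."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=POPULATION_MEAN, scale=POPULATION_STD, size=n)


# A dataset is a finite collection of such draws.
sample = draw_sample(1000)

# Characteristics computed from the sample, for comparison with the
# population values assumed above.
sample_mean = sample.mean()
sample_var = sample.var(ddof=1)  # ddof=1 gives the unbiased sample variance

print(f"sample size:     {sample.size}")
print(f"sample mean:     {sample_mean:.3f} (population mean: {POPULATION_MEAN})")
print(f"sample variance: {sample_var:.3f} (population variance: {POPULATION_STD ** 2})")
```

Calling `draw_sample` with a different `seed` gives a different dataset from the same population, which is the sense in which captured data is random.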

A sample is finite. It represents a snapshot or subset of the entirety of possible outcomes; for example, a subset of all users who might visit a website. But from...
