You're reading from 15 Math Concepts Every Data Scientist Should Know: Understand and learn how to apply the math behind data science algorithms

Product type: Paperback

Published in: Aug 2024

Publisher: Packt

ISBN-13: 9781837634187

Length: 510 pages

Edition: 1st Edition

Languages: Python

Tools: NumPy

Concepts: Data Science

Author (1): David Hoyle

Table of Contents (21) Chapters

Preface

1. Part 1: Essential Concepts

2. Chapter 1: Recap of Mathematical Notation and Terminology

3. Chapter 2: Random Variables and Probability Distributions

4. Chapter 3: Matrices and Linear Algebra

5. Chapter 4: Loss Functions and Optimization

6. Chapter 5: Probabilistic Modeling

7. Part 2: Intermediate Concepts

8. Chapter 6: Time Series and Forecasting

9. Chapter 7: Hypothesis Testing

10. Chapter 8: Model Complexity

11. Chapter 9: Function Decomposition

12. Chapter 10: Network Analysis

13. Part 3: Selected Advanced Concepts

14. Chapter 11: Dynamical Systems

15. Chapter 12: Kernel Methods

16. Chapter 13: Information Theory

17. Chapter 14: Non-Parametric Bayesian Methods

18. Chapter 15: Random Matrices

19. Index

20. Other Books You May Enjoy

The Central Limit Theorem

Earlier in the chapter, when we introduced specific continuous-valued distributions, we described the Gaussian, or normal, distribution and said that it is extremely important because it is extremely common. By this, we meant that many of the datasets you encounter will effectively have been drawn from a normal distribution, or you will use a normal distribution to model them. We will now explain why.

Sums of random variables

Many of the quantities we analyze as data scientists are aggregations of other data. Aggregating observations over some dimension is a very natural way to simplify the data.

For example, consider our e-commerce scenario, where we are interested in how many items are sold. We might model the number of items sold on any given day of the year as a binomial random variable, but what about the total for the whole year? Imagine we have a relatively niche website where we only get, say, 20 visitors...
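To make the idea of summing daily sales concrete, here is a minimal simulation sketch using NumPy. It is not taken from the book. The excerpt only mentions 20 visitors, so the rest is assumed for illustration: that the 20 visitors arrive per day, that each buys one item independently with a hypothetical probability of 0.1, that days are independent, and that a year has 365 days. The function name simulate_yearly_sales and all of its parameters are likewise hypothetical.

```python
import numpy as np


def simulate_yearly_sales(n_years=10_000, n_days=365, visitors_per_day=20,
                          p_purchase=0.1, seed=0):
    """Simulate yearly sales totals as sums of independent daily binomial counts.

    All default values are illustrative assumptions, not figures from the text.
    """
    rng = np.random.default_rng(seed)
    # One row per simulated year, one column per day of that year.
    daily_sales = rng.binomial(visitors_per_day, p_purchase,
                               size=(n_years, n_days))
    # Aggregate over the day dimension to get one total per year.
    return daily_sales.sum(axis=1)


yearly = simulate_yearly_sales()

# For a sum of independent variables, the means add and the variances add.
expected_mean = 365 * 20 * 0.1
expected_std = np.sqrt(365 * 20 * 0.1 * (1 - 0.1))

# Standardize the yearly totals so they can be compared with a standard normal.
z = (yearly - expected_mean) / expected_std
within_one_std = np.mean(np.abs(z) <= 1)  # close to 0.68 for a normal
skewness = np.mean(z ** 3)                # close to 0 for a normal

print(f"simulated mean {yearly.mean():.1f}, theoretical {expected_mean:.1f}")
print(f"simulated std  {yearly.std():.2f}, theoretical {expected_std:.2f}")
print(f"fraction within one std: {within_one_std:.3f}")
print(f"skewness of standardized totals: {skewness:.3f}")
```

Under these assumptions, a single day's sales are small, discrete, and noticeably skewed, yet the yearly totals should look close to bell-shaped: roughly two thirds of them fall within one standard deviation of the mean, and the skewness is near zero. Because every day shares the same purchase probability here, the yearly total is itself exactly binomial, so the simulation is only a convenience for seeing the shape emerge.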
