You're reading from 15 Math Concepts Every Data Scientist Should Know Understand and learn how to apply the math behind data science algorithms

Product type Paperback

Published in Aug 2024

Publisher Packt

ISBN-13 9781837634187

Length 510 pages

Edition 1st Edition

Languages

Python

Tools

NumPy

Concepts

Data Science

Author (1):

David Hoyle

View More author details

Table of Contents (21) Chapters

Preface

1. Part 1: Essential Concepts

2. Chapter 1: Recap of Mathematical Notation and Terminology FREE CHAPTER

3. Chapter 2: Random Variables and Probability Distributions

4. Chapter 3: Matrices and Linear Algebra

5. Chapter 4: Loss Functions and Optimization

6. Chapter 5: Probabilistic Modeling

7. Part 2: Intermediate Concepts

8. Chapter 6: Time Series and Forecasting

9. Chapter 7: Hypothesis Testing

10. Chapter 8: Model Complexity

11. Chapter 9: Function Decomposition

12. Chapter 10: Network Analysis

13. Part 3: Selected Advanced Concepts

14. Chapter 11: Dynamical Systems

15. Chapter 12: Kernel Methods

16. Chapter 13: Information Theory

17. Chapter 14: Non-Parametric Bayesian Methods

18. Chapter 15: Random Matrices

19. Index

Why subscribe?

20. Other Books You May Enjoy

Likelihood

To understand the probability distribution that the data follows, we’ll look at an explicit example of how a random component is incorporated into data.

A simple probabilistic model

We’ll start with the simplest way in which we can introduce a random component into our observations of the response (target) variable <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi>y</mml:mi></mml:math> , namely by adding noise to a deterministic quantity. In fact, we’ll just consider the observations in our dataset to be noise-corrupted versions of a model output <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mover><mi>y</mi><mo stretchy="true">ˆ</mo></mover><mfenced open="(" close=")"><mrow><msub><munder><mi>x</mi><mo stretchy="true">_</mo></munder><mi>i</mi></msub><mo>|</mo><munder><mi>β</mi><mo stretchy="true">_</mo></munder></mrow></mfenced></mrow></mrow></math> . So, we have this relationship:

Eq. 1

Here, <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:mi>ε</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math> is the noise value that has been added to the model output to get the observation <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math> for the datapoint. The value is a random variable. Without loss of generality, we can assume its expectation value is zero, so we have <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mi mathvariant="double-struck">E</mml:mi><mml:mfenced separators="|"><mml:mrow><mml:msub><mml:mrow><mml:mi>ε</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfenced><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math> . We can make this assumption because if the expectation of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:mi>ε</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math> was non-zero, it would mean we have a non-zero deterministic average contribution from <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:msub><mml:mrow><mml:mi>ε</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math> that we could just absorb into the definition of ...