You're reading from 15 Math Concepts Every Data Scientist Should Know: Understand and learn how to apply the math behind data science algorithms

Product type Paperback

Published in Aug 2024

Publisher Packt

ISBN-13 9781837634187

Length 510 pages

Edition 1st Edition

Languages

Python

Tools

NumPy

Concepts

Data Science

Author (1):

David Hoyle

Table of Contents (21) Chapters

Preface

1. Part 1: Essential Concepts

2. Chapter 1: Recap of Mathematical Notation and Terminology

3. Chapter 2: Random Variables and Probability Distributions

4. Chapter 3: Matrices and Linear Algebra

5. Chapter 4: Loss Functions and Optimization

6. Chapter 5: Probabilistic Modeling

7. Part 2: Intermediate Concepts

8. Chapter 6: Time Series and Forecasting

9. Chapter 7: Hypothesis Testing

10. Chapter 8: Model Complexity

11. Chapter 9: Function Decomposition

12. Chapter 10: Network Analysis

13. Part 3: Selected Advanced Concepts

14. Chapter 11: Dynamical Systems

15. Chapter 12: Kernel Methods

16. Chapter 13: Information Theory

17. Chapter 14: Non-Parametric Bayesian Methods

18. Chapter 15: Random Matrices

19. Index

20. Other Books You May Enjoy

Exercises

The following is a series of exercises. Answers to all the exercises are given in the Answers_to_Exercises_Chap4.ipynb Jupyter notebook in the GitHub repository.

Look at the documentation for the scikit-learn class named sklearn.linear_model.LinearRegression, which can fit a linear model using OLS regression. See if you can use it to fit a linear model to the power-plant output data that we analyzed in the code example in the Linear models section of this chapter. Do you get the same parameter estimates as when we used the statsmodels package?
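As a starting point for this exercise, the following is a minimal sketch of fitting an OLS model with sklearn.linear_model.LinearRegression. The power-plant data is not loaded here; the synthetic data, the coefficient values, and the variable names are all assumptions made for illustration. The fit is checked against a direct least-squares solution from NumPy. An OLS fit from statsmodels that includes an intercept term solves the same least-squares problem, so it should agree up to numerical precision.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (an assumption, not the power-plant dataset):
# two features with known intercept and coefficients plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_intercept, true_coefs = 1.5, np.array([2.0, -0.7])
y = true_intercept + X @ true_coefs + rng.normal(scale=0.1, size=200)

# Fit with scikit-learn. The intercept is estimated by default
# (fit_intercept=True), so no column of ones is added to X.
model = LinearRegression()
model.fit(X, y)
sk_params = np.concatenate(([model.intercept_], model.coef_))

# Reference OLS solution: solve the least-squares problem directly,
# this time with an explicit column of ones for the intercept.
X_design = np.column_stack([np.ones(len(X)), X])
ref_params, *_ = np.linalg.lstsq(X_design, y, rcond=None)

print("scikit-learn :", sk_params)
print("least squares:", ref_params)
```

One difference to keep in mind when comparing packages: scikit-learn adds the intercept for you, whereas an explicit design matrix needs its own column of ones.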
The data plotted in Figure 4.3 is stored in the Data/outliers_example.csv file of the GitHub repository. Using the pseudo-Huber loss function in Eq. 12 and a learning rate of , see if you can use the simple gradient descent algorithm to construct robust estimates for both the intercept and the slope for a linear model of the data.
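A possible shape for this exercise is sketched below. It assumes the standard pseudo-Huber loss, $L_\delta(r) = \delta^2\left(\sqrt{1 + (r/\delta)^2} - 1\right)$ with residual $r = y - (a + b x)$, which may differ in scaling from the chapter's Eq. 12. The value of $\delta$, the learning rate, the number of iterations, the use of the mean rather than the summed loss, and the synthetic data (used in place of Data/outliers_example.csv) are all illustrative assumptions; none of them are taken from the exercise.

```python
import numpy as np

def pseudo_huber_grad(residuals, delta):
    """Derivative of delta^2 * (sqrt(1 + (r/delta)^2) - 1) with respect to r."""
    return residuals / np.sqrt(1.0 + (residuals / delta) ** 2)

def fit_line_gd(x, y, delta=1.0, learning_rate=0.5, n_iter=2000):
    """Simple gradient descent on the mean pseudo-Huber loss of a straight line."""
    intercept, slope = 0.0, 0.0
    for _ in range(n_iter):
        residuals = y - (intercept + slope * x)
        psi = pseudo_huber_grad(residuals, delta)
        # Chain rule: d(residual)/d(intercept) = -1, d(residual)/d(slope) = -x
        grad_intercept = -np.mean(psi)
        grad_slope = -np.mean(psi * x)
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return intercept, slope

# Synthetic data (an assumption): a line with intercept 2 and slope 3,
# small Gaussian noise, and 10% of the points shifted upward as outliers.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 100)
y = 2.0 + 3.0 * x + rng.normal(scale=0.2, size=x.size)
y[5::10] += 15.0

robust_intercept, robust_slope = fit_line_gd(x, y)

# Ordinary least squares for comparison; np.polyfit returns [slope, intercept].
ols_slope, ols_intercept = np.polyfit(x, y, 1)

print("pseudo-Huber:", robust_intercept, robust_slope)
print("OLS         :", ols_intercept, ols_slope)
```

Because the derivative of the pseudo-Huber loss is bounded by $\delta$ in magnitude, each outlier can pull the estimates only by a limited amount, whereas under squared loss its pull grows linearly with the residual. On real data with features on a larger scale, a smaller learning rate or rescaled features would likely be needed for the iteration to remain stable.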
The data in the Data/nls_example.csv file of the GitHub repository contains data that has been generated...