15 Math Concepts Every Data Scientist Should Know

Understand and learn how to apply the math behind data science algorithms

Product type: Paperback
Published: Aug 2024
Publisher: Packt
ISBN-13: 9781837634187
Length: 510 pages
Edition: 1st Edition
Author: David Hoyle

Table of Contents

Preface
Part 1: Essential Concepts
  Chapter 1: Recap of Mathematical Notation and Terminology
  Chapter 2: Random Variables and Probability Distributions
  Chapter 3: Matrices and Linear Algebra
  Chapter 4: Loss Functions and Optimization
  Chapter 5: Probabilistic Modeling
Part 2: Intermediate Concepts
  Chapter 6: Time Series and Forecasting
  Chapter 7: Hypothesis Testing
  Chapter 8: Model Complexity
  Chapter 9: Function Decomposition
  Chapter 10: Network Analysis
Part 3: Selected Advanced Concepts
  Chapter 11: Dynamical Systems
  Chapter 12: Kernel Methods
  Chapter 13: Information Theory
  Chapter 14: Non-Parametric Bayesian Methods
  Chapter 15: Random Matrices
Index
Other Books You May Enjoy

What this book covers

Chapter 1, Recap of Mathematical Notation and Terminology, provides a summary of the main mathematical notation you will encounter in this book and that we expect you to already be familiar with.

Chapter 2, Random Variables and Probability Distributions, introduces the idea that all data contains some degree of randomness, and that random variables and their associated probability distributions are the natural way to describe that randomness. The chapter teaches you how to sample from a probability distribution, explains statistical estimators, and introduces the Central Limit Theorem.

Chapter 3, Matrices and Linear Algebra, introduces vectors and matrices as the basic mathematical structures we use to represent and transform data. It then shows how matrices can be broken down into simple-to-understand parts using techniques such as eigen-decomposition and singular value decomposition. The chapter finishes with explanations of how these decomposition methods are applied to principal component analysis (PCA) and non-negative matrix factorization (NMF).

Chapter 4, Loss Functions and Optimization, starts by introducing loss functions, risk functions, and empirical risk functions. The concept of minimizing an empirical risk function to estimate the parameters of a model is explained, before introducing Ordinary Least Squares estimation of linear models. Finally, gradient descent is illustrated as a general technique for minimizing risk functions.
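
As a rough illustration of the ideas named in this summary, the sketch below minimizes an empirical risk (mean squared error) for a straight-line model by gradient descent and compares the result with the closed-form Ordinary Least Squares estimate. It is not taken from the book: the data values, function names, learning rate, and step count are all assumptions chosen for the example.

```python
# Illustrative sketch only: the data, names, and settings are made up.

def empirical_risk(a, b, xs, ys):
    """Mean squared error of the line a + b*x on the sample."""
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, learning_rate=0.05, n_steps=5000):
    """Minimize the empirical risk with plain batch gradient descent."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        residuals = [a + b * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. a and b.
        grad_a = 2.0 * sum(residuals) / n
        grad_b = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b

def ols_closed_form(xs, ys):
    """Closed-form OLS estimate for a one-feature linear model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    return mean_y - b * mean_x, b

# Hypothetical sample, roughly following y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a_gd, b_gd = gradient_descent(xs, ys)
a_ols, b_ols = ols_closed_form(xs, ys)
print("gradient descent:", a_gd, b_gd)
print("closed-form OLS: ", a_ols, b_ols)
```

For a squared-error loss the two estimates should agree closely; gradient descent becomes the more useful tool when no closed-form minimizer exists.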

Chapter 5, Probabilistic Modeling, introduces the concept of building predictive models that explicitly account for the random component within data. The chapter starts by introducing likelihood and maximum likelihood estimation, before introducing Bayes’ theorem and Bayesian inference. The chapter finishes with an illustration of Markov Chain Monte Carlo and importance sampling from the posterior distribution of a model’s parameters.

Chapter 6, Time Series and Forecasting, introduces time series data and the concept of auto-correlation as the main characteristic that distinguishes time series data from other types of data. It then describes the classical ARIMA approach to modeling time series data. The chapter ends with a summary of the concepts behind modern machine learning approaches to time series analysis.

Chapter 7, Hypothesis Testing, explains what hypothesis tests are and why they are important in data science. The general form of a hypothesis test is outlined before the concepts of statistical significance and p-values are explained in depth. Next, confidence intervals and their interpretation are introduced. The chapter ends with an explanation of Type-I and Type-II errors, and power calculations.

Chapter 8, Model Complexity, explains how we describe and quantify model complexity and discusses its impact on the predictive accuracy of a model. The classical bias-variance trade-off view of model complexity is introduced, along with the phenomenon of double descent. The chapter finishes with an explanation of model complexity measures for model selection.

Chapter 9, Function Decomposition, introduces the idea of decomposing or building up a function from a set of simpler basis functions. A general approach is explained first before the chapter moves on to introducing Fourier Series, Fourier Transforms, and the Discrete Fourier Transform.

Chapter 10, Network Analysis, introduces networks, network data, and the concept that a network is a graph. The node-edge description of a graph, along with its adjacency matrix representation, is explained. Next, the chapter describes different types of common graphs and their properties. Finally, the decomposition of a graph into sub-graphs or communities is explained, and various community detection algorithms are illustrated.

Chapter 11, Dynamical Systems, introduces what a dynamical system is and explains how its dynamics are controlled by an evolution equation. The chapter then focuses on discrete Markov processes as these are the most common dynamical systems used by data scientists. First-order discrete Markov processes are explained in depth, before higher-order Markov processes are introduced. The chapter finishes with an explanation of Hidden Markov Models and a discussion of how they can be used in commercial data science applications.
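
To make the idea of an evolution equation concrete, here is a minimal sketch of a first-order discrete Markov process with two states. The transition probabilities and names are hypothetical and are not drawn from the book. Repeatedly applying the update rule moves the state distribution toward the chain's stationary distribution.

```python
# Illustrative sketch only: a two-state chain with made-up probabilities.

def step(dist, transition):
    """One application of the evolution equation: dist_next = dist * P."""
    n = len(transition)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]

# transition[i][j] is the assumed probability of moving from state i to j.
# Each row sums to 1.
transition = [
    [0.9, 0.1],
    [0.5, 0.5],
]

dist = [1.0, 0.0]  # start with certainty in state 0
for _ in range(100):
    dist = step(dist, transition)

print("distribution after 100 steps:", dist)
```

For this particular matrix the stationary distribution can be found by hand from pi = pi * P, which gives (5/6, 1/6), and the iterated distribution should settle close to it.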

Chapter 12, Kernel Methods, starts by introducing inner-product-based learning algorithms, then moves on to explaining kernels and the kernel trick. The chapter ends with an illustration of a kernelized learning algorithm. Throughout the chapter, we emphasize how the kernel trick allows us to implicitly and efficiently construct new features and thereby uncover any non-linear structure present in a dataset.

Chapter 13, Information Theory, introduces the concept of information and how it is measured mathematically. The main information theory concepts of entropy, conditional entropy, mutual information, and relative entropy are then explained, before practical uses of the Kullback-Leibler divergence are illustrated.
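
The quantities named in this summary are short to compute for discrete distributions. The sketch below, which is an illustration rather than code from the book, evaluates the entropy and the Kullback-Leibler divergence (relative entropy) in bits for two hypothetical coin distributions. It assumes q is non-zero wherever p is non-zero.

```python
import math

# Illustrative sketch only: the distributions and names are made up.

def entropy(p):
    """Shannon entropy in bits of a discrete distribution p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

fair = [0.5, 0.5]     # hypothetical fair coin
biased = [0.9, 0.1]   # hypothetical biased coin

print("entropy of fair coin:  ", entropy(fair))
print("entropy of biased coin:", entropy(biased))
print("D(biased || fair):", kl_divergence(biased, fair))
print("D(fair || biased):", kl_divergence(fair, biased))
```

The two divergences differ, which is a reminder that the Kullback-Leibler divergence is not symmetric and so is not a distance in the usual sense.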

Chapter 14, Non-Parametric Bayesian Methods, introduces the idea of using a Bayesian prior over functions when building probabilistic models. The idea is illustrated through Gaussian Processes and Gaussian Process Regression. The chapter then introduces Dirichlet Processes and how they can be used as priors for probability distributions.

Chapter 15, Random Matrices, explains what random matrices are and why they are ubiquitous in science and data science. The universal properties of large random matrices are illustrated along with the classical Gaussian random matrix ensembles. The chapter finishes with a discussion of where large random matrices occur in statistical and machine learning models.
