You're reading from 15 Math Concepts Every Data Scientist Should Know Understand and learn how to apply the math behind data science algorithms

Product type Paperback

Published in Aug 2024

Publisher Packt

ISBN-13 9781837634187

Length 510 pages

Edition 1st Edition

Languages

Python

Tools

NumPy

Concepts

Data Science

Author (1):

David Hoyle

The kernel trick

To learn how the kernel trick allows us to do feature construction implicitly and efficiently, we will first have to learn what a kernel is.

What is a kernel?

The simplest way to think about a kernel is as a mapping that takes two vectors as input and returns a scalar, that is, a mapping $\mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$. A kernel is therefore a function $f(\underline{x}, \underline{y})$ of two input vectors, $\underline{x}$ and $\underline{y}$, and the value of $f(\underline{x}, \underline{y})$ is a real number. The inner product $\underline{x}^\top \underline{y}$ fits this description, so it is an example of a kernel function.
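As a minimal sketch of this definition, the inner product can be written as a Python function that takes two vectors and returns a scalar. The function name `inner_product_kernel` and the example vectors are illustrative assumptions, not taken from the book.

```python
import numpy as np


def inner_product_kernel(x, y):
    """Hypothetical helper: map two d-dimensional vectors to a real number."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("x and y must be 1-D vectors of the same length")
    # The inner product sums the element-wise products of the two vectors.
    return float(np.dot(x, y))


# Illustrative inputs (assumed values, chosen only for the example).
x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])

value = inner_product_kernel(x, y)
print(value)
```

Whatever the dimension of the two input vectors, the output is a single real number, which is all the definition above requires.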

That is a high-level mathematical definition of a kernel, but what is the intuition behind it? A kernel function applied to the vectors $\underline{x}$ and $\underline{y}$ is typically used to measure the similarity between those vectors. Consequently, we usually want our kernel function to have its largest values when $\underline{x}$ and $\underline{y}$ are most similar and its lowest values when $\underline{x}$ and $\underline{y}$ are least similar. We want the function to decrease smoothly and monotonically in between those two scenarios.
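One way to make this intuition concrete is a Gaussian (squared-exponential) kernel, which is not introduced in the passage above and is used here only as an assumed illustration. It treats Euclidean distance as the notion of dissimilarity: it equals 1 when the two vectors coincide and decays smoothly toward 0 as they move apart. The name `gaussian_kernel` and the `length_scale` parameter are hypothetical choices for this sketch.

```python
import numpy as np


def gaussian_kernel(x, y, length_scale=1.0):
    """Hypothetical similarity kernel: exp(-||x - y||^2 / (2 * length_scale^2))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Squared Euclidean distance between the two vectors.
    sq_dist = np.sum((x - y) ** 2)
    return float(np.exp(-sq_dist / (2.0 * length_scale ** 2)))


# Keep x fixed and move y progressively further away from it.
x = np.array([0.0, 0.0])
distances = [0.0, 0.5, 1.0, 2.0, 4.0]
similarities = [gaussian_kernel(x, np.array([r, 0.0])) for r in distances]

for r, s in zip(distances, similarities):
    print(f"distance = {r:.1f}  kernel value = {s:.4f}")
```

The printed kernel values shrink steadily as the distance grows, which is the smooth, monotonic decrease described above.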