You're reading from Practical Discrete Mathematics Discover math principles that fuel algorithms for computer science and machine learning with Python

Product type Paperback

Published in Feb 2021

Publisher Packt

ISBN-13 9781838983147

Length 330 pages

Edition 1st Edition

Languages Python

Tools NumPy

Concepts Data Science

Authors (2):

Ryan T. White

Archana Tikayat Ray

An application to real-world data

In this section, we will apply PCA to the MNIST dataset, one of the best-known datasets in machine learning. It contains images of handwritten digits that are used to train image processing algorithms. We will use version 1 of the dataset, in which each image is stored as a vector of 784 features. For visualization purposes, we will reshape these features into a 28 x 28 matrix. Each element of this matrix is a pixel intensity between 0 (white) and 255 (black).
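The reshaping step described above can be sketched with NumPy. This is a minimal illustration that uses a synthetic 784-element vector as a stand-in for one MNIST row, so it runs without downloading anything; the names `flat_image`, `image`, and `restored` are hypothetical and do not come from the book.

```python
import numpy as np

# Stand-in for one MNIST row: 784 pixel intensities in the range 0-255.
# (Synthetic values, used only so the sketch runs without the real dataset.)
rng = np.random.default_rng(seed=0)
flat_image = rng.integers(0, 256, size=784, dtype=np.uint8)

# Reshape the flat feature vector into a 28 x 28 matrix (row-major order),
# so the first 28 features form the first row, the next 28 the second row, etc.
image = flat_image.reshape(28, 28)

# Flattening the matrix again recovers the original feature vector.
restored = image.reshape(-1)
```

With the real data, the same `reshape` call would presumably be applied to a single row of the feature matrix before passing the result to a plotting function.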

The first step is to import the data, as shown in the following code. The download can take some time because the dataset is large: it contains 70,000 images of handwritten digits (0-9), each with 784 features:

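# Importing the dataset
from sklearn.datasets import fetch_openml
# as_frame=False asks for NumPy arrays; recent scikit-learn versions
# may otherwise return pandas objects for this dataset
mnist_data = fetch_openml('mnist_784', version=1, as_frame=False)
# Choosing the independent (X) and dependent (y) variables
X, y = mnist_data["data"], mnist_data["target"]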
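After loading, a quick sanity check on the shapes and label types can catch surprises early. The sketch below is only an illustration: it uses a small synthetic stand-in for `X` and `y` so that it runs without the download, and `summarize`, `X_demo`, and `y_demo` are hypothetical names. It assumes both inputs are NumPy arrays and that the labels arrive as strings such as `'5'`, which is how OpenML stores this dataset's target.

```python
import numpy as np

def summarize(X, y):
    """Return basic facts about a feature matrix X and a label vector y."""
    # Convert string labels such as '5' to small integers.
    labels = np.array([int(label) for label in y], dtype=np.uint8)
    return {
        "n_samples": X.shape[0],
        "n_features": X.shape[1],
        "classes": sorted(set(labels.tolist())),
    }

# Synthetic stand-in: 10 blank "images" of 784 pixels, one label per digit.
X_demo = np.zeros((10, 784))
y_demo = np.array([str(d) for d in range(10)], dtype=object)

summary = summarize(X_demo, y_demo)
```

Calling `summarize(X, y)` on the real arrays should report 70,000 samples and 784 features if the download matches the description above.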

Now...