You're reading from Practical Discrete Mathematics Discover math principles that fuel algorithms for computer science and machine learning with Python

Product type Paperback

Published in Feb 2021

Publisher Packt

ISBN-13 9781838983147

Length 330 pages

Edition 1st Edition

Languages

Python

Tools

NumPy

Concepts

Data Science

Authors (2):

Ryan T. White

Archana Tikayat Ray


The scikit-learn implementation of PCA

In this section, we will apply PCA to the pizza.csv dataset (which we explored in the first section of this chapter) using the scikit-learn library's PCA class.

As discussed in the previous section, there are two ways of choosing how many principal components to use, and the choice depends on your goal: either reduce the dimensionality so that the data can be plotted in 2D or 3D space, or keep enough principal components to retain a chosen proportion of the variance.
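As a rough illustration of these two choices, the sketch below uses scikit-learn's PCA class on synthetic data. The data is an assumption made for illustration only (it is not pizza.csv), and the 0.95 threshold is an arbitrary example value. Passing an integer as n_components fixes the number of components, while passing a float between 0 and 1 keeps the smallest number of components whose cumulative explained variance exceeds that proportion.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data (an assumption, not pizza.csv):
# 300 samples with 7 features driven by 2 underlying factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 7))
X = latent @ mixing + 0.1 * rng.normal(size=(300, 7))

# Way 1: fix the number of components, e.g. 2 for a 2D plot.
pca_fixed = PCA(n_components=2)
X_2d = pca_fixed.fit_transform(X)

# Way 2: a float in (0, 1) asks for enough components to explain
# more than that proportion of the total variance.
pca_var = PCA(n_components=0.95)
X_var = pca_var.fit_transform(X)

print("Fixed count, output shape:", X_2d.shape)
print("Components kept for 95% variance:", pca_var.n_components_)
print("Variance actually retained:", pca_var.explained_variance_ratio_.sum())
```

After fitting, the n_components_ attribute reports how many components were actually kept, which is useful in the second case because the count is not known in advance.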

First, we will implement the method in which we select the number of principal components to keep. We will reduce the 7-dimensional pizza dataset to two principal components so that we can visualize, in a 2D plot, how the pizzas produced by 10 different companies differ in nutritional content, rather than trying to compare and visualize the data in higher dimensions...
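A reduction of this kind might look like the following minimal sketch. It does not load pizza.csv: the 10 brands, the 30 samples per brand, and the feature values are all hypothetical stand-ins generated at random. The standardization step is also an assumption, added here because PCA is sensitive to the scale of each feature.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_brands, per_brand, n_features = 10, 30, 7  # hypothetical sizes

# Hypothetical data: each brand has its own mean 7-feature profile,
# and individual samples scatter around that mean.
brand_means = rng.normal(scale=3.0, size=(n_brands, n_features))
brands = np.repeat(np.arange(n_brands), per_brand)
X = brand_means[brands] + rng.normal(size=(n_brands * per_brand, n_features))

# Standardize so no feature dominates purely because of its units.
X_std = StandardScaler().fit_transform(X)

# Project the 7-dimensional data onto the first two principal components.
pca = PCA(n_components=2)
coords = pca.fit_transform(X_std)

# Average 2D position of each brand, a compact summary of the projection.
centroids = np.array(
    [coords[brands == b].mean(axis=0) for b in range(n_brands)]
)

print("Projected shape:", coords.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The two columns of coords can then be passed to any 2D scatter plot, colored by the brands array, to compare the groups visually.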