You're reading from IPython Interactive Computing and Visualization Cookbook: Over 100 hands-on recipes to sharpen your skills in high-performance numerical computing and data science in the Jupyter Notebook

Product type: Paperback
Published: Jan 2018
Publisher: Packt
ISBN-13: 9781785888632
Length: 548 pages
Edition: 2nd Edition
Languages: Python
Tools: IPython
Concepts: Data Analysis
Author: Cyrille Rossant

Table of Contents (17 chapters)

Preface

1. A Tour of Interactive Computing with Jupyter and IPython

2. Best Practices in Interactive Computing

3. Mastering the Jupyter Notebook

4. Profiling and Optimization

5. High-Performance Computing

6. Data Visualization

7. Statistical Data Analysis

8. Machine Learning

9. Numerical Optimization

10. Signal Processing

11. Image and Audio Processing

12. Deterministic Dynamical Systems

13. Stochastic Dynamical Systems

14. Graphs, Geometry, and Geographic Information Systems

15. Symbolic and Numerical Mathematics

Index

Reducing the dimensionality of a dataset with a principal component analysis

In the previous recipes, we presented supervised learning methods; our data points came with discrete or continuous labels, and the algorithms were able to learn the mapping from the points to the labels.

Starting with this recipe, we will present unsupervised learning methods. These methods can be helpful before running a supervised learning algorithm, as they give a first insight into the data.

Let's assume that our data consists of points without any labels. The goal is to discover some form of hidden structure in this set of points. Frequently, data points have an intrinsically low dimensionality: a small number of features suffices to describe the data accurately. However, these features might be hidden among many other features that are not relevant to the problem. Dimensionality reduction can help us find this structure. This knowledge can considerably improve the performance of subsequent supervised learning algorithms...
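To make the idea of intrinsic low dimensionality concrete, here is a minimal sketch of a principal component analysis written with plain NumPy. It is an illustration under stated assumptions, not necessarily the approach taken in the recipe itself: the synthetic dataset, the variable names, and the choice of computing PCA through a singular value decomposition are all ours.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility

# Hypothetical data: 200 points that really live in 2 dimensions,
# embedded linearly in 10 dimensions, with a little noise on top.
n_samples, n_latent, n_features = 200, 2, 10
latent = rng.normal(size=(n_samples, n_latent))
mixing = rng.normal(size=(n_latent, n_features))
X = latent @ mixing + 0.01 * rng.normal(size=(n_samples, n_features))

# PCA via the singular value decomposition of the centered data.
X_centered = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Fraction of the total variance carried by each principal component.
explained_variance_ratio = s**2 / np.sum(s**2)

# Keep the first two components and project the data onto them.
n_components = 2
X_reduced = X_centered @ Vt[:n_components].T

print(explained_variance_ratio.round(4))
print(X_reduced.shape)
```

In this sketch, nearly all of the variance should concentrate in the first two components, which is how the hidden two-dimensional structure shows up. On real data the cutoff is rarely this sharp, and the number of components to keep is a judgment call.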