You're reading from IPython Interactive Computing and Visualization Cookbook: Harness IPython for powerful scientific computing and Python data visualization with this collection of more than 100 practical data science recipes

Product type Paperback

Published in Sep 2014

ISBN-13 9781783284818

Length 512 pages

Edition 1st Edition

Languages

Python

Tools

NumPy

Concepts

Data Visualization

Author (1):

Cyrille Rossant

Table of Contents (17) Chapters

Preface

1. A Tour of Interactive Computing with IPython

2. Best Practices in Interactive Computing

3. Mastering the Notebook

4. Profiling and Optimization

5. High-performance Computing

6. Advanced Visualization

7. Statistical Data Analysis

8. Machine Learning

9. Numerical Optimization

10. Signal Processing

11. Image and Audio Processing

12. Deterministic Dynamical Systems

13. Stochastic Dynamical Systems

14. Graphs, Geometry, and Geographic Information Systems

15. Symbolic and Numerical Mathematics

Index

Detecting hidden structures in a dataset with clustering

A large part of unsupervised learning is devoted to the clustering problem. The goal is to group similar points together in a totally unsupervised way. Clustering is a hard problem, as the very definition of clusters (or groups) is not necessarily well posed. In most datasets, stating that two points should belong to the same cluster may be context-dependent or even subjective.

There are many clustering algorithms. We will see a few of them in this recipe, applied to a toy example.
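To make concrete the point that the definition of a cluster is not well posed, here is a minimal sketch that asks K-means for 2, 3, and 4 clusters on the same toy data. The dataset parameters (`centers`, `cluster_std`, `random_state`) are illustrative assumptions, not values taken from the recipe. Every run returns a valid partition, and the algorithm alone cannot say which one is "right".

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy data; all parameters here are assumptions chosen for illustration.
X, _ = make_blobs(n_samples=200, centers=3, cluster_std=1.5, random_state=0)

results = {}
for k in (2, 3, 4):
    # n_init=10 restarts K-means from 10 random initializations and keeps the best.
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    results[k] = (km.labels_, km.inertia_)
    # Inertia is the within-cluster sum of squared distances to the centroids.
    print(f"k={k}: cluster sizes={np.bincount(km.labels_)}, "
          f"inertia={km.inertia_:.1f}")
```

The inertia generally keeps shrinking as `k` grows, so it cannot be used on its own to pick the number of clusters. That choice depends on the context of the analysis.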

How to do it...

Let's import the libraries:

In [1]: from itertools import permutations
        import numpy as np
        import sklearn
        import sklearn.decomposition as dec
        import sklearn.cluster as clu
        import sklearn.datasets as ds
        # sklearn.grid_search was removed in scikit-learn 0.20;
        # sklearn.model_selection provides its replacement.
        import sklearn.model_selection as gs
        import matplotlib.pyplot as plt
        %matplotlib inline

Let's generate a random dataset with three clusters:

In [2]: X, y = ds.make_blobs(n_samples=200, n_features...
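The call above is truncated in this excerpt, so the following is only a sketch of how such a toy dataset might be clustered, not the recipe's own code. The values passed to `make_blobs` (`n_features=2`, `centers=3`, `cluster_std=0.5`, `random_state=0`) are assumptions, and `relabel` is a hypothetical helper. It tries every permutation of the cluster ids, because a clustering algorithm numbers its clusters arbitrarily and the ids must be matched to the generating labels before the two can be compared.

```python
from itertools import permutations

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Assumed parameters: the original call is elided, so these are illustrative.
X, y = make_blobs(n_samples=200, n_features=2, centers=3,
                  cluster_std=0.5, random_state=0)


def relabel(labels, reference, n_clusters):
    """Hypothetical helper: rename cluster ids to best match `reference`."""
    best_perm, best_hits = None, -1
    for perm in permutations(range(n_clusters)):
        # perm[i] is the new id given to the cluster originally numbered i.
        mapped = np.array(perm)[labels]
        hits = np.sum(mapped == reference)
        if hits > best_hits:
            best_perm, best_hits = perm, hits
    return np.array(best_perm)[labels]


# Cluster without using y, then align the ids with y for comparison only.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
aligned = relabel(labels, y, 3)
accuracy = np.mean(aligned == y)
print(f"Agreement with the generating labels: {accuracy:.2%}")
```

The brute-force search over permutations is only practical for a handful of clusters, since the number of permutations grows factorially with the number of clusters.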
