Dancing with Python: Learn to code with Python and Quantum Computing

Product type: Paperback
Published: Aug 2021
Publisher: Packt
ISBN-13: 9781801077859
Length: 744 pages
Edition: 1st Edition
Author: Robert S. Sutor

15.5 Clustering

Suppose I have a CSV dataset containing 75 (x, y) geometric coordinates. I load these into the xy_df pandas DataFrame and look at its descriptive statistical summary:

import pandas as pd

xy_df = pd.read_csv("src/examples/clustering-xy.csv")
xy_df.describe()
            x       y
count 75.0000 75.0000
mean   7.5733  4.5401
std    4.0102  2.1265
min    1.9796  1.1947
25%    3.4182  2.8896
50%    7.0173  3.6819
75%   12.2170  6.9615
max   13.5643  8.2785

Here is the usual sample of the first five points:

xy_df.head()
        x      y
0 13.4832 3.2657
1  7.6388 7.0170
2  2.9279 2.9603
3  7.4514 6.4439
4  3.3011 2.4642

How are these points spread out geometrically? Are they uniformly distributed within their minimum and maximum ranges?
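
A rough first check on the uniformity question can be sketched using only the numbers in the describe() output above. If the points were spread evenly between the minimum and maximum, the quartiles would fall about a quarter, a half, and three quarters of the way along that range. The helper name uniform_quantile is made up for this illustration, and treating the sample minimum and maximum as the ends of the range is a simplifying assumption.

```python
# Summary values copied from the xy_df.describe() output above.
summary = {
    "x": {"min": 1.9796, "25%": 3.4182, "50%": 7.0173,
          "75%": 12.2170, "max": 13.5643},
    "y": {"min": 1.1947, "25%": 2.8896, "50%": 3.6819,
          "75%": 6.9615, "max": 8.2785},
}


def uniform_quantile(lo, hi, q):
    """Quantile q of a continuous uniform distribution on [lo, hi].

    Hypothetical helper for this sketch; not part of pandas.
    """
    return lo + q * (hi - lo)


# gap = observed quartile minus the quartile an even spread would give
gaps = {}
for col, s in summary.items():
    for label, q in (("25%", 0.25), ("50%", 0.50), ("75%", 0.75)):
        expected = uniform_quantile(s["min"], s["max"], q)
        gaps[(col, label)] = s[label] - expected
        print(f"{col} {label}: observed {s[label]:7.4f}, "
              f"uniform {expected:7.4f}, gap {gaps[(col, label)]:+.4f}")
```

With these values, the 25% and 75% quartiles for x land roughly 1.5 units outside where an even spread would put them, which hints that the x values bunch toward the ends of the range. With only 75 points, this is suggestive rather than conclusive.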

A scatter plot would help us see the distribution because we are in two dimensions, but let’s try to collect or cluster the points into k groups first. Here, k is...
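
The excerpt stops before it names a clustering method. As an illustration only, here is a minimal sketch of k-means (Lloyd's algorithm), one common way to collect points into k groups; it is an assumption that this matches what the chapter goes on to use. The CSV file is not reproduced here, so the sketch runs on synthetic stand-in data: the three centers, the spread, and the function name kmeans are all made up for the example.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm for 2D points: return (centroids, labels).

    Hypothetical helper for this sketch, written with the standard
    library only.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels


# Synthetic stand-in data (NOT the book's dataset): 75 points scattered
# around three made-up centers.
rng = random.Random(42)
centers = [(3.0, 2.5), (7.5, 7.0), (13.0, 3.5)]
points = [
    (rng.gauss(cx, 0.4), rng.gauss(cy, 0.4))
    for cx, cy in centers
    for _ in range(25)
]

centroids, labels = kmeans(points, k=3)
for j, (cx, cy) in enumerate(centroids):
    print(f"cluster {j}: center ({cx:.2f}, {cy:.2f}), "
          f"size {labels.count(j)}")
```

The result depends on the random starting centroids, and Lloyd's algorithm can settle on a poor grouping from an unlucky start, so it is worth trying several seeds. In practice a library implementation such as scikit-learn's KMeans is the more usual route.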
