Data Science Algorithms in a Week: Top 7 algorithms for scientific computing, data analysis, and machine learning

Product type: Paperback
Published: Oct 2018
Publisher: Packt
ISBN-13: 9781789806076
Length: 214 pages
Edition: 2nd Edition
Authors (2): David Toth, David Natingga

Gender classification – clustering to classify


The following data is taken from the gender classification example, Problem 6, Chapter 2, Naive Bayes:

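| Height in cm | Weight in kg | Hair length | Gender |
| --- | --- | --- | --- |
| 180 | 75 | Short | Male |
| 174 | 71 | Short | Male |
| 184 | 83 | Short | Male |
| 168 | 63 | Short | Male |
| 178 | 70 | Long | Male |
| 170 | 59 | Long | Female |
| 164 | 53 | Short | Female |
| 155 | 46 | Long | Female |
| 162 | 52 | Long | Female |
| 166 | 55 | Long | Female |
| 172 | 60 | Long | ? |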

To simplify matters, we will remove the column entitled Hair length. We will also remove the column entitled Gender, since we would like to cluster the people in the table based on their height and weight. We would like to establish whether the eleventh person in the table is more likely to be a man or a woman using clustering:

| Height in cm | Weight in kg |
| --- | --- |
| 180 | 75 |
| 174 | 71 |
| 184 | 83 |
| 168 | 63 |
| 178 | 70 |
| 170 | 59 |
| 164 | 53 |
| 155 | 46 |
| 162 | 52 |
| 166 | 55 |
| 172 | 60 |
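
The column removal described above can be sketched in a few lines of Python. This is only an illustration: the names `rows` and `features` are hypothetical, and the unknown gender (shown as ? in the table) is represented here as `None`.

```python
# Each row: (height in cm, weight in kg, hair length, gender).
# The last row is the person whose gender is unknown ("?" in the table).
rows = [
    (180, 75, "Short", "Male"),
    (174, 71, "Short", "Male"),
    (184, 83, "Short", "Male"),
    (168, 63, "Short", "Male"),
    (178, 70, "Long", "Male"),
    (170, 59, "Long", "Female"),
    (164, 53, "Short", "Female"),
    (155, 46, "Long", "Female"),
    (162, 52, "Long", "Female"),
    (166, 55, "Long", "Female"),
    (172, 60, "Long", None),
]

# Keep only height and weight; hair length and gender are dropped.
features = [(height, weight) for height, weight, _hair, _gender in rows]

for height, weight in features:
    print(height, weight)
```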

Analysis

We may apply scaling to the initial data, but to simplify matters, we will use the unscaled data in the algorithm. We will cluster the data we have into two clusters, since there are two possibilities for gender—male or female. Then, we will...
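
As a rough sketch of clustering the unscaled data into two clusters, the following plain-Python k-means may help. It is not the book's implementation. The initial centroids (the first point and the point furthest from it) are an assumption made for this illustration, and the helper names `sq_dist`, `nearest`, `mean`, and `k_means_two` are hypothetical.

```python
# Height (cm) and weight (kg); the last point is the person of unknown gender.
points = [
    (180, 75), (174, 71), (184, 83), (168, 63), (178, 70),
    (170, 59), (164, 53), (155, 46), (162, 52), (166, 55),
    (172, 60),
]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))

def k_means_two(points):
    # Assumed initialisation: the first point and the point furthest from it.
    first = points[0]
    centroids = [first, max(points, key=lambda p: sq_dist(p, first))]
    assignment = None
    while True:
        new_assignment = [nearest(p, centroids) for p in points]
        if new_assignment == assignment:
            # No point changed cluster, so the algorithm has converged.
            return assignment, centroids
        assignment = new_assignment
        # Recompute each centroid as the mean of its cluster.
        # Note: this sketch does not handle a cluster becoming empty.
        centroids = [
            mean([p for p, a in zip(points, assignment) if a == k])
            for k in range(2)
        ]

assignment, centroids = k_means_two(points)
for point, cluster in zip(points, assignment):
    print(point, "-> cluster", cluster)
```

With this particular initialisation, the sketch places the eleventh person (172, 60) in the same cluster as the first six rows of the table, while the other cluster holds (164, 53), (155, 46), (162, 52), and (166, 55). A different choice of initial centroids, or scaled data, could produce a different split.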
