Mastering Machine Learning with scikit-learn - Second Edition

Product type: Book
Published: Jul 2017
ISBN-13: 9781788299879
Pages: 254
Edition: 2nd
Author: Gavin Hackeling
Regression with KNN


Now let's use KNN for a regression task: predicting a person's weight from their height and sex. The following tables list our training and testing sets:

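Training set:

| Height | Sex    | Weight |
|--------|--------|--------|
| 158 cm | male   | 64 kg  |
| 170 cm | male   | 66 kg  |
| 183 cm | male   | 84 kg  |
| 191 cm | male   | 80 kg  |
| 155 cm | female | 49 kg  |
| 163 cm | female | 59 kg  |
| 180 cm | female | 67 kg  |
| 158 cm | female | 54 kg  |
| 178 cm | female | 77 kg  |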

Testing set:

| Height | Sex    | Weight |
|--------|--------|--------|
| 168 cm | male   | 65 kg  |
| 170 cm | male   | 61 kg  |
| 160 cm | female | 52 kg  |
| 169 cm | female | 67 kg  |

We will instantiate and fit KNeighborsRegressor, and use it to predict weights. In this dataset, sex has already been coded as a binary-valued feature. Notice that this feature ranges from 0 to 1, while the values of the feature representing the person's height range from 155 to 191. We will discuss why this is a problem, and how it can be ameliorated, in the next section. In the pizza price problem, we used the coefficient of determination to measure the performance of our model. We will use it to measure the performance of our regressor again, and introduce two more performance...
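As a minimal sketch of the steps just described, the following fits KNeighborsRegressor on the training instances listed above and scores its predictions with the coefficient of determination. Several details are assumptions made for illustration and are not confirmed by the text shown here: sex is encoded as male = 1 and female = 0, the number of neighbors is set to 3, and the variable names are our own.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.neighbors import KNeighborsRegressor

# Columns: height in cm, sex.
# ASSUMPTION: male = 1, female = 0. The text only states that sex is
# coded as a binary-valued feature, not which value maps to which sex.
X_train = np.array([
    [158, 1],
    [170, 1],
    [183, 1],
    [191, 1],
    [155, 0],
    [163, 0],
    [180, 0],
    [158, 0],
    [178, 0],
])
y_train = np.array([64, 66, 84, 80, 49, 59, 67, 54, 77])  # weight in kg

X_test = np.array([
    [168, 1],
    [170, 1],
    [160, 0],
    [169, 0],
])
y_test = np.array([65, 61, 52, 67])  # weight in kg

# ASSUMPTION: 3 neighbors is chosen here purely for illustration.
K = 3
regressor = KNeighborsRegressor(n_neighbors=K)
regressor.fit(X_train, y_train)

# With the default uniform weighting, each prediction is the mean weight
# of the K nearest training instances.
predictions = regressor.predict(X_test)
r2 = r2_score(y_test, predictions)

print('Predicted weights:', predictions)
print('Coefficient of determination: %.4f' % r2)
```

Under these assumptions, the three nearest neighbors of the first test instance (168 cm, male) are the training instances at 170 cm (male), 163 cm (female), and 158 cm (male), so its predicted weight is the mean of 66, 59, and 64 kg. Note that a female instance ranks ahead of a male one here: a 5 cm difference in height contributes far more to the distance than a difference of 1 in the sex feature.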
