In the previous section, we arbitrarily set the number of neighbors to three when initializing the k-NN classifier. But is this the optimal value? It could be, since we obtained a relatively high accuracy score on the test set.
Our goal is to create a machine learning model that neither overfits nor underfits the data. Overfitting means that the model has been fit so closely to the specific training examples provided that it will not generalize well to examples it has not encountered before. For instance, if the test cases happen to be very similar to the training data, an overfit model can still perform well on them and produce a deceptively high accuracy score.
Underfitting is the other extreme, in which the model is too simple to capture the underlying patterns in the training data, so it performs poorly on both the training set and the test set.
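One common way to look for a better value of k is to train the classifier over a range of k values and compare training and test accuracy: a large gap between the two suggests overfitting, while low scores on both suggest underfitting. The sketch below illustrates this with scikit-learn's built-in Iris dataset as a stand-in, since the dataset used in this section is not reproduced here:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset; substitute your own X and y here
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

test_scores = {}
for k in range(1, 16):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    # A large gap between these two scores hints at overfitting;
    # low scores on both hint at underfitting.
    train_acc = knn.score(X_train, y_train)
    test_acc = knn.score(X_test, y_test)
    test_scores[k] = test_acc

best_k = max(test_scores, key=test_scores.get)
print(f"best k = {best_k}, test accuracy = {test_scores[best_k]:.3f}")
```

Note that picking k by repeatedly checking the test set can itself overfit to that set; in practice a separate validation set or cross-validation is the safer way to tune k.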