You're reading from scikit-learn Cookbook, Second Edition: Over 80 recipes for machine learning in Python with scikit-learn

Product type Paperback

Published in Nov 2017

Publisher Packt

ISBN-13 9781787286382

Length 374 pages

Edition 2nd Edition

Languages

Python

Tools

Scikit-learn

Concepts

Machine Learning

Authors (2):

Trent Hauck

Julian Avila

Table of Contents (13) Chapters

Preface

1. High-Performance Machine Learning – NumPy

2. Pre-Model Workflow and Pre-Processing

3. Dimensionality Reduction

4. Linear Models with scikit-learn

5. Linear Models – Logistic Regression

6. Building Models with Distance Metrics

7. Cross-Validation and Post-Model Workflow

8. Support Vector Machines

9. Tree Algorithms and Ensembles

10. Text and Multiclass Classification with scikit-learn

11. Neural Networks

12. Create a Simple Estimator

Introduction

In this chapter, we'll learn how to make predictions with scikit-learn. Machine learning emphasizes measuring the ability to predict, and scikit-learn provides the tools to predict accurately and quickly.

We will examine the iris dataset, which consists of measurements of three types of Iris flowers: Iris Setosa, Iris Versicolor, and Iris Virginica.
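As a minimal sketch of what the dataset looks like, the snippet below loads the copy of iris that ships with scikit-learn through `load_iris`. The variable names `iris`, `X`, and `y` are illustrative choices, not taken from the chapter.

```python
from sklearn.datasets import load_iris

# Load the iris dataset bundled with scikit-learn
iris = load_iris()

# X holds the flower measurements, y holds the integer-encoded flower type
X, y = iris.data, iris.target

print(X.shape)                      # (150, 4): 150 flowers, 4 measurements each
print(iris.feature_names)           # names of the four measurements
print(iris.target_names.tolist())   # ['setosa', 'versicolor', 'virginica']
```

Each row of `X` describes one flower, and the matching entry of `y` is 0, 1, or 2, indexing into `target_names`.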

To measure the strength of the predictions, we will:

- Save some data for testing
- Build a model using only training data
- Measure the predictive power on the test set
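The three steps above can be sketched as follows. This is one possible arrangement, not necessarily the recipe used later in the chapter: the linear support vector classifier, the 25% test fraction, the stratified split, and `random_state=0` are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Step 1: save some data for testing (25% here, an arbitrary choice)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 2: build a model using only the training data
clf = SVC(kernel="linear")  # classifier choice is an assumption
clf.fit(X_train, y_train)

# Step 3: measure the predictive power on the held-out test set
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

Because the model never sees `X_test` during fitting, the accuracy it reports is an estimate of how well it would predict flowers it has not encountered.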

The prediction, one of three flower types, is categorical. This type of problem is called a classification problem.

Informally, classification asks, "Is it an apple or an orange?" Contrast this with regression, which asks, "How many apples?" In regression the answer is a continuous quantity, so it can be 4.5 apples.

As its design has evolved, scikit-learn has come to address machine learning mainly through four categories:

- Classification:
  - Non-text classification, like the Iris flowers example
  - Text classification
- Regression
- Clustering
- Dimensionality reduction
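To make the four categories concrete, here is a rough sketch that applies one estimator from each to the iris measurements. The specific estimators (`LogisticRegression`, `LinearRegression`, `KMeans`, `PCA`) and their settings are assumptions picked for illustration. Text classification is not shown, and the models are fit on all the data without a held-out test set purely to keep the example short.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, LogisticRegression

X, y = load_iris(return_X_y=True)

# Classification: predict a category (the flower type)
clf = LogisticRegression(max_iter=1000).fit(X, y)
class_pred = clf.predict(X[:3])

# Regression: predict a continuous number
# (here, the fourth measurement from the first three)
reg = LinearRegression().fit(X[:, :3], X[:, 3])
width_pred = reg.predict(X[:3, :3])

# Clustering: group the flowers without using the labels y
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cluster_labels = km.labels_

# Dimensionality reduction: compress four measurements into two
X_2d = PCA(n_components=2).fit_transform(X)
print(X_2d.shape)  # (150, 2)
```

All four follow the same pattern of constructing an estimator and calling `fit`. Only classification and regression use the target values; clustering and dimensionality reduction work from `X` alone.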

Authors (1)

Trent Hauck

Trent Hauck is a data scientist living and working in the Seattle area. He grew up in Wichita, Kansas and received his undergraduate and graduate degrees from the University of Kansas. He is the author of Instant Data Intensive Apps with pandas How-to (Packt Publishing), a book that can get you up to speed quickly with pandas and other associated technologies.