Python Data Science Essentials: A practitioner's guide covering essential data science principles, tools, and techniques

Product type Paperback

Published in Sep 2018

Publisher Packt

ISBN-13 9781789537864

Length 472 pages

Edition 3rd Edition

Languages

Python

Tools

Scikit-learn

Concepts

Data Science

Authors (4):

Alberto Boschetti

Luca Massaron

Pietro Marinelli

Matteo Malosetti

Table of Contents (11) Chapters

Preface

1. First Steps

2. Data Munging

3. The Data Pipeline

4. Machine Learning

5. Visualization, Insights, and Results

6. Social Network Analysis

7. Deep Learning Beyond the Basics

8. Spark for Big Data

9. Strengthen Your Python Foundations

10. Other Books You May Enjoy

Hyperparameter optimization

A machine learning hypothesis is determined not only by the learning algorithm but also by its hyperparameters (the parameters of the algorithm that must be fixed before training and cannot be learned during the training process) and by the selection of variables used to achieve the best learned parameters.
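To make the distinction concrete, the following minimal sketch contrasts hyperparameters with learned parameters on the digits dataset that this section works with. The values `C=1.0` and `kernel='rbf'` are illustrative assumptions, not settings taken from the book.

```python
from sklearn.datasets import load_digits
from sklearn import svm

digits = load_digits()
X, y = digits.data, digits.target

# Hyperparameters: fixed by us before training (illustrative values).
h = svm.SVC(C=1.0, kernel='rbf')
hyperparameters = h.get_params()

# Learned parameters: estimated from the data during fit().
h.fit(X, y)
support_vectors = h.support_vectors_

print("Hyperparameter C:", hyperparameters['C'])
print("Hyperparameter kernel:", hyperparameters['kernel'])
print("Learned support vectors (rows, features):", support_vectors.shape)
```

Calling `get_params()` lists everything that must be chosen up front, whereas attributes ending in an underscore, such as `support_vectors_`, only become available after fitting.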

In this section, we will explore how to extend the cross-validation approach to find the hyperparameters that generalize best to our test set. We will keep using the handwritten digits dataset offered by the Scikit-learn package. As a reminder, here is how to load the dataset:

In: from sklearn.datasets import load_digits
    digits = load_digits()
    X, y = digits.data, digits.target

In addition, we will keep using support vector machines as our learning algorithm:

In: from sklearn import svm
    h = svm.SVC()
  ...
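The excerpt is truncated at this point, so the authors' own continuation is not shown. As a hedged sketch of one common way to extend cross-validation to hyperparameter search, the example below uses Scikit-learn's `GridSearchCV` with the same dataset and learner. The grid values, the 3-fold setting, the 25% hold-out split, and the variable names `param_grid` and `search` are all assumptions made for illustration.

```python
from sklearn.datasets import load_digits
from sklearn import svm
from sklearn.model_selection import GridSearchCV, train_test_split

digits = load_digits()
X, y = digits.data, digits.target

# Hold out a test set that the search never sees (assumed 25% split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Candidate hyperparameter values (illustrative, not from the book).
param_grid = {'C': [1, 10], 'gamma': [0.0001, 0.001]}

# Each combination is scored by 3-fold cross-validation on the training set.
search = GridSearchCV(svm.SVC(kernel='rbf'), param_grid,
                      cv=3, scoring='accuracy')
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)

print("Best hyperparameters:", best_params)
print("Mean cross-validated accuracy: %0.3f" % search.best_score_)
print("Accuracy on the held-out test set: %0.3f" % test_accuracy)
```

Because the test set plays no part in the search, its accuracy gives a less optimistic estimate of how well the chosen hyperparameters generalize than the cross-validation score used to select them.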

Authors (4)

Alberto Boschetti

Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a Ph.D. in telecommunication engineering and currently lives and works in London. In his work projects, he faces challenges ranging from natural language processing (NLP) and behavioral analysis to machine learning and distributed processing. He is very passionate about his job and always tries to stay updated about the latest developments in data science technologies, attending meet-ups, conferences, and other events.

Luca Massaron

Luca Massaron is a data scientist with over a decade of experience in transforming data into high-impact, innovative artifacts, solving real-world problems, and generating value for businesses and stakeholders. He is the author of numerous bestselling books on AI, machine learning, and algorithms. Luca is also a 3x Kaggle Grandmaster who reached number 7 in the worldwide user rankings for his performance in data science competitions. Additionally, he is recognized as a Google Developer Expert (GDE) in AI, Kaggle, and the cloud.

Pietro Marinelli

Pietro Marinelli has consistently been ranked among the top data scientists in the world on the Google artificial intelligence platform, Kaggle. He has reached 3rd position among Italian data scientists and 214th among 91,000 data scientists around the world. Due to his work on Kaggle, he was invited to speak at Paris Kaggle Day in January 2019. He has been working with artificial intelligence, text analytics, and many other data science techniques for many years, and has more than 10 years of experience in designing data-based products for different industries. He has produced a variety of algorithms, ranging from predictive modeling to an advanced simulation algorithm to support senior management's business decisions for a variety of multinational companies. He is currently collaborating as a reviewer for Packt, reviewing AI books. NLP has been one of the core focuses of his projects. He has developed different algorithms for text understanding and classification in different languages (including English, Spanish, Italian, Japanese, German, French, Russian, and Chinese).

Matteo Malosetti

Matteo Malosetti is a mathematical engineer working as a data scientist in insurance. He is passionate about NLP applications and Bayesian statistics.
