You're reading from Python Data Science Essentials: A practitioner's guide covering essential data science principles, tools, and techniques

Product type Paperback

Published in Sep 2018

Publisher Packt

ISBN-13 9781789537864

Length 472 pages

Edition 3rd Edition

Languages

Python

Tools

Scikit-learn

Concepts

Data Science

Authors (2):

Luca Massaron

Alberto Boschetti

Table of Contents (11) Chapters

Preface

1. First Steps

2. Data Munging

3. The Data Pipeline

4. Machine Learning

5. Visualization, Insights, and Results

6. Social Network Analysis

7. Deep Learning Beyond the Basics

8. Spark for Big Data

9. Strengthen Your Python Foundations

10. Other Books You May Enjoy

Preparing tools and datasets

As introduced in the previous chapters, scikit-learn is the Python package that takes the lion's share of machine learning work. In this chapter, we will also use XGBoost, LightGBM, and CatBoost; you'll find the instructions for them in the relevant sections.

There are multiple reasons for using scikit-learn, which was developed at Inria, the French Institute for Research in Computer Science and Automation (inria.fr/en/). The most important ones for the success of your data science project are worth mentioning here:

A consistent API (fit, predict, transform, and partial_fit) across models, which naturally helps you implement data science procedures correctly on data organized in NumPy arrays
A complete selection of well-tested and scalable classical models for machine learning, offering many out-of-core implementations...
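
The consistent API mentioned in the list above can be illustrated with a minimal sketch. The synthetic data, the variable names, and the choice of StandardScaler and SGDClassifier are assumptions made for illustration only, not examples taken from the book. The sketch shows fit and transform on a preprocessing step, fit and predict on a model, and partial_fit for learning from data one batch at a time.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration): 200 samples, 4 features,
# organized in NumPy arrays as scikit-learn expects.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Transformers expose fit (learn parameters) and transform (apply them).
scaler = StandardScaler()
scaler.fit(X)
X_scaled = scaler.transform(X)

# Predictive models expose fit and predict with the same calling pattern.
clf = SGDClassifier(random_state=0)
clf.fit(X_scaled, y)
predictions = clf.predict(X_scaled)

# Models that support incremental learning expose partial_fit, which
# updates the model one batch at a time. The full set of class labels
# must be supplied on the first call.
incremental_clf = SGDClassifier(random_state=0)
classes = np.unique(y)
for start in range(0, len(X_scaled), 50):
    batch = slice(start, start + 50)
    incremental_clf.partial_fit(X_scaled[batch], y[batch], classes=classes)
incremental_predictions = incremental_clf.predict(X_scaled)
```

Because every step follows the same method names, swapping in a different scaler or classifier usually means changing only the line that constructs the object.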

Authors (2)

Alberto Boschetti

Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a Ph.D. in telecommunication engineering and currently lives and works in London. In his work projects, he faces challenges ranging from natural language processing (NLP) and behavioral analysis to machine learning and distributed processing. He is very passionate about his job and always tries to stay updated about the latest developments in data science technologies, attending meet-ups, conferences, and other events.

Luca Massaron

Having joined Kaggle over 10 years ago, Luca Massaron is a Kaggle Grandmaster in discussions and a Kaggle Master in competitions and notebooks. In Kaggle competitions he reached no. 7 in the worldwide rankings. On the professional side, Luca is a data scientist with more than a decade of experience in transforming data into smarter artifacts, solving real-world problems, and generating value for businesses and stakeholders. He is a Google Developer Expert (GDE) in machine learning and the author of best-selling books on AI, machine learning, and algorithms.
