You're reading from Python Feature Engineering Cookbook Over 70 recipes for creating, engineering, and transforming features to build machine learning models

Product type Paperback

Published in Jan 2020

Publisher Packt

ISBN-13 9781789806311

Length 372 pages

Edition 1st Edition

Languages

Python

Tools

NumPy

Concepts

Machine Learning

Author (1):

Soledad Galli

Table of Contents (13) Chapters

Preface

1. Foreseeing Variable Problems When Building ML Models

2. Imputing Missing Data

3. Encoding Categorical Variables

4. Transforming Numerical Variables

5. Performing Variable Discretization

6. Working with Outliers

7. Deriving Features from Dates and Time Variables

8. Performing Feature Scaling

9. Applying Mathematical Computations to Features

10. Creating Features with Transactional and Time Series Data

11. Extracting Features from Text Variables

12. Other Books You May Enjoy

Technical requirements

In this chapter, we will use the pandas, NumPy, and scikit-learn Python libraries. You can get all of these libraries from the Python Anaconda distribution, which you can install by following the steps described in the Technical requirements section of Chapter 1, Foreseeing Variable Problems When Building ML Models. For the recipes in this chapter, we will use the Boston House Prices dataset from scikit-learn. To abide by machine learning best practices, we will begin each recipe by separating the data into train and test sets.

To see how the scaling techniques described in this chapter affect variable distributions, refer to the accompanying Jupyter Notebooks in the dedicated GitHub repository (https://github.com/PacktPublishing/Python-Feature-Engineering-Cookbook).
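
As a rough illustration of the setup described above, the following sketch separates a dataset into train and test sets and then learns scaling parameters from the train set only. It is a minimal sketch under stated assumptions, not the book's own code: the Boston House Prices loader (load_boston) was removed in scikit-learn 1.2, so the bundled diabetes dataset stands in for it here, and StandardScaler, the 70/30 split, and random_state=0 are illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: load_diabetes is a stand-in for the Boston House Prices
# dataset, whose loader was removed in scikit-learn 1.2.
# Its features ship pre-normalized, but the workflow is the same.
X, y = load_diabetes(return_X_y=True, as_frame=True)

# Separate the data into train and test sets before any scaling.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Learn the scaling parameters (mean and standard deviation) from the
# train set only, then apply them to both sets.
scaler = StandardScaler()
scaler.fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the train set alone keeps information from the test set out of the learned parameters, which is the reason for splitting the data first.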

...

Authors (1)

Soledad Galli is a bestselling data science instructor, author, and open-source Python developer. As the leading instructor at Train in Data, she teaches intermediate and advanced courses in machine learning that have enrolled over 64,000 students worldwide and continue to receive positive reviews. Sole is also the developer and maintainer of the Python open-source library Feature-engine, which provides an extensive array of methods for feature engineering and selection. With extensive experience as a data scientist in finance and insurance sectors, Sole has developed and deployed machine learning models for assessing insurance claims, evaluating credit risk, and preventing fraud. She is a frequent speaker at podcasts, meetups, and webinars, sharing her expertise with the broader data science community.
