What do you get with eBook?

Instant access to your Digital eBook purchase

Download this book in EPUB and PDF formats

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

AI Assistant (beta) to help accelerate your learning

Imputing Missing Data

Missing data refers to the absence of values for certain observations and is an unavoidable problem in most data sources. Most scikit-learn estimators do not accept missing values as input, so we need to either remove observations with missing data or transform the missing entries into permitted values. Replacing missing data with statistical estimates of the missing values is called imputation. The goal of any imputation technique is to produce a complete dataset that can be used to train machine learning models. There are multiple imputation techniques we can apply to our data. Which one we choose depends on whether the data is missing at random, the number of missing values, and the machine learning model we intend to use. In this chapter, we will discuss several missing data imputation techniques.
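As a minimal sketch of the two options described above (removing incomplete observations versus imputing them), the following example uses pandas and scikit-learn's SimpleImputer. The toy DataFrame and its column names are hypothetical and chosen only for illustration; the chapter's own recipes may use different data and tools.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy data (hypothetical): two numeric features with missing entries.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "income": [50000.0, 62000.0, np.nan, 58000.0],
})

# Option 1: remove observations (rows) that contain any missing value.
complete_cases = df.dropna()

# Option 2: replace each missing value with a statistical estimate,
# here the median of its column as learned by the imputer.
imputer = SimpleImputer(strategy="median")
imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)

print(complete_cases.shape)
print(imputed)
```

In a modeling workflow, the imputer would typically be fitted on the training set only and then applied to the test set, so that the estimates do not leak information from held-out data.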

This chapter...

Key benefits

Discover solutions for feature generation, feature extraction, and feature selection

Uncover the end-to-end feature engineering process across continuous, discrete, and unstructured datasets

Implement modern feature extraction techniques using Python's pandas, scikit-learn, SciPy, and NumPy libraries

Description

Feature engineering is invaluable for developing and enriching your machine learning models. In this cookbook, you will work with the best tools to streamline your feature engineering pipelines and to simplify and improve the quality of your code. Using Python libraries such as pandas, scikit-learn, Featuretools, and Feature-engine, you’ll learn how to work with both continuous and discrete datasets and how to transform features from unstructured datasets. You will develop the skills necessary to select the best features as well as the most suitable extraction techniques. This book covers Python recipes that will help you automate feature engineering to simplify complex processes. You’ll also get to grips with different feature engineering strategies, such as the Box-Cox transform, power transform, and log transform, across machine learning, reinforcement learning, and natural language processing (NLP) domains. By the end of this book, you’ll have discovered tips and practical solutions to all of your feature engineering problems.

Who is this book for?

This book is for machine learning professionals, AI engineers, data scientists, and NLP and reinforcement learning engineers who want to optimize and enrich their machine learning models with the best features. Knowledge of machine learning and Python coding will assist you with understanding the concepts covered in this book.

What you will learn

Simplify your feature engineering pipelines with powerful Python packages

Get to grips with imputing missing values

Encode categorical variables with a wide set of techniques

Extract insights from text quickly and effortlessly

Develop features from transactional data and time series data

Derive new features by combining existing variables

Understand how to transform, discretize, and scale your variables

Create informative variables from date and time

Kevin Nov 29, 2022

As other reviews have stated the book delivers what it says it will; Python code that generates a lot of feature-engineering. I find this book to be fantastic, and Sole's work overall, as it gives life to new feature-engineering possibilities and does it fast. Long gone are the days of writing your own custom transformers or unique time-series features. This book automates a lot of that headache and will absolutely be the first reference I go to when I need to handle a new feature. I personally hadn't dealt with tsfresh prior to reading through and it brought to life instantaneous time-series features I no longer have to write scripts for. A very happy customer on that knowledge alone! Per usual, Sole continues to advance the ML community for the betterment of all.

Amazon Verified review

Muhammad Zohaib Khan Mar 31, 2021

I bought the Kindle version of the book. The introduction was good to read, but after that the text is displayed vertically and garbled, as shown in the picture. I cannot continue with this unreadable display, and the author needs to take care of these issues. The paper version could be OK, I guess. I haven't read the whole book, though, as I am returning this version now!

Omar Pasha Mar 26, 2021

It was exactly what I needed to know!

P. Sebastien Dec 08, 2020

Frankly, hot water would be a revolution next to this book.

Amazon Customer Nov 14, 2020

Thorough recollection of feature transformations to tackle multiple aspects of data quality and to extract features from different data formats, like text, time series and transactions. Great resource to have at hand when in front of a new dataset.

Python Feature Engineering Cookbook: Over 70 recipes for creating, engineering, and transforming features to build machine learning models

Python Feature Engineering Cookbook

Imputing Missing Data

Technical requirements

Removing observations with missing data

How to do it...

Performing mean or median imputation

Implementing mode or frequent category imputation

How to do it...

Replacing missing values with an arbitrary number

Capturing missing values in a bespoke category

How to do it...

Replacing missing values with a value at the end of the distribution

Implementing random sample imputation

How to do it...

Adding a missing value indicator variable

Getting ready

Performing multivariate imputation by chained equations

Assembling an imputation pipeline with scikit-learn

How to do it...

Assembling an imputation pipeline with Feature-engine

How to do it...
