What this book covers
Chapter 1, Imputing Missing Data, discusses various techniques to replace missing values with suitable estimates, for both numerical and categorical features.
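As a small illustration of the idea (not one of the chapter's own recipes), mean imputation can be sketched with scikit-learn's SimpleImputer:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy feature matrix with missing entries encoded as np.nan.
X = np.array([[1.0, np.nan],
              [3.0, 4.0],
              [np.nan, 6.0]])

# Mean imputation: each nan is replaced by its column's mean
# (2.0 for the first column, 5.0 for the second).
imputer = SimpleImputer(strategy="mean")
X_imputed = imputer.fit_transform(X)
```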
Chapter 2, Encoding Categorical Variables, introduces widely used techniques to transform categorical variables into numbers. It starts by describing commonly used methods such as one-hot and ordinal encoding, then moves on to domain-specific methods such as weight of evidence, and finally shows you how to encode variables with high cardinality.
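For instance, one-hot encoding — the first method the chapter describes — can be sketched with pandas (an illustrative example, not the chapter's own recipe):

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "blue", "red"]})

# One-hot encoding: one binary column per category.
dummies = pd.get_dummies(df["color"], prefix="color")
```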
Chapter 3, Transforming Numerical Variables, explains when we need to transform variables for use in machine learning models and then discusses common transformations and their suitability, based on variable characteristics.
Chapter 4, Performing Variable Discretization, introduces discretization and when it is useful, and then moves on to describe various discretization methods and their advantages and limitations. It covers the basic equal-width and equal-frequency discretization procedures, as well as discretization using decision trees and k-means.
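Equal-width discretization, the simplest of these procedures, can be sketched with pandas (illustrative only; the chapter's recipes may use other tools):

```python
import pandas as pd

ages = pd.Series([5, 17, 25, 40, 80])

# Equal-width discretization: split the value range (5 to 80)
# into 4 intervals of equal width and label each observation
# with its interval index.
bins = pd.cut(ages, bins=4, labels=False)
```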
Chapter 5, Working with Outliers, shows commonly used methods to handle outliers in your variables. You will learn how to detect outliers, how to cap variables at a given arbitrary value, and how to remove outliers entirely.
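As a sketch of detection followed by capping (one possible approach — here the inter-quartile range rule, which may differ from the chapter's recipes):

```python
import pandas as pd

s = pd.Series([1, 2, 3, 100])

# Detect outliers with the IQR rule: anything beyond
# Q3 + 1.5 * IQR counts as an outlier.
q1, q3 = s.quantile(0.25), s.quantile(0.75)
upper_fence = q3 + 1.5 * (q3 - q1)

# Cap (rather than remove): clip values at the upper fence.
capped = s.clip(upper=upper_fence)
```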
Chapter 6, Extracting Features from Date and Time, describes how to create features from date and time variables. It covers how to extract date and time components from datetime features, as well as how to combine datetime variables and how to work with different time zones.
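Extracting components from a datetime variable can be sketched with the pandas `.dt` accessor (an illustrative example, not necessarily the chapter's exact approach):

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-01-15 08:30:00",
        "2024-06-01 17:45:00",
    ])
})

# Extract date and time components as new feature columns.
df["month"] = df["timestamp"].dt.month
df["dayofweek"] = df["timestamp"].dt.dayofweek  # Monday == 0
df["hour"] = df["timestamp"].dt.hour
```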
Chapter 7, Performing Feature Scaling, covers methods to put the variables on a similar scale. It discusses standardization, how to scale to maximum and minimum values, and how to perform more robust forms of variable scaling.
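Standardization, the first method the chapter discusses, can be sketched with scikit-learn (illustrative only):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0], [2.0], [3.0]])

# Standardization: subtract the mean and divide by the
# standard deviation, so the column has mean 0 and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```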
Chapter 8, Creating New Features, describes multiple methods with which we can combine existing variables to create new features. It shows the use of mathematical operations and also decision trees to create variables from two or more existing features.
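Combining existing variables with a mathematical operation can be sketched as follows (a hypothetical ratio feature, chosen only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"income": [4000, 6000], "debt": [1000, 3000]})

# New feature from a mathematical combination of two
# existing variables: the debt-to-income ratio.
df["debt_to_income"] = df["debt"] / df["income"]
```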
Chapter 9, Extracting Features from Relational Data with Featuretools, introduces relational datasets and then moves on to explain how we can create features at different data aggregation levels, utilizing Featuretools. You will learn how to automatically create dozens of features from numerical and categorical variables, datetime, and text.
Chapter 10, Creating Features from Time Series with tsfresh, discusses how to automatically create several hundred features from time series data, for use in supervised classification or regression. You will learn how to automatically create and select relevant features from your time series with tsfresh.
Chapter 11, Extracting Features from Text Variables, covers simple methods to clean and extract value from short pieces of text. You will learn how to count words, sentences, and characters, and how to measure lexical diversity. You will discover how to clean your pieces of text and how to create feature matrices by counting words.