You're reading from Feature Engineering Made Easy: Identify unique features from your dataset in order to build powerful machine learning systems

Product type Paperback

Published in Jan 2018

Publisher Packt

ISBN-13 9781787287600

Length 316 pages

Edition 1st Edition

Languages

Python

Tools

Pandas

Concepts

Machine Learning

Authors (2):

Divya Susarla

Sinan Ozdemir

Table of Contents (10 chapters)

Preface

1. Introduction to Feature Engineering

2. Feature Understanding – What's in My Dataset?

3. Feature Improvement - Cleaning Datasets

4. Feature Construction

5. Feature Selection

6. Feature Transformations

7. Feature Learning

8. Case Studies

9. Other Books You May Enjoy

Feature Improvement - Cleaning Datasets

In the last two chapters, we moved from a basic understanding of feature engineering, and how it can enhance our machine learning pipelines, to getting our hands dirty with datasets and evaluating and understanding the different types of data that we can encounter in the wild.

In this chapter, we will take what we learned a step further and begin to change the datasets that we work with. Specifically, we will start to clean and augment our datasets. By cleaning, we generally mean altering the columns and rows already given to us. By augmenting, we generally mean removing columns from and adding columns to our datasets. As always, our goal in all of these processes is to enhance our machine learning pipelines.
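To make the distinction concrete, here is a minimal pandas sketch of cleaning versus augmenting. The toy dataset, its column names, and the specific operations (mean imputation, whitespace stripping, a derived flag) are assumptions chosen for illustration and are not taken from the chapter itself.

```python
import pandas as pd

# Hypothetical toy dataset; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25.0, None, 40.0],
    "city": [" nyc", "SF ", "nyc"],
    "row_id": [101, 102, 103],
})

# Cleaning: alter values in columns and rows that already exist.
cleaned = df.copy()
# Fill the missing age with the mean of the observed ages (one possible choice).
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].mean())
# Normalize inconsistent whitespace and casing in the city labels.
cleaned["city"] = cleaned["city"].str.strip().str.lower()

# Augmenting: remove a column and add a new one.
augmented = cleaned.drop(columns=["row_id"])
augmented["is_over_30"] = augmented["age"] > 30

print(augmented)
```

Cleaning leaves the shape of the dataset unchanged, while augmenting changes which columns it has. Whether either step actually helps should be judged by its effect on the machine learning pipeline.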

In the following chapters...

Authors (2)

Sinan Ozdemir

Sinan is an active lecturer focusing on large language models and a former lecturer of data science at the Johns Hopkins University. He is the author of multiple textbooks on data science and machine learning including "Quick Start Guide to LLMs". Sinan is currently the founder of LoopGenius which uses AI to help people and businesses boost their sales and was previously the founder of the acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. He holds a Master's Degree in Pure Mathematics from Johns Hopkins University and is based in San Francisco.

Divya Susarla

Divya Susarla is an experienced leader in data methods, implementing and applying tactics across a range of industries and fields including investment management, social enterprise consulting, and wine marketing. She trained in data by way of specializing in Economics and Political Science at University of California, Irvine, cultivating a passion for teaching by developing an analytically based, international affairs curriculum for students through the Global Connect program. Divya is currently focused on natural language processing and generation techniques at Kylie.ai, a startup helping clients automate their customer support conversations. When she is not busy working on building Kylie.ai and writing educational content, she spends her time traveling across the globe and experimenting with new recipes at her home in Berkeley, CA.
