You're reading from Feature Engineering Made Easy: Identify unique features from your dataset in order to build powerful machine learning systems

Product type Paperback

Published in Jan 2018

Publisher Packt

ISBN-13 9781787287600

Length 316 pages

Edition 1st Edition

Languages Python

Tools Pandas

Concepts Machine Learning

Authors (2):

Divya Susarla

Sinan Ozdemir

Table of Contents (10) Chapters

Preface

1. Introduction to Feature Engineering

2. Feature Understanding – What's in My Dataset?

3. Feature Improvement - Cleaning Datasets

4. Feature Construction

5. Feature Selection

6. Feature Transformations

7. Feature Learning

8. Case Studies

9. Other Books You May Enjoy

Dealing with missing values in a dataset

When working with data, one of the most common issues a data scientist runs into is missing data. Most commonly, this refers to empty cells (row/column intersections) where the data was not acquired for whatever reason. This causes several problems; most notably, most (though not all) learning algorithms cannot cope with missing values in their input.
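Before choosing how to handle missing data, it helps to see how much of it there is. The following is a minimal sketch, assuming the data lives in a pandas DataFrame; the toy columns (`age`, `income`, `city`) are hypothetical and do not come from the book's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names and values are made up for illustration.
df = pd.DataFrame({
    "age": [25, np.nan, 47, 31],
    "income": [50000, 62000, np.nan, np.nan],
    "city": ["Austin", "Boston", None, "Denver"],
})

# Count the missing cells in each column.
missing_per_column = df.isnull().sum()

# Count the rows that contain at least one missing cell.
rows_with_missing = int(df.isnull().any(axis=1).sum())

print(missing_per_column)
print("Rows with at least one missing value:", rows_with_missing)
```

Here `isnull()` marks both `NaN` and `None` as missing, so numeric and text columns can be checked the same way.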

For this reason, data scientists and machine learning engineers have developed many techniques for dealing with this problem. Although the methodologies vary widely, the two major ways to deal with missing data are:

Remove rows with missing values in them
Impute (fill in) missing values

Each method will clean our dataset to a point where a learning algorithm can handle it, but each method...
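The two approaches listed above can be sketched in pandas as follows. This is only an illustration under stated assumptions: the toy DataFrame is hypothetical, and filling with the column mean is just one possible imputation strategy, chosen here for brevity rather than taken from the text.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value in each column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 47.0, 31.0],
    "income": [50000.0, 62000.0, np.nan, 58000.0],
})

# Option 1: remove every row that has at least one missing value.
dropped = df.dropna()

# Option 2: impute (fill in) each missing value.
# Assumption: the column mean is used as the fill value.
imputed = df.fillna(df.mean())

print("Rows before:", len(df))
print("Rows after dropping:", len(dropped))
print("Rows after imputing:", len(imputed))
print(imputed)
```

Dropping rows leaves only fully observed rows but shrinks the dataset, while imputing keeps every row at the cost of introducing estimated values.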

Authors (2)

Sinan Ozdemir

Sinan is an active lecturer focusing on large language models and a former lecturer of data science at the Johns Hopkins University. He is the author of multiple textbooks on data science and machine learning including "Quick Start Guide to LLMs". Sinan is currently the founder of LoopGenius which uses AI to help people and businesses boost their sales and was previously the founder of the acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. He holds a Master's Degree in Pure Mathematics from Johns Hopkins University and is based in San Francisco.

Divya Susarla

Divya Susarla is an experienced leader in data methods, implementing and applying tactics across a range of industries and fields including investment management, social enterprise consulting, and wine marketing. She trained in data by way of specializing in Economics and Political Science at University of California, Irvine, cultivating a passion for teaching by developing an analytically based, international affairs curriculum for students through the Global Connect program. Divya is currently focused on natural language processing and generation techniques at Kylie.ai, a startup helping clients automate their customer support conversations. When she is not busy working on building Kylie.ai and writing educational content, she spends her time traveling across the globe and experimenting with new recipes at her home in Berkeley, CA.
