Explore Products

Best Sellers

New Releases

Books

Videos

Audiobooks

Free Learning

The Data Science Workshop

You're reading from The Data Science Workshop Learn how you can build machine learning models and create your own real-world data science projects

Product type Paperback

Published in Aug 2020

Publisher Packt

ISBN-13 9781800566927

Length 824 pages

Edition 2nd Edition

Languages

Python

Tools

Scikit-learn

Concepts

Data Science

Authors (5):

Robert Thas John

Thomas Joseph

Anthony So

Dr. Samuel Asare

Andrew Worsley

+1 more

View More author details

Table of Contents (16) Chapters

Preface

1. Introduction to Data Science in Python

2. Regression FREE CHAPTER

3. Binary Classification

4. Multiclass Classification with RandomForest

5. Performing Your First Cluster Analysis

6. How to Assess Performance

7. The Generalization of Machine Learning Models

8. Hyperparameter Tuning

9. Interpreting a Machine Learning Model

10. Analyzing a Dataset

11. Data Preparation

12. Feature Engineering

Introduction

13. Imbalanced Datasets

14. Dimensionality Reduction

15. Ensemble Learning

Challenges of Imbalanced Datasets

As seen from the classifier example, one of the biggest challenges with imbalanced datasets is the bias toward the majority class, which ended up being 88% in the previous example. This will result in suboptimal results. However, what makes such cases even more challenging is the deceptive nature of results if the right metric is not used.

Let's take, for example, a dataset where the negative class is around 99% and the positive class is 1% (as in a use case where a rare disease has to be detected, for instance).

Have a look at the following code snippet:

Data set Size: 10,000 examples
Negative class : 9910
Positive Class : 90

Suppose we had a poor classifier that was capable of only predicting the negative class; we would get the following confusion matrix:

Figure 13.10: Confusion matrix of the poor classifier

From the confusion matrix, let's calculate the accuracy measures. Have a look at the following...

The rest of the chapter is locked

Register for a free Packt account to unlock a world of extra content!

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $19.99/month. Cancel anytime

Authors (5)

Anthony So

Anthony So

Anthony So is a renowned leader in data science. He has extensive experience in solving complex business problems using advanced analytics and AI in different industries including financial services, media, and telecommunications. He is currently the chief data officer of one of the most innovative fintech start-ups. He is also the author of several best-selling books on data science, machine learning, and deep learning. He has won multiple prizes at several hackathon competitions, such as Unearthed, GovHack, and Pepper Money. Anthony holds two master's degrees, one in computer science and the other in data science and innovation.

See other products by Anthony So

Thomas Joseph

Thomas Joseph

Thomas V. Joseph is a data science practitioner, researcher, trainer, mentor, and writer with more than 19 years of experience. He has extensive experience in solving business problems using machine learning toolsets across multiple industry segments.

See other products by Thomas Joseph

Dr. Samuel Asare

Dr. Samuel Asare

Dr. Samuel Asare is a professional engineer with enthusiasm for Python programming, research, and writing. He is highly skilled in applying data science methods to the extraction of useful insights from large data sets. He possesses solid skills in project management processes. Samuel has previously held positions, in industry and academia, as a process engineer and a lecturer of materials science and engineering respectively. Presently, he is pursuing his passion for solving industry problems, using data science methods, and writing.

See other products by Dr. Samuel Asare

Andrew Worsley

Andrew Worsley

Andrew David Worsley is an independent consultant and educator with expertise in the areas of machine learning, statistics, cloud computing, and artificial intelligence. He has practiced data science in several countries across a multitude of industries including retail, financial services, marketing, resources, and healthcare.

See other products by Andrew Worsley

Robert Thas John

Robert Thas John

Robert Thas John is a data engineer with a career that spans two decades. He manages a team of data engineers, analysts, and machine learning engineers – roles that he has held in the past. He leads a number of efforts aimed at increasing the adoption of machine learning on embedded devices through various programs from Google Developers and ARM Ltd, which licenses the chips found in Arduinos and other microcontrollers. He started his career as a software engineer with work that has spanned various industries. His first experience with embedded systems was in programming payment terminals.

See other products by Robert Thas John

Other recommended products

Related to this chapter

The Data Science Workshop

The Data Science Workshop

Cut through the noise and get real results with a step-by-step approach to data science

Jan 2020 27h 16m

The Data Science Workshop

The Data Science Workshop

Cut through the noise and get real results with a step-by-step approach to data science

Jan 2020 27h 16m

The Data Science Workshop

The Data Science Workshop

Cut through the noise and get real results with a step-by-step approach to data science

Jan 2020 27h 16m

The Data Science Workshop

The Data Science Workshop

Cut through the noise and get real results with a step-by-step approach to data science

Jan 2020 27h 16m

Python Feature Engineering Cookbook

Python Feature Engineering Cookbook

Feature engineering is invaluable for developing and enriching your machine learning models. In this book, you will work with the best Python tools to streamline your feature engineering pipelines, feature engineering techniques and simplify and improve the quality of your code.

Jan 2020 12h 24m

Python Feature Engineering Cookbook

Python Feature Engineering Cookbook

Feature engineering is invaluable for developing and enriching your machine learning models. In this book, you will work with the best Python tools to streamline your feature engineering pipelines, feature engineering techniques and simplify and improve the quality of your code.

Jan 2020 12h 24m

Python Feature Engineering Cookbook

Python Feature Engineering Cookbook

Feature engineering is invaluable for developing and enriching your machine learning models. In this book, you will work with the best Python tools to streamline your feature engineering pipelines, feature engineering techniques and simplify and improve the quality of your code.

Jan 2020 12h 24m

Python Feature Engineering Cookbook

Python Feature Engineering Cookbook

Feature engineering is invaluable for developing and enriching your machine learning models. In this book, you will work with the best Python tools to streamline your feature engineering pipelines, feature engineering techniques and simplify and improve the quality of your code.

Jan 2020 12h 24m

Python Feature Engineering Cookbook

Python Feature Engineering Cookbook

Feature engineering is invaluable for developing and enriching your machine learning models. In this book, you will work with the best Python tools to streamline your feature engineering pipelines, feature engineering techniques and simplify and improve the quality of your code.

Jan 2020 12h 24m

Python Feature Engineering Cookbook

Python Feature Engineering Cookbook

Feature engineering is invaluable for developing and enriching your machine learning models. In this book, you will work with the best Python tools to streamline your feature engineering pipelines, feature engineering techniques and simplify and improve the quality of your code.

Jan 2020 12h 24m

Python Feature Engineering Cookbook

Python Feature Engineering Cookbook

Feature engineering is invaluable for developing and enriching your machine learning models. In this book, you will work with the best Python tools to streamline your feature engineering pipelines, feature engineering techniques and simplify and improve the quality of your code.

Jan 2020 12h 24m

Python Feature Engineering Cookbook

Python Feature Engineering Cookbook

Feature engineering is invaluable for developing and enriching your machine learning models. In this book, you will work with the best Python tools to streamline your feature engineering pipelines, feature engineering techniques and simplify and improve the quality of your code.

Jan 2020 12h 24m

Personalised recommendations for you

Based on your interests and search pattern

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m

Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch

This book provides a hands-on approach to solving over 30 prominent real-world computer vision problems using PyTorch 2.x on actual datasets. Here you'll learn to build a neural network from scratch and optimize hyperparameters, perform image classification, multi-object detection, segmentation, and more. You'll also explore facial expression manipulation and combining CV with NLP and RL techniques, build generative AI applications, and take your model to production on AWS. By the end of this book, you'll master modern NN architectures and confidently solve real-world CV problems.

Jun 2024 24h 52m