You're reading from Artificial Intelligence with Python: Your complete guide to building intelligent apps using Python 3.x

Product type Paperback

Published in Jan 2020

Publisher Packt

ISBN-13 9781839219535

Length 618 pages

Edition 2nd Edition

Languages

Python

Tools

TensorFlow

Concepts

Artificial Intelligence

Authors (2):

Alberto Artasanchez

Joshi

Dealing with class imbalance

A classifier is only as good as the data used to train it. Data quality is a common problem in real-world projects. Classifiers generally perform best when they see a comparable number of data points for each class, but when data is collected in the real world, it's not always possible to ensure that each class has the same number of data points. If one class has 10 times as many data points as another, the classifier tends to become biased towards the more numerous class. Hence, we need to account for this imbalance algorithmically. Let's see how to do that.
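Before working through the full program, the effect described above can be illustrated with a minimal sketch. It is not the book's own listing. It assumes a synthetic two-class dataset with a 10:1 imbalance (the helper names `make_imbalanced_data` and `minority_recall` are hypothetical) and uses scikit-learn's `class_weight='balanced'` option as one possible way to account for the imbalance algorithmically. It compares how often the minority class is correctly recovered with and without the weighting:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score


def make_imbalanced_data(n_major=1000, n_minor=100, seed=0):
    """Build a hypothetical 2D dataset with a 10:1 class imbalance."""
    rng = np.random.default_rng(seed)
    # Two overlapping Gaussian blobs: class 0 is the majority, class 1 the minority
    X_major = rng.normal(loc=0.0, scale=1.0, size=(n_major, 2))
    X_minor = rng.normal(loc=1.5, scale=1.0, size=(n_minor, 2))
    X = np.vstack([X_major, X_minor])
    y = np.concatenate([np.zeros(n_major, dtype=int),
                        np.ones(n_minor, dtype=int)])
    return X, y


def minority_recall(class_weight, X_train, X_test, y_train, y_test):
    """Train an Extremely Random Forest and return recall on class 1."""
    clf = ExtraTreesClassifier(n_estimators=100, max_depth=4,
                               random_state=0, class_weight=class_weight)
    clf.fit(X_train, y_train)
    return recall_score(y_test, clf.predict(X_test), pos_label=1)


X, y = make_imbalanced_data()

# Stratify so the 10:1 ratio is preserved in both the train and test splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=5)

# class_weight=None treats every sample equally;
# 'balanced' weights each class inversely to its frequency
recall_plain = minority_recall(None, X_train, X_test, y_train, y_test)
recall_balanced = minority_recall('balanced', X_train, X_test, y_train, y_test)

print("Minority recall without class weights:", round(recall_plain, 2))
print("Minority recall with class_weight='balanced':", round(recall_balanced, 2))
```

On data like this, the unweighted classifier typically misses many of the minority-class points, while the weighted one recovers more of them at the cost of some extra false alarms on the majority class. The exact numbers depend on the random seed and the assumed data distribution.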

Create a new Python file and import the following packages:

import sys

import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report