Sampling with EER
Expected error reduction (EER) focuses on measuring the potential decrease in generalization error, rather than the expected change in the model seen in the previous approach. The goal is to estimate the anticipated future error of a model trained on the current labeled set together with a candidate unlabeled sample, and to query the candidate that minimizes that error. EER can be defined as follows:

$$R(\hat{P}_{\mathcal{D}}) = \mathbb{E}_{x \sim P(x)}\Big[\, L\big(P(y \mid x),\; \hat{P}_{\mathcal{D}}(y \mid x)\big) \Big]$$

Here, $\mathcal{D} = \{(x_i, y_i)\}$ is the pool of paired labeled data, and $\hat{P}_{\mathcal{D}}(y \mid x)$ is the output distribution estimated by the learner trained on $\mathcal{D}$. $L$ is a chosen loss function that measures the error between the true distribution, $P(y \mid x)$, and the learner's prediction, $\hat{P}_{\mathcal{D}}(y \mid x)$.
The strategy selects for querying the instance whose addition is expected to produce the lowest future error (referred to as risk). This focuses active ML on reducing long-term generalization error rather than just immediate training performance.
In other words, EER selects the unlabeled data points that, once queried and learned from, are expected to most reduce the model's errors on new data points drawn from the same distribution.
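The selection loop described above can be sketched in code. The snippet below is a minimal illustration, not the definitive implementation: it assumes a scikit-learn-style classifier, uses a 0/1 loss (one minus the top predicted probability) as the error estimate over the remaining pool, and weights each hypothetical label by the current model's belief. The function name `expected_error_reduction` and the synthetic data setup are choices made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def expected_error_reduction(make_model, X_lab, y_lab, X_pool):
    """Score each unlabeled candidate by the expected future error (risk)
    the model would have after hypothetically adding that candidate."""
    model = make_model().fit(X_lab, y_lab)
    probs = model.predict_proba(X_pool)  # current belief P̂(y|x) per candidate
    classes = model.classes_
    risks = np.zeros(len(X_pool))
    for i, x in enumerate(X_pool):
        risk = 0.0
        for j, y in enumerate(classes):
            # Retrain as if the oracle returned label y for candidate x
            X_aug = np.vstack([X_lab, x[None, :]])
            y_aug = np.append(y_lab, y)
            m = make_model().fit(X_aug, y_aug)
            # Estimated 0/1 error over the rest of the pool: 1 - max_y P̂(y|x_u)
            rest = np.delete(X_pool, i, axis=0)
            err = np.mean(1.0 - m.predict_proba(rest).max(axis=1))
            risk += probs[i, j] * err  # weight by current belief in label y
        risks[i] = risk
    return risks

# Demo on synthetic data; ensure both classes appear in the labeled set
X, y = make_classification(n_samples=60, n_features=5, random_state=0)
lab_idx = np.concatenate([np.where(y == 0)[0][:5], np.where(y == 1)[0][:5]])
pool_idx = np.setdiff1d(np.arange(20, 40), lab_idx)
X_lab, y_lab = X[lab_idx], y[lab_idx]
X_pool = X[pool_idx]

risks = expected_error_reduction(
    lambda: LogisticRegression(max_iter=1000), X_lab, y_lab, X_pool
)
best = int(np.argmin(risks))  # candidate expected to yield the lowest future error
```

Note the cost: each candidate requires retraining the model once per possible label, which is why EER is usually paired with cheap learners or approximations in practice.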