Active Machine Learning with Python
Refine and elevate data quality over quantity with active learning

Product type: Paperback
Published: Mar 2024
Publisher: Packt
ISBN-13: 9781835464946
Length: 176 pages
Edition: 1st Edition
Author: Margaux Masson-Forsythe
Table of Contents (13 chapters)

Preface
Part 1: Fundamentals of Active Machine Learning
Chapter 1: Introducing Active Machine Learning
Chapter 2: Designing Query Strategy Frameworks
Chapter 3: Managing the Human in the Loop
Part 2: Active Machine Learning in Practice
Chapter 4: Applying Active Learning to Computer Vision
Chapter 5: Leveraging Active Learning for Big Data
Part 3: Applying Active Machine Learning to Real-World Projects
Chapter 6: Evaluating and Enhancing Efficiency
Chapter 7: Utilizing Tools and Packages for Active ML
Index
Other Books You May Enjoy

Sampling with EER

EER focuses on measuring the potential decrease in generalization error, rather than the expected change in the model that the previous approach measured. The goal is to estimate the model's anticipated future error by training it on the current labeled set (augmented with a candidate query) and measuring the error over the remaining unlabeled samples. EER can be defined as follows:

$$E_{\hat{P}_{\mathcal{L}}} = \int_{x} L\!\left(P(y \mid x),\, \hat{P}_{\mathcal{L}}(y \mid x)\right) P(x)\,dx$$

Here, $\mathcal{L}$ is the pool of labeled pairs, assumed to be drawn from the joint distribution $P(x)P(y \mid x)$, and $\hat{P}_{\mathcal{L}}(y \mid x)$ is the output distribution estimated by the model trained on $\mathcal{L}$. $L$ is a chosen loss function that measures the error between the true distribution, $P(y \mid x)$, and the learner's prediction, $\hat{P}_{\mathcal{L}}(y \mid x)$.
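
In practice, neither the input distribution $P(x)$ nor the true conditional $P(y \mid x)$ is available. A common simplification (standard in the EER literature, though not stated in this excerpt) is to approximate the integral empirically over the unlabeled pool $\mathcal{U}$ and to substitute the model's own posterior for the unknown truth, so that, for a log loss, the expected error reduces to the mean predictive entropy over the pool:

$$E_{\hat{P}_{\mathcal{L}}} \approx \frac{1}{|\mathcal{U}|} \sum_{x \in \mathcal{U}} L\!\left(P(y \mid x),\, \hat{P}_{\mathcal{L}}(y \mid x)\right) \approx -\frac{1}{|\mathcal{U}|} \sum_{x \in \mathcal{U}} \sum_{y \in \mathcal{Y}} \hat{P}_{\mathcal{L}}(y \mid x) \log \hat{P}_{\mathcal{L}}(y \mid x)$$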

EER selects for querying the instance that is expected to yield the lowest future error (referred to as the risk). This focuses active ML on reducing long-term generalization error rather than just immediate training performance.
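
Concretely, following the usual EER formulation (the notation below is introduced here for illustration), each candidate $x^{*}$ in the unlabeled pool $\mathcal{U}$ is scored by the error expected after retraining on $\mathcal{L}^{+} = \mathcal{L} \cup \{(x^{*}, y)\}$, averaged over the current model's belief about its label, and the candidate with the lowest expected risk is queried:

$$x^{*}_{EER} = \underset{x^{*} \in \mathcal{U}}{\arg\min} \; \sum_{y \in \mathcal{Y}} \hat{P}_{\mathcal{L}}(y \mid x^{*})\, E_{\hat{P}_{\mathcal{L}^{+}}}$$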

In other words, EER selects unlabeled data points that, when queried and learned from, are expected to significantly reduce the model’s errors on new data points from the same distribution...
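
As a rough illustration of how this query step might be implemented with scikit-learn, the sketch below retrains the model once per candidate and possible label, estimates the post-retraining error over the remaining pool with an entropy-based proxy, and returns the candidate with the lowest expected risk. The function name, the LogisticRegression base model, and the entropy proxy are assumptions made for this sketch, not choices taken from the book.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

def eer_query(model, X_labeled, y_labeled, X_unlabeled):
    """Return the index of the unlabeled point with the lowest expected risk."""
    classes = model.classes_                          # label order used by predict_proba
    current_proba = model.predict_proba(X_unlabeled)  # current estimate of P_hat(y|x)
    expected_risk = np.zeros(len(X_unlabeled))

    for i in range(len(X_unlabeled)):
        x_i = X_unlabeled[i:i + 1]
        pool = np.delete(X_unlabeled, i, axis=0)      # remaining pool used to estimate future error
        for k, label in enumerate(classes):
            # Retrain as if x_i had been queried and labeled `label`
            retrained = clone(model).fit(
                np.vstack([X_labeled, x_i]),
                np.append(y_labeled, label),
            )
            # Risk proxy (an assumption for this sketch): mean predictive entropy
            # of the retrained model over the remaining pool, i.e. a log-loss-style
            # estimate of its future error
            p = retrained.predict_proba(pool)
            risk = -np.mean(np.sum(p * np.log(p + 1e-12), axis=1))
            # Weight the risk by the current model's belief that x_i has this label
            expected_risk[i] += current_proba[i, k] * risk

    return int(np.argmin(expected_risk))

# Illustrative usage: fit a seed model, then ask which point to label next
# model = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
# next_idx = eer_query(model, X_labeled, y_labeled, X_unlabeled)
```

Because every candidate requires one retraining per possible label, this strategy is expensive; in practice it is often restricted to a random subsample of the pool or paired with cheap-to-update models.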
