Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Arrow up icon
GO TO TOP
Artificial Intelligence with Python Cookbook

You're reading from   Artificial Intelligence with Python Cookbook Proven recipes for applying AI algorithms and deep learning techniques using TensorFlow 2.x and PyTorch 1.6

Arrow left icon
Product type Paperback
Published in Oct 2020
Publisher Packt
ISBN-13 9781789133967
Length 468 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Authors (2):
Arrow left icon
Ritesh Kumar Ritesh Kumar
Author Profile Icon Ritesh Kumar
Ritesh Kumar
Ben Auffarth Ben Auffarth
Author Profile Icon Ben Auffarth
Ben Auffarth
Arrow right icon
View More author details
Toc

Table of Contents (13) Chapters Close

Preface 1. Getting Started with Artificial Intelligence in Python 2. Advanced Topics in Supervised Machine Learning FREE CHAPTER 3. Patterns, Outliers, and Recommendations 4. Probabilistic Modeling 5. Heuristic Search Techniques and Logical Inference 6. Deep Reinforcement Learning 7. Advanced Image Applications 8. Working with Moving Images 9. Deep Learning in Audio and Speech 10. Natural Language Processing 11. Artificial Intelligence in Production 12. Other Books You May Enjoy

Live decisioning customer values

Let's assume we have the following scenario: we have a list of customers to call in order to sell them our product. Each phone call costs money in call center personal salaries, so we want to reduce these costs as much as possible. We have certain information about each customer that could help us determine whether they are likely to buy. After every call, we can update our model. The main goal is to call only the most promising customers and to improve our insights into which customers are more likely to pay for our product.

In this recipe, we will approach this with active learning, a strategy where we actively decide what to explore (and learn) next. Our model will help decide whom to call. Because we will update our model after each query (phone call), we will use online learning models.

Getting ready

We'll prepare for our recipe by downloading our dataset and installing a few libraries.

Again, we will get the data from OpenML:

!pip install -q openml

import openml
dataset = openml.datasets.get_dataset(1461)
X, y, categorical_indicator, _ = dataset.get_data(
dataset_format='DataFrame',
target=dataset.default_target_attribute
)
categorical_features = X.columns[categorical_indicator]
numeric_features = X.columns[
[not(i) for i in categorical_indicator]
]

This dataset is called bank-marketing, and you can see a description on OpenML at https://www.openml.org/d/1461.

For each row, describing a single person, we have different features, numerical and categorical, that tell us about demographics and customer history.

To model the likelihood of customers signing up for our product, we will use the scikit-multiflow package that specializes in online models. We will also use the category_encoders package again:

!pip install scikit-multiflow category_encoders

With these two libraries in place, we can start the recipe.

How to do it...

We need to implement an exploration strategy and a model that is being continuously updated. We are using the online version of the random forest, the Hoeffding Tree, as our model. We are estimating the uncertainties at every step, and based on that we will return a candidate to call next.

As always, we will need to define a few preprocessing steps:

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import FunctionTransformer
import category_encoders as ce

ordinal_encoder = ce.OrdinalEncoder(
cols=None, # all features that it encounters
handle_missing='return_nan',
handle_unknown='ignore'
).fit(X)

preprocessor = ColumnTransformer(
transformers=[
('cat', ordinal_encoder, categorical_features),
('num', FunctionTransformer(validate=False), numeric_features)
])
preprocessor = preprocessor.fit(X)

Then we come to our active learning approach itself. This is inspired by modAL.models.ActiveLearner:

import numpy as np
from skmultiflow.trees.hoeffding_tree import HoeffdingTreeClassifier
from sklearn.metrics import roc_auc_score
import random

class ActivePipeline:
def __init__(self, model, preprocessor, class_weights):
self.model = model
self.preprocessor = preprocessor
self.class_weights = class_weights

@staticmethod
def values(X):
if isinstance(X, (np.ndarray, np.int64)):
return X
else:
return X.values

def preprocess(self, X):
X_ = pd.DataFrame(
data=self.values(X),
columns=[
'V1', 'V2', 'V3', 'V4',
'V5', 'V6', 'V7', 'V8',
'V9', 'V10', 'V11', 'V12',
'V13', 'V14', 'V15', 'V16'
])
return self.preprocessor.transform(X_)

def fit(self, X, ys):
weights = [self.class_weights[y] for y in ys]
self.model.fit(self.preprocess(X), self.values(ys))

def update(self, X, ys):
if isinstance(ys, (int, float)):
weight = self.class_weights[y]
else:
weight = [self.class_weights[y] for y in ys]

self.model.partial_fit(
self.preprocess(X),
self.values(ys),
weight
)

def predict(self, X):
return self.model.predict(
self.preprocess(X)
)

def predict_proba(self, X):
return self.model.predict_proba(
self.preprocess(X)
)

@staticmethod
def entropy(preds):
return -np.sum(
np.log((preds + 1e-15) * preds)
/ np.log(np.prod(preds.size))
)

def max_margin_uncertainty(self, X, method: str='entropy',
exploitation: float=0.9, favor_class: int=1, k: int=1
):
'''similar to modAL.uncertainty.margin_uncertainty
'''
probs = self.predict_proba(X)
if method=='margin':
uncertainties = np.abs(probs[:,2] - probs[:, 1]) / 2.0
elif method=='entropy':
uncertainties = np.apply_along_axis(self.entropy, 1, probs[:, (1,2)])
else: raise(ValueError('method not implemented!'))

if favor_class is None:
weights = uncertainties
else: weights = (1.0 - exploitation) * uncertainties + exploitation * probs[:, favor_class]

if self.sampling:
ind = random.choices(
range(len(uncertainties)), weights, k=k
)
else:
ind = np.argsort(weights, axis=0)[::-1][:k]
return ind, np.mean(uncertainties[ind])

def score(self, X, y, scale=True):
probs = self.predict_proba(X, probability=2)
if scale:
probs = np.clip(probs - np.mean(probs) + 0.5, 0, 1)
return roc_auc_score(y, probs)

Again, we create a scikit-learn-compatible class. It basically holds a machine learning model and a data preprocessor. We implement fit() and predict(), but also score() to get a model performance. We also implement an update() method that calls partial_fit() of the machine learning model. Calling partial_fit() instead of fit() considerably speeds up the computations, because we don't have to start from scratch every time we get new data.

Here's how to create the active learning pipeline:

active_pipeline = ActivePipeline(
HoeffdingTreeClassifier(),
preprocessor,
class_weights.to_dict()
)
active_pipeline.model.classes = [0, 1, 2]

We can run different simulations on our dataset with this setup. For example, we can compare a lot of experimentation (0.5 exploitation) against only exploitation (1.0), or no learning at all after the first batch. We basically go through a loop:

  • Via active_pipeline.max_margin_uncertainty(), we present data to the active pipeline, and get a number of data points that integrate uncertainty and target predictions according to our preferences of the integration method.
  • Once we get the actual results for these data points, we can update our model: active_pipeline.update().

You can see an example of this in the notebook on GitHub.

We can see that curious wins out after the first few examples. Exploitation is actually the least successful scheme. By not updating the model, performance deteriorates over time:

This is an ideal scenario for active learning or reinforcement learning, because, not unlike in reinforcement learning, uncertainty can be an additional criterion, apart from positive expectation, from a customer. Over time, this entropy reduction-seeking behavior reduces as the model's understanding of customers improves.

How it works...

It's worth delving a bit more into a few of the concepts and strategies employed in this recipe.

Active learning

Active learning means that we can actively query for more information; in other words, exploration is part of our strategy. This can be useful in scenarios where we have to actively decide what to learn, and where what we learn influences not only how much our model learns and how well, but also how much return on an investment we can get.

Hoeffding Tree

The Hoeffding Tree (also known as the Very Fast Decision Tree, VFDT for short) was introduced in 2001 by Geoff Hulten and others (Mining time-changing data streams). It is an incrementally growing decision tree for streamed data. Tree nodes are expanded based on the Hoeffding bound (or additive Chernoff bound). It was theoretically shown that, given sufficient training data, a model learned by the Hoeffding tree converges very closely to the one built by a non-incremental learner.

The Hoeffding bound is defined as follows:

It's important to note that the Hoeffding Tree doesn't deal with data distributions that change over time.

Class weighting

Since we are dealing with an imbalanced dataset, let's use class weights. This basically means that we are upsampling the minority (signing up) class and downsampling the majority class (not signing up).

The formula for the class weights is as follows:

Similarly, in Python, we can write the following:

class_weights = len(X) / (y.astype(int).value_counts() * 2)

We can then use these class weights for sampling.

We'll close the recipe with a few more pointers.

See also

Only a few models in scikit-learn allow incremental or online learning. Refer to the list at https://scikit-learn.org/stable/modules/computing.html.

A few linear models include the partial_fit() method. The scikit-multiflow library specializes in incremental and online/streaming models: https://scikit-multiflow.github.io/

You can find more resources and ideas regarding active learning from a recent review that concentrates on biomedical image processing (Samuel Budd and others, A Survey on Active Learning and Human-in-the-Loop Deep Learning for Medical Image Analysis, 2019; https://arxiv.org/abs/1910.02923).

Our approach is inspired by the modalAI Python active learning package, which you can find at https://modal-python.readthedocs.io/. We recommend you check it out if you are interested in active learning approaches. A few more Python packages are available, as follows:

One of the main decisions in active learning is the trade-off between exploration and exploitation. You can find out more about this in a paper called Exploration versus exploitation in active learning: a Bayesian approach: http://www.vincentlemaire-labs.fr/publis/ijcnn_2_2010_camera_ready.pdf

You have been reading a chapter from
Artificial Intelligence with Python Cookbook
Published in: Oct 2020
Publisher: Packt
ISBN-13: 9781789133967
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime