Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Artificial Intelligence with Python Cookbook
Artificial Intelligence with Python Cookbook

Artificial Intelligence with Python Cookbook: Proven recipes for applying AI algorithms and deep learning techniques using TensorFlow 2.x and PyTorch 1.6

eBook
$26.98 $29.99
Paperback
$43.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
OR
Modal Close icon
Payment Processing...
tick Completed

Billing Address

Table of content icon View table of contents Preview book icon Preview Book

Artificial Intelligence with Python Cookbook

Advanced Topics in Supervised Machine Learning

Following the tasters with scikit-learn, Keras, and PyTorch in the previous chapter, in this chapter, we will move on to more end-to-end examples. These examples are more advanced in the sense that they include more complex transformations and model types.

We'll be predicting partner choices with sklearn, where we'll implement a lot of custom transformer steps and more complicated machine learning pipelines. We'll then predict house prices in PyTorch and visualize feature and neuron importance. After that, we will perform active learning to decide customer values together with online learning in sklearn. In the well-known case of repeat offender prediction, we'll build a model without racial bias. Last, but not least, we'll forecast time series of CO2 levels.

Online learning in this context (as opposed to internet-based learning) refers to a model update strategy that incorporates training data that comes in sequentially. This can be useful in cases where the dataset is very big (often the case with images, videos, and texts) or where it's important to keep the model up to date given the changing nature of the data.

In many of these recipes, we've shortened the description to the most salient details in order to highlight particular concepts. For the full details, please refer to the notebooks on GitHub.

In this chapter, we'll be covering the following recipes:

  • Transforming data in scikit-learn
  • Predicting house prices in PyTorch
  • Live decisioning customer values
  • Battling algorithmic bias
  • Forecasting CO2 time series

Technical requirements

Transforming data in scikit-learn

In this recipe, we will be building more complex pipelines using mixed-type columnar data. We'll use a speed dating dataset that was published in 2006 by Fisman et al.: https://doi.org/10.1162/qjec.2006.121.2.673

Perhaps this recipe will be informative in more ways than one, and we'll learn something useful about the mechanics of human mating choices.

The dataset description on the OpenML website reads as follows:

This data was gathered from participants in experimental speed dating events from 2002-2004. During the events, the attendees would have a four-minute first date with every other participant of the opposite sex. At the end of their 4 minutes, participants were asked whether they would like to see their date again. They were also asked to rate their date on six attributes: attractiveness, sincerity, intelligence, fun, ambition, and shared interests. The dataset also includes questionnaire data gathered from participants at different points in the process. These fields include demographics, dating habits, self-perception across key attributes, beliefs in terms of what others find valuable in a mate, and lifestyle information.

The problem is to predict mate choices from what we know about participants and their matches. This dataset presents some challenges that can serve an illustrative purpose:

  • It contains 123 different features, of different types:
    • Categorical
    • Numerical
    • Range features

It also contains the following:

  • Some missing values
  • Target imbalance

On the way to solving this problem of predicting mate choices, we will build custom encoders in scikit-learn and a pipeline comprising all features and their preprocessing steps.

The primary focus in this recipe will be on pipelines and transformers. In particular, we will build a custom transformer for working with range features and another one for numerical features.

Getting ready

We'll need the following libraries for this recipe. They are as follows:

  • OpenML to download the dataset
  • openml_speed_dating_pipeline_steps to use our custom transformer
  • imbalanced-learn to work with imbalanced classes
  • shap to show us the importance of features

In order to install them, we can use pip again:

pip install -q openml openml_speed_dating_pipeline_steps==0.5.5 imbalanced_learn category_encoders shap
OpenML is an organization that intends to make data science and machine learning reproducible and therefore more conducive to research. The OpenML website not only hosts datasets, but also allows the uploading of machine learning results to public leaderboards under the condition that the implementation relies solely on open source. These results and how they were obtained can be inspected in complete detail by anyone who's interested.

In order to retrieve the data, we will use the OpenML Python API. The get_dataset() method will download the dataset; with get_data(), we can get pandas DataFrames for features and target, and we'll conveniently get the information on categorical and numerical feature types:

import openml
dataset = openml.datasets.get_dataset(40536)
X, y, categorical_indicator, _ = dataset.get_data(
dataset_format='DataFrame',
target=dataset.default_target_attribute
)
categorical_features = list(X.columns[categorical_indicator]) numeric_features = list(
X.columns[[not(i) for i in categorical_indicator]]
)
In the original version of the dataset, as presented in the paper, there was a lot more work to do. However, the version of the dataset on OpenML already has missing values represented as numpy.nan, which lets us skip this conversion. You can see this preprocessor on GitHub if you are interested: https://github.com/benman1/OpenML-Speed-Dating

Alternatively, you can use a download link from the OpenML dataset web page at https://www.openml.org/data/get_csv/13153954/speeddating.arff.

With the dataset loaded, and the libraries installed, we are ready to start cracking.

How to do it...

Pipelines are a way of describing how machine learning algorithms, including preprocessing steps, can follow one another in a sequence of transformations on top of the raw dataset before applying a final predictor. We will see examples of these concepts in this recipe and throughout this book.

A few things stand out pretty quickly looking at this dataset. We have a lot of categorical features. So, for modeling, we will need to encode them numerically, as in the Modeling and predicting in Keras recipe in Chapter 1, Getting Started with Artificial Intelligence in Python.

Encoding ranges numerically

Some of these are actually encoded ranges. This means these are ordinal, in other words, they are categories that are ordered; for example, the d_interests_correlate feature contains strings like these:

[[0-0.33], [0.33-1], [-1-0]]

If we were to treat these ranges as categorical variables, we'd lose the information about the order, and we would lose information about how different two values are. However, if we convert them to numbers, we will keep this information and we would be able to apply other numerical transformations on top.

We are going to implement a transformer to plug into an sklearn pipeline in order to convert these range features to numerical features. The basic idea of the conversion is to extract the upper and lower bounds of these ranges as follows:

def encode_ranges(range_str):
splits = range_str[1:-1].split('-')
range_max = splits[-1]
range_min = '-'.join(splits[:-1])
return range_min, range_max

examples = X['d_interests_correlate'].unique()
[encode_ranges(r) for r in examples]

We'll see this for our example:

[('0', '0.33'), ('0.33', '1'), ('-1', '0')]

In order to get numerical features, we can then take the mean between the two bounds. As we've mentioned before, on OpenML, not only are results shown, but also the models are transparent. Therefore, if we want to submit our model, we can only use published modules. We created a module and published it in the pypi Python package repository, where you can find the package with the complete code: https://pypi.org/project/openml-speed-dating-pipeline-steps/.

Here is the simplified code for RangeTransformer:

from sklearn.base import BaseEstimator, TransformerMixin
import category_encoders.utils as util

class RangeTransformer(BaseEstimator, TransformerMixin):
def __init__(self, range_features=None, suffix='_range/mean', n_jobs=-1):
assert isinstance(range_features, list) or range_features is None
self.range_features = range_features
self.suffix = suffix
self.n_jobs = n_jobs

def fit(self, X, y=None):
return self

def transform(self, X, y=None):
X = util.convert_input(X)
if self.range_features is None:
self.range_features = list(X.columns)

range_data = pd.DataFrame(index=X.index)
for col in self.range_features:
range_data[str(col) + self.suffix] = pd.to_numeric(
self._vectorize(X[col])
)
self.feature_names = list(range_data.columns)
return range_data

def _vectorize(self, s):
return Parallel(n_jobs=self.n_jobs)(
delayed(self._encode_range)(x) for x in s
)

@staticmethod
@lru_cache(maxsize=32)
def _encode_range(range_str):
splits = range_str[1:-1].split('-')
range_max = float(splits[-1])
range_min = float('-'.join(splits[:-1]))
return sum([range_min, range_max]) / 2.0

def get_feature_names(self):
return self.feature_names

This is a shortened snippet of the custom transformer for ranges. Please see the full implementation on GitHub at https://github.com/benman1/OpenML-Speed-Dating.

Please pay attention to how the fit() and transform() methods are used. We don't need to do anything in the fit() method, because we always apply the same static rule. The transfer() method applies this rule. We've seen the examples previously. What we do in the transfer() method is to iterate over columns. This transformer also shows the use of the parallelization pattern typical to scikit-learn. Additionally, since these ranges repeat a lot, and there aren't so many, we'll use a cache so that, instead of doing costly string transformations, the range value can be retrieved from memory once the range has been processed once.

An important thing about custom transformers in scikit-learn is that they should inherit from BaseEstimator and TransformerMixin, and implement the fit() and transform() methods. Later on, we will require get_feature_names() so we can find out the names of the features generated.

Deriving higher-order features

Let's implement another transformer. As you may have noticed, we have different types of features that seem to refer to the same personal attributes:

  • Personal preferences
  • Self-assessment
  • Assessment of the other person

It seems clear that differences between any of these features could be significant, such as the importance of sincerity versus how sincere someone assesses a potential partner. Therefore, our next transformer is going to calculate the differences between numerical features. This is supposed to help highlight these differences.

These features are derived from other features, and combine information from two (or potentially more features). Let's see what the NumericDifferenceTransformer feature looks like:

import operator

class NumericDifferenceTransformer(BaseEstimator, TransformerMixin):
def __init__(self, features=None,
suffix='_numdist', op=operator.sub, n_jobs=-1
):
assert isinstance(
features, list
) or features is None
self.features = features
self.suffix = suffix
self.op = op
self.n_jobs = n_jobs

def fit(self, X, y=None):
X = util.convert_input(X)
if self.features is None:
self.features = list(
X.select_dtypes(include='number').columns
)
return self

def _col_name(self, col1, col2):
return str(col1) + '_' + str(col2) + self.suffix

def _feature_pairs(self):
feature_pairs = []
for i, col1 in enumerate(self.features[:-1]):
for col2 in self.features[i+1:]:
feature_pairs.append((col1, col2))
return feature_pairs

def transform(self, X, y=None):
X = util.convert_input(X)

feature_pairs = self._feature_pairs()
columns = Parallel(n_jobs=self.n_jobs)(
delayed(self._col_name)(col1, col2)
for col1, col2 in feature_pairs
)
data_cols = Parallel(n_jobs=self.n_jobs)(
delayed(self.op)(X[col1], X[col2])
for col1, col2 in feature_pairs
)
data = pd.concat(data_cols, axis=1)
data.rename(
columns={i: col for i, col in enumerate(columns)},
inplace=True, copy=False
)
data.index = X.index
return data

def get_feature_names(self):
return self.feature_names

This is a custom transformer that calculates differences between numerical features. Please refer to the full implementation in the repository of the OpenML-Speed-Dating library at https://github.com/benman1/OpenML-Speed-Dating.

This transformer has a very similar structure to RangeTransformer. Please note the parallelization between columns. One of the arguments to the __init__() method is the function that is used to calculate the difference. This is operator.sub() by default. The operator library is part of the Python standard library and implements basic operators as functions. The sub() function does what it sounds like:

import operator
operator.sub(1, 2) == 1 - 2
# True

This gives us a prefix or functional syntax for standard operators. Since we can pass functions as arguments, this gives us the flexibility to specify different operators between columns.

The fit() method this time just collects the names of numerical columns, and we'll use these names in the transform() method.

Combining transformations

We will put these transformers together with ColumnTransformer and the pipeline. However, we'll need to make the association between columns and their transformations. We'll define different groups of columns:

range_cols = [
col for col in X.select_dtypes(include='category')
if X[col].apply(lambda x: x.startswith('[')
if isinstance(x, str) else False).any()
]
cat_columns = list(
set(X.select_dtypes(include='category').columns) - set(range_cols)
)
num_columns = list(
X.select_dtypes(include='number').columns
)

Now we have columns that are ranges, columns that are categorical, and columns that are numerical, and we can assign pipeline steps to them.

In our case, we put this together as follows, first in a preprocessor:

from imblearn.ensemble import BalancedRandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
import category_encoders as ce
import openml_speed_dating_pipeline_steps as pipeline_steps

preprocessor = ColumnTransformer(
transformers=[
('ranges', Pipeline(steps=[
('impute', pipeline_steps.SimpleImputerWithFeatureNames(strategy='constant', fill_value=-1)),
('encode', pipeline_steps.RangeTransformer())
]), range_cols),
('cat', Pipeline(steps=[
('impute', pipeline_steps.SimpleImputerWithFeatureNames(strategy='constant', fill_value='-1')),
('encode', ce.OneHotEncoder(
cols=None, # all features that it given by ColumnTransformer
handle_unknown='ignore',
use_cat_names=True
)
)
]), cat_columns),
('num', pipeline_steps.SimpleImputerWithFeatureNames(strategy='median'), num_columns),
],
remainder='drop', n_jobs=-1
)

And then we'll put the preprocessing in a pipeline, together with the estimator:

def create_model(n_estimators=100):
return Pipeline(
steps=[('preprocessor', preprocessor),
('numeric_differences', pipeline_steps.NumericDifferenceTransformer()),
('feature_selection', SelectKBest(f_classif, k=20)),
('rf', BalancedRandomForestClassifier(
n_estimators=n_estimators,
)
)]
)

Here is the performance in the test set:

from sklearn.metrics import roc_auc_score, confusion_matrix
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
X, y,
test_size=0.33,
random_state=42,
stratify=y
)
clf = create_model(50)
clf.fit(X_train, y_train)
y_predicted = clf.predict(X_test)
auc = roc_auc_score(y_test, y_predicted)
print('auc: {:.3f}'.format(auc))

We get the following performance as an output:

auc: 0.779

This is a very good performance, as you can see comparing it to the leaderboard on OpenML.

How it works...

It is time to explain basic scikit-learn terminology relevant to this recipe. Neither of these concepts corresponds to existing machine learning algorithms, but to composable modules:

  • Transformer (in scikit-learn): A class that is derived from sklearn.base.TransformerMixin; it has fit() and transform() methods. These involve preprocessing steps or feature selection.
  • Predictor: A class that is derived from either sklearn.base.ClassifierMixin or sklearn.base.RegressorMixin; it has fit() and predict() methods. These are machine learning estimators, in other words, classifiers or regressors.
  • Pipeline: An interface that wraps all steps together and gives you a single interface for all steps of the transformation and the resulting estimator. A pipeline again has fit() and predict() methods.

There are a few things to point out regarding our approach. As we said before, we have missing values, so we have to impute (meaning replace) missing values with other values. In this case, we replace missing values with -1. In the case of categorical variables, this will be a new category, while in the case of numerical variables, it will become a special value that the classifier will have to handle.

ColumnTransformer came with version 0.20 of scikit-learn and was a long-awaited feature. Since then, ColumnTransformer can often be seen like this, for example:

from sklearn.compose import ColumnTransformer, make_column_transformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

feature_preprocessing = make_column_transformer(
(StandardScaler(), ['column1', 'column2']),
(OneHotEncoder(), ['column3', 'column4', 'column5'])
)

feature_preprocessing can then be used as usual with the fit(), transform(), and fit_transform() methods:

processed_features = feature_preprocessing.fit_transform(X)

Here, X means our features.

Alternatively, we can put ColumnTransformer as a step into a pipeline, for example, like this:

from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression

make_pipeline
( feature_preprocessing, LogisticRegression()
)

Our classifier is a modified form of the random forest classifier. A random forest is a collection of decision trees, each trained on random subsets of the training data. The balanced random forest classifier (Chen et al.: https://statistics.berkeley.edu/sites/default/files/tech-reports/666.pdf) makes sure that each random subset is balanced between the two classes.

Since NumericDifferenceTransformer can provide lots of features, we will incorporate an extra step of model-based feature selection.

There's more...

You can see the complete example with the speed dating dataset, a few more custom transformers, and an extended imputation class in the GitHub repository of the openml_speed_dating_pipeline_steps library and notebook, on GitHub at https://github.com/PacktPublishing/Artificial-Intelligence-with-Python-Cookbook/blob/master/chapter02/Transforming%20Data%20in%20Scikit-Learn.ipynb.

Both RangeTransformer and NumericDifferenceTransformer could also have been implemented using FunctionTransformer in scikit-learn.

ColumnTransformer is especially handy for pandas DataFrames or NumPy arrays since it allows the specification of different operations for different subsets of the features. However, another option is FeatureUnion, which allows concatenation of the results from different transformations. For yet another way to chain our operations together, please have a look at PandasPicker in our repository.

See also

In this recipe, we used ANOVA f-values for univariate feature selection, which is relatively simple, yet effective. Univariate feature selection methods are usually simple filters or statistical tests that measure the relevance of a feature with regard to the target. There are, however, many different methods for feature selection, and scikit-learn implements a lot of them: https://scikit-learn.org/stable/modules/feature_selection.html.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Get up and running with artificial intelligence in no time using hands-on problem-solving recipes
  • Explore popular Python libraries and tools to build AI solutions for images, text, sounds, and images
  • Implement NLP, reinforcement learning, deep learning, GANs, Monte-Carlo tree search, and much more

Description

Artificial intelligence (AI) plays an integral role in automating problem-solving. This involves predicting and classifying data and training agents to execute tasks successfully. This book will teach you how to solve complex problems with the help of independent and insightful recipes ranging from the essentials to advanced methods that have just come out of research. Artificial Intelligence with Python Cookbook starts by showing you how to set up your Python environment and taking you through the fundamentals of data exploration. Moving ahead, you’ll be able to implement heuristic search techniques and genetic algorithms. In addition to this, you'll apply probabilistic models, constraint optimization, and reinforcement learning. As you advance through the book, you'll build deep learning models for text, images, video, and audio, and then delve into algorithmic bias, style transfer, music generation, and AI use cases in the healthcare and insurance industries. Throughout the book, you’ll learn about a variety of tools for problem-solving and gain the knowledge needed to effectively approach complex problems. By the end of this book on AI, you will have the skills you need to write AI and machine learning algorithms, test them, and deploy them for production.

Who is this book for?

This AI machine learning book is for Python developers, data scientists, machine learning engineers, and deep learning practitioners who want to learn how to build artificial intelligence solutions with easy-to-follow recipes. You’ll also find this book useful if you’re looking for state-of-the-art solutions to perform different machine learning tasks in various use cases. Basic working knowledge of the Python programming language and machine learning concepts will help you to work with code effectively in this book.

What you will learn

  • Implement data preprocessing steps and optimize model hyperparameters
  • Delve into representational learning with adversarial autoencoders
  • Use active learning, recommenders, knowledge embedding, and SAT solvers
  • Get to grips with probabilistic modeling with TensorFlow probability
  • Run object detection, text-to-speech conversion, and text and music generation
  • Apply swarm algorithms, multi-agent systems, and graph networks
  • Go from proof of concept to production by deploying models as microservices
  • Understand how to use modern AI in practice

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Oct 30, 2020
Length: 468 pages
Edition : 1st
Language : English
ISBN-13 : 9781789137965
Category :
Languages :

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
OR
Modal Close icon
Payment Processing...
tick Completed

Billing Address

Product Details

Publication date : Oct 30, 2020
Length: 468 pages
Edition : 1st
Language : English
ISBN-13 : 9781789137965
Category :
Languages :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 132.97
40 Algorithms Every Programmer Should Know
$49.99
Artificial Intelligence with Python Cookbook
$43.99
Python Machine Learning by Example
$38.99
Total $ 132.97 Stars icon

Table of Contents

12 Chapters
Getting Started with Artificial Intelligence in Python Chevron down icon Chevron up icon
Advanced Topics in Supervised Machine Learning Chevron down icon Chevron up icon
Patterns, Outliers, and Recommendations Chevron down icon Chevron up icon
Probabilistic Modeling Chevron down icon Chevron up icon
Heuristic Search Techniques and Logical Inference Chevron down icon Chevron up icon
Deep Reinforcement Learning Chevron down icon Chevron up icon
Advanced Image Applications Chevron down icon Chevron up icon
Working with Moving Images Chevron down icon Chevron up icon
Deep Learning in Audio and Speech Chevron down icon Chevron up icon
Natural Language Processing Chevron down icon Chevron up icon
Artificial Intelligence in Production Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.9
(7 Ratings)
5 star 85.7%
4 star 14.3%
3 star 0%
2 star 0%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by




Isaac Dec 03, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
There are a number of survey books on the market for doing AI with python. I own many of them. Auffarth's approach here is unique in that it takes a do then explain the approach. The chapters are well structured with cookbook examples followed by in-depth explanations. What the author calls 'how to do it' then 'how it works.' The breath covered in the book is also impressive. everything from setting up a basic environment to deploying models into production. Applications range from typical deep learning stuff (like predicting housing prices) to image processing, audio applications, and fraud detection. This has become my latest 'go-to' book when I want to explore a new AI topic. I would recommend it to any data scientist looking to take their skills to the next level.
Amazon Verified review Amazon
Eric D. Weber Apr 26, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This book has the right mixture of breadth and depth, which is extremely challenging in the (growing) group of AI literature. What I enjoy the most is that you can both learn a lot of Python and/or learn a lot about ML and Experimentation, depending on your focus. Maybe most important, it allows those newer or intermediate in the field to scale up quickly. For beginners and practitioners, there is a lot of value to be found in the 400+ pages.
Amazon Verified review Amazon
Lucinda Linde Feb 19, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
"Example isn’t another way to teach, it is the only way to teach." ---Albert EinsteinDisclaimer: This review has been requested by the publisher, and I am giving my honest review of this book. This review is based on reading the book. As with any Cookbook, the proof is in the pudding. I intend to use some of these recipes on my own data sets to see if the techniques help me get better results.OverviewArtificial Intelligence with Python Cookbook by Ben Auffarth, is a jam-packed masterclass in applying AI to a selection of important business problems. The book will be helpful to someone who had made many models, and wants to up their game to using more sophisticated AI.What I love about this book:The examples are very relevant. There are over 20 examples which range from detecting anomalies as in Fraud Detection to Time Series to Customer Lifetime Value to modelling spread of disease and many more. These are all applications that have proven business or social ROI.Chapter 1 is filled with helpful tips to set up Google Colab and Jupyter Notebook, as well as ways to write better code. These include speeding up and parallelizing code, putting in progress bars, auto-reloading packages and more. There are also really useful visualization techniques such as correlation heat maps and pair plot. These are fast ways to get a bird's-eye view of exploratory data analysis (EDA).Each chapter is its own guidebook to using a particular technique. For example, I recently did a project on Time Series and got ok results. Chapter 2 gives me additional ways to utilize SARIMAX, which I will try out. There are also scripts using FBProphet, which may be a better way to address Time Series. The code examples are in the book, but also on GitHub for easy copy-and-pasting. The book explains the rationale behind the code.The examples used from chapter to chapter build on each other. The examples start with the basic, such as the iris classification problem, and model using a variety of neural networks. There are chapters on NLP, machine vision, audio files and even deploying AI in production.The visualizations are very helpful to understanding the insights that are being generated from the code. Having the code to make those visualizations is extremely helpful.I will use this book as a reference and starting point for trying out techniques I haven't before. Can't wait to try out the examples in the book.What I don't like about the bookFor topics that I am really unprepared to absorb, the book gives links to material that can cover the basics. It makes me wonder if this book could perhaps be the springboard to multiple books each of which takes a few techniques, and brings the reader up to speed more gently, with examples of increasing complexity so the core concepts can be learned.Overall, "Artificial Intelligence with Python Cookbook" is very well thought out and a fantastic one place to start upping one's game in applying AI to a variety of valuable applications.
Amazon Verified review Amazon
Andreas Mueller Dec 09, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
I've been impressed by the wide overview of the book, which really spans the gamut of what AI means, from classification to search algorithms and A/B testing. The book focuses on some standard tools but also branches out to surface some lesser known libraries that can come in handy.While 468 pages can only give a taste of each topic, the book is jam-packed with examples and serves as a good starting point with plenty of references.
Amazon Verified review Amazon
Murphy Choy Aug 26, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
The book is a great recipe book for folks who want quick results without going through reams of content. I love the easy to read structure of this book.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

How do I buy and download an eBook? Chevron down icon Chevron up icon

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website? Chevron down icon Chevron up icon

If you want to purchase a video course, eBook or Bundle (Print+eBook) please follow below steps:

  1. Register on our website using your email address and the password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Cart, or PayPal)
Where can I access support around an eBook? Chevron down icon Chevron up icon
  • If you experience a problem with using or installing Adobe Reader, the contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats do Packt support? Chevron down icon Chevron up icon

Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks? Chevron down icon Chevron up icon
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower price than print
  • They save resources and space
What is an eBook? Chevron down icon Chevron up icon

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply login to your account and click on the link in Your Download Area. We recommend you saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.