Creating a binary classifier to detect smiles
In its most basic form, image classification consists of discerning between two classes, or signaling the presence or absence of some trait. In this recipe, we'll implement a binary classifier that tells us whether a person in a photo is smiling.
Let's begin, shall we?
Getting ready
You'll need to install Pillow, which is very easy with pip:
$> pip install Pillow
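The recipe also imports TensorFlow 2.x, scikit-learn, and NumPy. If your environment doesn't already have them, they can be installed the same way (pinning exact versions is left to you; any recent release is assumed to work):
$> pip install tensorflow scikit-learn numpy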
We'll use the SMILEs dataset, located here: https://github.com/hromi/SMILEsmileD. Clone or download a zipped version of the repository to a location of your preference. In this recipe, we assume the data is inside the ~/.keras/datasets directory, under the name SMILEsmileD-master.
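If you want to confirm that the files ended up where the recipe expects them, a quick sanity check like the following can help (the directory layout below simply mirrors the path described above; adjust it if you unpacked the repository elsewhere):
import pathlib

# Assumed location, matching the setup described in this recipe.
dataset_dir = (pathlib.Path.home() / '.keras' / 'datasets' /
               'SMILEsmileD-master' / 'SMILEs')
print(dataset_dir.exists())                    # Should print True.
print(len(list(dataset_dir.rglob('*.jpg'))))   # Rough count of images found.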
Let's get started!
How to do it…
Follow these steps to train a smile classifier from scratch on the SMILEs dataset:
- Import all necessary packages:
import os
import pathlib
import glob

import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras import Model
from tensorflow.keras.layers import *
from tensorflow.keras.preprocessing.image import *
- Define a function to load the images and labels from a list of file paths:
def load_images_and_labels(image_paths):
    images = []
    labels = []

    for image_path in image_paths:
        image = load_img(image_path,
                         target_size=(32, 32),
                         color_mode='grayscale')
        image = img_to_array(image)

        label = image_path.split(os.path.sep)[-2]
        label = 'positive' in label
        label = float(label)

        images.append(image)
        labels.append(label)

    return np.array(images), np.array(labels)
Notice that we are loading the images in grayscale, and we're encoding the labels by checking whether the word positive is in the file path of the image.
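For example, this is how that encoding behaves on a couple of hypothetical file paths (the folder and file names below are illustrative, not taken from the dataset listing):
import os

# Hypothetical paths; the label comes from the containing folder name.
paths = [os.path.join('SMILEs', 'positives', 'positives7', '10007.jpg'),
         os.path.join('SMILEs', 'negatives', 'negatives7', '10001.jpg')]
for image_path in paths:
    label = image_path.split(os.path.sep)[-2]   # Containing folder.
    print(image_path, '->', float('positive' in label))
# Prints 1.0 for the first path and 0.0 for the second.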
- Define a function to build the neural network. This model's structure is based on LeNet (you can find a link to LeNet's paper in the See also section):
def build_network():
    input_layer = Input(shape=(32, 32, 1))
    x = Conv2D(filters=20,
               kernel_size=(5, 5),
               padding='same',
               strides=(1, 1))(input_layer)
    x = ELU()(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(x)
    x = Dropout(0.4)(x)

    x = Conv2D(filters=50,
               kernel_size=(5, 5),
               padding='same',
               strides=(1, 1))(x)
    x = ELU()(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(x)
    x = Dropout(0.4)(x)

    x = Flatten()(x)
    x = Dense(units=500)(x)
    x = ELU()(x)
    x = Dropout(0.4)(x)

    output = Dense(1, activation='sigmoid')(x)

    model = Model(inputs=input_layer, outputs=output)

    return model
Because this is a binary classification problem, a single Sigmoid-activated neuron is enough in the output layer.
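To make that concrete, the model outputs one probability per image, which we can turn into a hard class by thresholding at 0.5 (the conventional cut-off; the probabilities below are made up purely for illustration):
import numpy as np

# Made-up sigmoid outputs for three images, shaped like the model's output.
probabilities = np.array([[0.91], [0.07], [0.55]])
predictions = (probabilities.flatten() > 0.5).astype(float)
print(predictions)   # [1. 0. 1.] -> smiling, not smiling, smiling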
- Load the image paths into a list:
files_pattern = (pathlib.Path.home() / '.keras' / 'datasets' /
                 'SMILEsmileD-master' / 'SMILEs' / '*' / '*' / '*.jpg')
files_pattern = str(files_pattern)
dataset_paths = [*glob.glob(files_pattern)]
- Use the load_images_and_labels() function defined previously to load the dataset into memory:
X, y = load_images_and_labels(dataset_paths)
- Normalize the images and compute the number of positive, negative, and total examples in the dataset:
X /= 255.0
total = len(y)
total_positive = np.sum(y)
total_negative = total - total_positive
- Create train, test, and validation subsets of the data:
(X_train, X_test,
 y_train, y_test) = train_test_split(X, y,
                                     test_size=0.2,
                                     stratify=y,
                                     random_state=999)
(X_train, X_val,
 y_train, y_val) = train_test_split(X_train, y_train,
                                    test_size=0.2,
                                    stratify=y_train,
                                    random_state=999)
- Instantiate the model and compile it:
model = build_network()
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
- Train the model. Because the dataset is unbalanced, we are assigning weights to each class proportional to the number of positive and negative images in the dataset:
BATCH_SIZE = 32
EPOCHS = 20
model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=EPOCHS,
          batch_size=BATCH_SIZE,
          class_weight={
              1.0: total / total_positive,
              0.0: total / total_negative
          })
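The weights above are computed by hand from the class counts. As an optional alternative (not what this recipe uses), scikit-learn can derive comparable weights for you; its 'balanced' heuristic scales both weights by the same constant factor, so the relative emphasis on each class is unchanged:
from sklearn.utils.class_weight import compute_class_weight

# Optional alternative: 'balanced' yields total / (2 * class_count),
# i.e., exactly half of the values used above for both classes.
weights = compute_class_weight(class_weight='balanced',
                               classes=np.unique(y_train),
                               y=y_train)
balanced_class_weight = {float(cls): weight
                         for cls, weight in zip(np.unique(y_train), weights)}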
- Evaluate the model on the test set:
test_loss, test_accuracy = model.evaluate(X_test, y_test)
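It can also be handy to print those metrics explicitly and to sanity-check the model on a single test image; a quick sketch (index 0 is an arbitrary choice):
print(f'Test loss: {test_loss:.4f} - test accuracy: {test_accuracy:.4f}')

# Sanity check on one test image; expand_dims adds the batch dimension.
sample = np.expand_dims(X_test[0], axis=0)   # Shape: (1, 32, 32, 1).
probability = model.predict(sample)[0][0]
print('Smiling' if probability > 0.5 else 'Not smiling',
      f'(probability={probability:.2f}, true label={y_test[0]})')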
After 20 epochs, the network should get around 90% accuracy on the test set. In the following section, we'll explain the previous steps.
How it works…
We just trained a network to determine whether a person in a picture is smiling or not. Our first big task was to take the images in the dataset and load them into a format suitable for our neural network. Specifically, the load_images_and_labels() function is in charge of loading an image in grayscale, resizing it to 32x32x1, and then converting it into a numpy array. To extract the label, we looked at the containing folder of each image: if it contained the word positive, we encoded the label as 1; otherwise, we encoded it as 0 (a trick we used here was casting a Boolean as a float, like this: float(label)).
Next, we built the neural network, which is inspired by the LeNet architecture. The biggest takeaway here is that because this is a binary classification problem, we can use a single Sigmoid-activated neuron to discern between the two classes.
We then took 20% of the images to comprise our test set, and from the remaining 80% we took an additional 20% to create our validation set. With these three subsets in place, we proceeded to train the network over 20 epochs, using binary_crossentropy as our loss function and rmsprop as the optimizer.
To account for the imbalance in the dataset (out of the 13,165 images, only 3,690 contain smiling people, while the remaining 9,475 do not), we passed a class_weight dictionary where we assigned a weight inversely proportional to the number of instances of each class in the dataset, effectively forcing the model to pay more attention to the 1.0 class, which corresponds to smiles.
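Plugging in those numbers, the weights passed through class_weight work out to roughly 13,165 / 3,690 ≈ 3.57 for the smiling class and 13,165 / 9,475 ≈ 1.39 for the non-smiling class, so each smiling example contributes about 2.6 times more to the loss:
# The effective weights, computed from the class counts stated above.
total, total_positive, total_negative = 13165, 3690, 9475
print(total / total_positive)   # ~3.57 (weight for the 1.0, smiling class)
print(total / total_negative)   # ~1.39 (weight for the 0.0, non-smiling class)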
Finally, we achieved around 90.5% accuracy on the test set.
See also
For more information on the SMILEs dataset, you can visit the official GitHub repository here: https://github.com/hromi/SMILEsmileD. You can read the LeNet paper here (it's pretty long, though): http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf.