Chapter 8: Tips and Tricks of the Trade
Activity 21: Classifying Images using InceptionV3
Solution:
- Create functions to get the images and their labels. Here, the PATH variable contains the path to the training dataset:
from PIL import Image
def get_input(file):
    return Image.open(PATH+file)
def get_output(file):
    class_label = file.split('.')[0]
    if class_label == 'dog': label_vector = [1,0]
    elif class_label == 'cat': label_vector = [0,1]
    return label_vector
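As a quick sanity check, you can call get_output on a couple of hypothetical filenames that follow the dog.<id>.jpg / cat.<id>.jpg naming convention used by this dataset:
# Hypothetical filenames; any name following the <class>.<id>.jpg pattern works
print(get_output('dog.0.jpg'))  # [1, 0]
print(get_output('cat.0.jpg'))  # [0, 1]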
- Set SIZE and CHANNELS. SIZE is the dimension of the square input images. CHANNELS is the number of channels in the training images; there are 3 channels in an RGB image.
SIZE = 200
CHANNELS = 3
- Create a function to preprocess the images by resizing and normalizing them:
import numpy as np

def preprocess_input(image):
    # Data preprocessing: resize to the target dimensions
    image = image.resize((SIZE, SIZE))
    image = np.array(image).reshape(SIZE, SIZE, CHANNELS)

    # Normalize pixel values to the [0, 1] range
    image = image / 255.0

    return image
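To verify that the preprocessing behaves as expected, a minimal check (the filename below is only a placeholder for any file in the training directory) is:
# The preprocessed image should have shape (SIZE, SIZE, CHANNELS) with values in [0, 1]
sample = preprocess_input(get_input('dog.0.jpg'))  # placeholder filename
print(sample.shape)                # (200, 200, 3)
print(sample.min(), sample.max())  # both within the [0, 1] range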
- Finally, develop the generator that will generate the batches:
import numpy as np
def custom_image_generator(images, batch_size = 128):

    while True:
        # Randomly select images for the batch
        batch_images = np.random.choice(images, size = batch_size)
        batch_input = []
        batch_output = []

        # Read each image, preprocess it, and get its label
        for file in batch_images:
            # Function that reads and returns the image
            input_image = get_input(file)
            # Function that gets the label of the image
            label = get_output(file)
            # Function that preprocesses the image
            image = preprocess_input(input_image)

            batch_input.append(image)
            batch_output.append(label)

        batch_x = np.array(batch_input)
        batch_y = np.array(batch_output)

        # Yield a tuple of (images, labels) to feed the network
        yield (batch_x, batch_y)
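Before wiring the generator into training, it can help to pull a single batch and confirm the array shapes; a minimal sketch, assuming train is the list of training filenames created a few steps below:
# Fetch one batch from the generator and inspect the shapes
batch_x, batch_y = next(custom_image_generator(train, batch_size = 4))
print(batch_x.shape)  # (4, 200, 200, 3)
print(batch_y.shape)  # (4, 2)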
- Next, we will read the validation data. Create a function to read the images and their labels:
from tqdm import tqdm
def get_data(files):
    data_image = []
    labels = []
    for image in tqdm(files):
        label_vector = get_output(image)

        img = Image.open(PATH + image)
        img = img.resize((SIZE, SIZE))

        labels.append(label_vector)
        img = np.asarray(img).reshape(SIZE, SIZE, CHANNELS)
        img = img / 255.0
        data_image.append(img)

    data_x = np.array(data_image)
    data_y = np.array(labels)

    return (data_x, data_y)
- Read the validation files. Shuffle the file list, split it into training and test sets, and load the test files as the validation data:
import os
import random

files = os.listdir(PATH)
random.shuffle(files)
train = files[:7000]
test = files[7000:]
validation_data = get_data(test)
- Plot a few images from the dataset to see whether you loaded the files correctly:
import matplotlib.pyplot as plt
plt.figure(figsize=(20,10))
columns = 5
for i in range(columns):
    plt.subplot(5 // columns + 1, columns, i + 1)
    plt.imshow(validation_data[0][i])
A random sample of the images is shown here:
Figure 8.16: Sample images from the loaded dataset
- Load the Inception model and pass the shape of the input images:
from keras.applications.inception_v3 import InceptionV3
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(SIZE,SIZE,CHANNELS))
- Add the output dense layer according to our problem:
from keras.layers import GlobalAveragePooling2D, Dense, Dropout
from keras.models import Model
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(2, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
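To see how the new classification head sits on top of the InceptionV3 base, you can inspect the assembled model before compiling it:
# Layer-by-layer overview of the combined model, including output shapes
model.summary()
print(len(model.layers))  # total number of layers in the base plus the new head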
- Next, compile the model to make it ready for training:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics = ['accuracy'])
Then, train the model for 50 epochs with a batch size of 128:
EPOCHS = 50
BATCH_SIZE = 128
model_details = model.fit_generator(custom_image_generator(train, batch_size = BATCH_SIZE),
                    steps_per_epoch = len(train) // BATCH_SIZE,
                    epochs = EPOCHS,
                    validation_data= validation_data,
                    verbose=1)
- Evaluate the model and get the accuracy:
score = model.evaluate(validation_data[0], validation_data[1])
print("Accuracy: {0:.2f}%".format(score[1]*100))
The accuracy is as follows:
Figure 8.17: Model accuracy
Activity 22: Using Transfer Learning to Predict Images
Solution:
- First, set the random number seed so that the results are reproducible:
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(1)
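Note that from tensorflow import set_random_seed is the TensorFlow 1.x API; if you are running TensorFlow 2.x, the equivalent call (not used in the original solution) is:
# TensorFlow 2.x replacement for set_random_seed
import tensorflow as tf
tf.random.set_seed(1)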
- Set SIZE and CHANNELS. SIZE is the dimension of the square input images. CHANNELS is the number of channels in the training images; there are 3 channels in an RGB image.
SIZE = 200
CHANNELS = 3
- Create functions to get the images and their labels. Here, the PATH variable contains the path to the training dataset:
from PIL import Image
def get_input(file):
    return Image.open(PATH+file)
def get_output(file):
    class_label = file.split('.')[0]
    if class_label == 'dog': label_vector = [1,0]
    elif class_label == 'cat': label_vector = [0,1]
    return label_vector
- Create a function to preprocess the images by resizing and normalizing them:
import numpy as np

def preprocess_input(image):
    # Data preprocessing: resize to the target dimensions
    image = image.resize((SIZE, SIZE))
    image = np.array(image).reshape(SIZE, SIZE, CHANNELS)

    # Normalize pixel values to the [0, 1] range
    image = image / 255.0

    return image
- Finally, create the generator that will generate the batches:
import numpy as np
def custom_image_generator(images, batch_size = 128):

    while True:
        # Randomly select images for the batch
        batch_images = np.random.choice(images, size = batch_size)
        batch_input = []
        batch_output = []

        # Read each image, preprocess it, and get its label
        for file in batch_images:
            # Function that reads and returns the image
            input_image = get_input(file)
            # Function that gets the label of the image
            label = get_output(file)
            # Function that preprocesses the image
            image = preprocess_input(input_image)

            batch_input.append(image)
            batch_output.append(label)

        batch_x = np.array(batch_input)
        batch_y = np.array(batch_output)

        # Yield a tuple of (images, labels) to feed the network
        yield (batch_x, batch_y)
- Next, we will read the development and test data. Create a function to read the images and their labels:
from tqdm import tqdm
def get_data(files):
    data_image = []
    labels = []
    for image in tqdm(files):
        label_vector = get_output(image)

        img = Image.open(PATH + image)
        img = img.resize((SIZE, SIZE))

        labels.append(label_vector)
        img = np.asarray(img).reshape(SIZE, SIZE, CHANNELS)
        img = img / 255.0
        data_image.append(img)

    data_x = np.array(data_image)
    data_y = np.array(labels)

    return (data_x, data_y)
- Now read the development and test files. The split for the train/dev/test set is 70%/15%/15%.
import os
import random

files = os.listdir(PATH)
random.shuffle(files)
train = files[:7000]
development = files[7000:8500]
test = files[8500:]
development_data = get_data(development)
test_data = get_data(test)
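A quick check confirms the split sizes; with the slice indices above, the 70%/15%/15% proportions imply a file list of 10,000 images:
# Verify the train/dev/test split sizes
print(len(train), len(development), len(test))  # expected: 7000 1500 1500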
- Plot a few images from the dataset to see whether you loaded the files correctly:
import matplotlib.pyplot as plt
plt.figure(figsize=(20,10))
columns = 5
for i in range(columns):
    plt.subplot(5 // columns + 1, columns, i + 1)
    plt.imshow(development_data[0][i])
Check the output in the following screenshot:
Figure 8.18: Sample images from the loaded dataset
- Load the Inception model and pass the shape of the input images:
from keras.applications.inception_v3 import InceptionV3
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(200,200,3))
- Add the output dense layer according to our problem:
from keras.models import Model
from keras.layers import GlobalAveragePooling2D, Dense, Dropout
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(256, activation='relu')(x)
keep_prob = 0.5
x = Dropout(rate = 1 - keep_prob)(x)
predictions = Dense(2, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
- This time around, we will freeze the first five layers of the model to reduce the training time:
for layer in base_model.layers[:5]:
    layer.trainable = False
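To confirm that the freeze took effect, you can count the non-trainable layers of the combined model:
# Count the layers that are now frozen (trainable = False)
frozen = sum(1 for layer in model.layers if not layer.trainable)
print(frozen)  # should report the 5 frozen base layers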
- Compile the model to make it ready for training:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics = ['accuracy'])
- Create callbacks for Keras:
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping, TensorBoard
callbacks = [
    TensorBoard(log_dir='./logs',
                update_freq='epoch'),
    EarlyStopping(monitor = "val_loss",
                 patience = 18,
                 verbose = 1,
                 min_delta = 0.001,
                 mode = "min"),
    ReduceLROnPlateau(monitor = "val_loss",
                     factor = 0.2,
                     patience = 8,
                     verbose = 1,
                     mode = "min"),
    ModelCheckpoint(monitor = "val_loss",
                   filepath = "Dogs-vs-Cats-InceptionV3-{epoch:02d}-{val_loss:.2f}.hdf5",
                   save_best_only=True,
                   period = 1)]
Note
Here, we are making use of four callbacks: TensorBoard, EarlyStopping, ReduceLROnPlateau, and ModelCheckpoint.
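With the settings above, TensorBoard writes the metrics of every epoch to ./logs, EarlyStopping halts training if val_loss does not improve by at least 0.001 for 18 consecutive epochs, ReduceLROnPlateau multiplies the learning rate by 0.2 when val_loss has plateaued for 8 epochs, and ModelCheckpoint saves the model with the best val_loss so far to the HDF5 file named in filepath.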
- Train the model. Here, we train for only 50 epochs with a batch size of 128:
EPOCHS = 50
BATCH_SIZE = 128
model_details = model.fit_generator(custom_image_generator(train, batch_size = BATCH_SIZE),
                   steps_per_epoch = len(train) // BATCH_SIZE,
                   epochs = EPOCHS,
                   callbacks = callbacks,
                   validation_data= development_data,
                   verbose=1)
The training logs on TensorBoard are shown here:
Figure 8.19: Training set logs from TensorBoard
- You can now fine-tune the hyperparameters, taking the accuracy of the development set as the metric.
The logs of the development set from the TensorBoard tool are shown here:
Figure 8.20: Validation set logs from TensorBoard
The learning rate decrease can be observed from the following plot:
Figure 8.21: Learning rate log from TensorBoard
- Evaluate the model on the test set and get the accuracy:
score = model.evaluate(test_data[0], test_data[1])
print("Accuracy: {0:.2f}%".format(score[1]*100))
The output is shown in the following screenshot:
Figure 8.22: The final accuracy of the model on the test set
As you can see, the model achieves an accuracy of 93.6% on the test set, which differs from the accuracy on the development set (93.3% from the TensorBoard training logs). The EarlyStopping callback stopped training when there was no significant improvement in the development set loss, which saved us some time. The learning rate was reduced after nine epochs, which helped training, as can be seen in Figure 8.21.