You're reading from Applied Unsupervised Learning with Python Discover hidden patterns and relationships in unstructured data with Python

Product type Paperback

Published in May 2019

ISBN-13 9781789952292

Length 482 pages

Edition 1st Edition

Authors (3):

Benjamin Johnston

Christopher Kruger

Aaron Jones

Chapter 5: Autoencoders

Activity 8: Modeling Neurons with a ReLU Activation Function

Solution:

Import numpy and matplotlib:
```
import numpy as np
import matplotlib.pyplot as plt
```
Allow LaTeX symbols to be used in labels (this requires a working LaTeX installation):
```
plt.rc('text', usetex=True)
```
Define the ReLU activation function as a Python function:
```
def relu(x):
    # Element-wise maximum of 0 and x; works for scalars and arrays alike
    return np.maximum(0, x)
```
Define the inputs (x) and tunable weights (theta) for the neuron. In this example, the inputs (x) will be 100 numbers linearly spaced between -5 and 5. Set theta = 1:
```
theta = 1
x = np.linspace(-5, 5, 100)
x
```
The output is as follows:
Figure 5.35: Printing the inputs
Compute the output (y):
```
y = [relu(_x * theta) for _x in x]
```

Plot the output of the neuron versus the input. Raw strings are used for the labels so that Python does not treat the LaTeX backslashes as escape sequences:
```
fig = plt.figure(figsize=(10, 7))
ax = fig.add_subplot(111)

ax.plot(x, y)
ax.set_xlabel(r'$x$', fontsize=22);
ax.set_ylabel(r'$h(x\Theta)$', fontsize=22);
ax.spines['left'].set_position(('data', 0));
ax.spines['top'].set_visible(False);
ax.spines['right'].set_visible(False);
ax.tick_params(axis='both', which='major', labelsize=22)
```

The output is as follows:

Figure 5.36: Plot of the neuron versus input

Now, set theta = 5 and recompute and store the output of the neuron:
```
theta = 5
y_2 = [relu(_x * theta) for _x in x]
```
Now, set theta = 0.2 and recompute and store the output of the neuron:
```
theta = 0.2
y_3 = [relu(_x * theta) for _x in x]
```

Plot the three different output curves of the neuron (theta = 1, theta = 5, theta = 0.2) on one graph:
```
fig = plt.figure(figsize=(10, 7))
ax = fig.add_subplot(111)

ax.plot(x, y, label=r'$\Theta=1$');
ax.plot(x, y_2, label=r'$\Theta=5$', linestyle=':');
ax.plot(x, y_3, label=r'$\Theta=0.2$', linestyle='--');
ax.set_xlabel(r'$x$', fontsize=22);  # the horizontal axis is the raw input x
ax.set_ylabel(r'$h(x\Theta)$', fontsize=22);
ax.spines['left'].set_position(('data', 0));
ax.spines['top'].set_visible(False);
ax.spines['right'].set_visible(False);
ax.tick_params(axis='both', which='major', labelsize=22);
ax.legend(fontsize=22);
```

The output is as follows:

Figure 5.37: Three output curves of the neuron

In this activity, we created a model of a ReLU-based artificial neural network neuron. The output of this neuron is very different from that of the sigmoid activation function. There is no saturation region for inputs greater than 0 because the function simply returns its input. In the negative direction, there is a saturation region: the function returns 0 for every input less than 0. ReLU is a powerful and commonly used activation function that has been shown to outperform the sigmoid function in some circumstances, and it is often a good first-choice activation function.
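
The contrast in saturation behavior can be checked numerically. The following is a minimal sketch rather than part of the original solution; the `sigmoid` helper and the sample `inputs` are assumptions introduced purely for illustration:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: 0 for negative inputs, the input itself otherwise."""
    return np.maximum(0, x)

def sigmoid(x):
    """Logistic sigmoid (illustrative helper), squashing inputs into (0, 1)."""
    return 1 / (1 + np.exp(-x))

# Sample inputs chosen for illustration only
inputs = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])

relu_out = relu(inputs)
sigmoid_out = sigmoid(inputs)

for value, r, s in zip(inputs, relu_out, sigmoid_out):
    print(f"x = {value:6.1f}   relu = {r:5.1f}   sigmoid = {s:.5f}")
```

For large positive inputs, the sigmoid output flattens out close to 1 while the ReLU output keeps growing with the input; for negative inputs, ReLU is pinned at 0.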

Activity 9: MNIST Neural Network

Solution:

In this activity, you will train a neural network to identify images in the MNIST dataset and reinforce your skills in training neural networks:

Import pickle, numpy, matplotlib, and the Sequential and Dense classes from Keras:
```
import pickle
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
```

Load the mnist.pkl file, which contains the first 10,000 images and corresponding labels from the MNIST dataset that are available in the accompanying source code. The MNIST dataset is a series of 28 x 28 grayscale images of handwritten digits 0 through 9. Extract the images and labels:
```
with open('mnist.pkl', 'rb') as f:
    data = pickle.load(f)
    
images = data['images']
labels = data['labels']
```

Plot the first 10 samples along with the corresponding labels:
```
plt.figure(figsize=(10, 7))
for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(images[i], cmap='gray')
    plt.title(labels[i])
    plt.axis('off')
```

The output is as follows:

Figure 5.38: First 10 samples

Encode the labels using one hot encoding:
```
one_hot_labels = np.zeros((images.shape[0], 10))

for idx, label in enumerate(labels):
    one_hot_labels[idx, label] = 1

one_hot_labels
```

The output is as follows:

Figure 5.39: Result of one hot encoding
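
As an aside, the same encoding can be produced without an explicit loop. This is a sketch of one possible alternative, not the book's method; `sample_labels` is a hypothetical stand-in for the MNIST `labels` array:

```python
import numpy as np

# Hypothetical labels standing in for the MNIST labels array
sample_labels = np.array([5, 0, 4, 1, 9])

# Row k of the 10 x 10 identity matrix is the one-hot vector for class k,
# so indexing the identity matrix with the labels encodes them all at once
sample_one_hot = np.eye(10)[sample_labels]

print(sample_one_hot.shape)
print(sample_one_hot[0])
```

Each row contains a single 1 in the column matching its label, which is the same result the loop produces.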

Prepare the images for input into a neural network. As a hint, there are two separate steps in this process:
```
images = images.reshape((-1, 28 ** 2))
images = images / 255.
```
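
To make the two preparation steps concrete, the sketch below applies them to a small synthetic array. The `fake_images` data is an assumption used only so the example runs without `mnist.pkl`:

```python
import numpy as np

# Stand-in for the real data: 3 random 28 x 28 images with integer pixel values 0-255
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(3, 28, 28))

# Step 1: flatten each 28 x 28 image into a 784-element row vector
flat = fake_images.reshape((-1, 28 ** 2))

# Step 2: rescale pixel values from the range [0, 255] to [0, 1]
scaled = flat / 255.

print(flat.shape)
print(scaled.min(), scaled.max())
```

Flattening matches the 784-element input expected by a dense layer, and rescaling keeps the inputs in a small, consistent range.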
Construct a neural network model in Keras that accepts the prepared images, has a hidden layer of 600 units with a ReLU activation function, and an output of the same number of units as classes. The output layer uses a softmax activation function:
```
model = Sequential([
    Dense(600, input_shape=(784,), activation='relu'),
    Dense(10, activation='softmax'),
])
```
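
As a rough check on the size of this network, the number of trainable parameters can be worked out by hand. The helper `dense_params` below is a hypothetical name introduced for this sketch; it counts one weight per input-unit pair plus one bias per unit:

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 600)   # 28 x 28 pixels feeding 600 hidden units
output = dense_params(600, 10)    # 600 hidden units feeding 10 classes

print(hidden, output, hidden + output)  # 471000 6010 477010
```

These figures should agree with the totals reported by `model.summary()` for the same architecture.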
Compile the model using multiclass cross-entropy, stochastic gradient descent, and an accuracy performance metric:
```
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
```
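
To illustrate what the softmax output and the multiclass cross-entropy loss compute, here is a small NumPy sketch. It is a simplified illustration under assumed toy values, not the Keras implementation; the function names, `logits`, and `targets` are all chosen for this example:

```python
import numpy as np

def softmax(logits):
    """Convert each row of raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(one_hot_targets, probabilities, eps=1e-12):
    """Mean negative log-probability assigned to the correct class."""
    per_sample = -np.sum(one_hot_targets * np.log(probabilities + eps), axis=1)
    return per_sample.mean()

# Toy scores for two samples and three classes (illustrative values only)
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
targets = np.array([[1, 0, 0],
                    [0, 0, 1]])

probs = softmax(logits)
loss = categorical_crossentropy(targets, probs)
print(probs.round(3))
print(loss)
```

The first sample assigns most probability to the correct class and contributes little to the loss; the second assigns low probability to its true class and is penalized more heavily.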
Train the model. How many epochs are required to achieve at least 95% classification accuracy on the training data? Let's have a look:
```
model.fit(images, one_hot_labels, epochs=20)
```
The output is as follows:
Figure 5.40: Training the model
In this run, 15 epochs were required to achieve at least 95% classification accuracy on the training set.

In this example, we have measured the performance of the neural network classifier using the data that the classifier was trained with. In general, this method should not be used as it typically reports a higher level of accuracy than one should expect from the model. In supervised learning problems, there are a number of cross-validation techniques that should be used instead. As this is a book on unsupervised learning, cross-validation lies outside the scope of this book.

Activity 10: Simple MNIST Autoencoder

Solution:

Import pickle, numpy, and matplotlib, and the Model, Input, and Dense classes from Keras:
```
import pickle
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Model
from keras.layers import Input, Dense
```

Load the images from the supplied sample of the MNIST dataset that is provided with the accompanying source code (mnist.pkl):
```
with open('mnist.pkl', 'rb') as f:
    images = pickle.load(f)['images']
```
Prepare the images for input into a neural network. As a hint, there are two separate steps in this process:
```
images = images.reshape((-1, 28 ** 2))
images = images / 255.
```

Construct a simple autoencoder network that reduces the image size to 10 x 10 after the encoding stage:
```
input_stage = Input(shape=(784,))
encoding_stage = Dense(100, activation='relu')(input_stage)
decoding_stage = Dense(784, activation='sigmoid')(encoding_stage)
autoencoder = Model(input_stage, decoding_stage)
```

Compile the autoencoder using a binary cross-entropy loss function and adadelta gradient descent:
```
autoencoder.compile(loss='binary_crossentropy',
                    optimizer='adadelta')
```
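
Binary cross-entropy suits this network because the pixel targets are scaled to the range 0 to 1 and the sigmoid output layer produces values in the same range. The following NumPy sketch shows the idea on a handful of made-up pixel values; it is a simplified illustration, not the Keras implementation, and all names and values are assumptions:

```python
import numpy as np

def binary_crossentropy(target, prediction, eps=1e-7):
    """Mean per-pixel binary cross-entropy between targets and predictions in [0, 1]."""
    prediction = np.clip(prediction, eps, 1 - eps)  # avoid taking log(0)
    return -np.mean(target * np.log(prediction)
                    + (1 - target) * np.log(1 - prediction))

# Made-up pixel intensities, already scaled to [0, 1]
target = np.array([0.0, 0.2, 0.9, 1.0])
good_reconstruction = np.array([0.05, 0.25, 0.85, 0.95])
poor_reconstruction = np.array([0.9, 0.8, 0.2, 0.1])

good_loss = binary_crossentropy(target, good_reconstruction)
poor_loss = binary_crossentropy(target, poor_reconstruction)
print(good_loss, poor_loss)
```

The reconstruction that stays close to the target pixels receives the lower loss, which is the signal the optimizer uses during training.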
Fit the autoencoder model:
```
autoencoder.fit(images, images, epochs=100)
```
The output is as follows:
Figure 5.41: Training the model
Calculate and store the output of the encoding stage for the first five samples:
```
encoder_output = Model(input_stage, encoding_stage).predict(images[:5])
```
Reshape the encoder output to 10 x 10 (10 x 10 = 100) pixels and multiply by 255:
```
encoder_output = encoder_output.reshape((-1, 10, 10)) * 255
```
Calculate and store the output of the decoding stage for the first five samples:
```
decoder_output = autoencoder.predict(images[:5])
```
Reshape the output of the decoder to 28 x 28 and multiply by 255:
```
decoder_output = decoder_output.reshape((-1, 28, 28)) * 255
```

Plot the original image, the encoder output, and the decoder output:
```
images = images.reshape((-1, 28, 28))
plt.figure(figsize=(10, 7))
for i in range(5):
    plt.subplot(3, 5, i + 1)
    plt.imshow(images[i], cmap='gray')
    plt.axis('off')

    plt.subplot(3, 5, i + 6)
    plt.imshow(encoder_output[i], cmap='gray')
    plt.axis('off')

    plt.subplot(3, 5, i + 11)
    plt.imshow(decoder_output[i], cmap='gray')
    plt.axis('off')
```

The output is as follows:

Figure 5.42: The original image, the encoder output, and the decoder

So far, we have shown how a simple single hidden layer in both the encoding and decoding stage can be used to reduce the data to a lower dimension space. We can also make this model more complicated by adding additional layers to both the encoding and the decoding stages.

Activity 11: MNIST Convolutional Autoencoder

Solution:

Import pickle, numpy, matplotlib, and the Model class from keras.models and import Input, Conv2D, MaxPooling2D, and UpSampling2D from keras.layers:
```
import pickle
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
```
Load the data:
```
with open('mnist.pkl', 'rb') as f:
    images = pickle.load(f)['images']
```

Rescale the images to have values between 0 and 1:
```
images = images / 255.
```
We need to reshape the images to add a single depth channel for use with convolutional stages. Reshape the images to have a shape of 28 x 28 x 1:
```
images = images.reshape((-1, 28, 28, 1))
```
Define an input layer. We will use the same shape input as an image:
```
input_layer = Input(shape=(28, 28, 1,))
```

Add a convolutional stage with 16 filters, a 3 x 3 weight matrix, a ReLU activation function, and same padding, which means the output has the same height and width as the input image:
```
hidden_encoding = Conv2D(
    16,  # Number of filters
    (3, 3),  # Shape of the weight matrix
    activation='relu',
    padding='same',  # Pad the edges so the output keeps the input height and width
)(input_layer)
```

Add a max pooling layer to the encoder with a 2 x 2 kernel:
```
encoded = MaxPooling2D((2, 2))(hidden_encoding)
```
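
To show what a 2 x 2 max pooling step does to a feature map, here is a small NumPy sketch. The helper `max_pool_2x2` and the sample `feature_map` are assumptions made for illustration; the sketch handles a single 2D array with even dimensions rather than the batched, multi-channel tensors Keras works with:

```python
import numpy as np

def max_pool_2x2(image):
    """2 x 2 max pooling with stride 2 on a 2D array of even height and width."""
    h, w = image.shape
    # Split the array into non-overlapping 2 x 2 blocks, then take each block's maximum
    blocks = image.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([[1, 2, 0, 1],
                        [3, 4, 1, 0],
                        [0, 1, 5, 6],
                        [2, 0, 7, 8]])

pooled = max_pool_2x2(feature_map)
print(pooled)
# [[4 1]
#  [2 8]]
```

Applied to a 28 x 28 feature map, the same operation yields a 14 x 14 output, which is why the encoded representation in this activity is 14 x 14 per filter.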

Add a decoding convolutional layer:
```
hidden_decoding = Conv2D(
    16,  # Number of filters
    (3, 3),  # Shape of the weight matrix
    activation='relu',
    padding='same',  # Pad the edges so the output keeps the input height and width
)(encoded)
```

Add an upsampling layer:
```
upsample_decoding = UpSampling2D((2, 2))(hidden_decoding)
```
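
With its default nearest-neighbor interpolation, a 2 x 2 upsampling layer simply repeats every row and column twice. The sketch below mimics this on a single 2D array; `upsample_2x2` and `small` are hypothetical names used only for this illustration:

```python
import numpy as np

def upsample_2x2(image):
    """Nearest-neighbor upsampling: repeat each row and each column twice."""
    return np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

small = np.array([[4, 1],
                  [2, 8]])

upsampled = upsample_2x2(small)
print(upsampled)
# [[4 4 1 1]
#  [4 4 1 1]
#  [2 2 8 8]
#  [2 2 8 8]]
```

Upsampling restores the spatial size lost in pooling (14 x 14 back to 28 x 28 here) but not the discarded detail, which the following convolutional stage has to learn to reconstruct.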

Add the final convolutional stage, using a single filter to match the depth of the original images:
```
decoded = Conv2D(
    1,  # Number of filters
    (3, 3),  # Shape of the weight matrix
    activation='sigmoid',
    padding='same',  # Pad the edges so the output keeps the input height and width
)(upsample_decoding)
```

Construct the model by passing the first and last layers of the network to the Model class:
```
autoencoder = Model(input_layer, decoded)
```
Display the structure of the model:
```
autoencoder.summary()
```
The output is as follows:
Figure 5.43: Structure of model
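
The shapes and parameter counts in the summary can be reproduced by hand. In the sketch below, `conv2d_params` is a hypothetical helper that counts one weight per kernel position, input channel, and filter, plus one bias per filter; the pooling and upsampling layers have no trainable parameters:

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters in a 2D convolution: kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

# (layer description, output shape per sample, trainable parameters)
layers = [
    ("encoder Conv2D", (28, 28, 16), conv2d_params(3, 3, 1, 16)),
    ("MaxPooling2D", (14, 14, 16), 0),
    ("decoder Conv2D", (14, 14, 16), conv2d_params(3, 3, 16, 16)),
    ("UpSampling2D", (28, 28, 16), 0),
    ("output Conv2D", (28, 28, 1), conv2d_params(3, 3, 16, 1)),
]

for name, shape, params in layers:
    print(f"{name:15s} {str(shape):14s} {params}")

total = sum(params for _, _, params in layers)
print("Total parameters:", total)  # 2625
```

The convolutional autoencoder needs only a few thousand parameters because each filter's weights are shared across every position in the image.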
Compile the autoencoder using a binary cross-entropy loss function and adadelta gradient descent:
```
autoencoder.compile(loss='binary_crossentropy',
                    optimizer='adadelta')
```
Now, let's fit the model; again, we pass the images as the training data and as the desired output. Train for 20 epochs as convolutional networks take a lot longer to compute:
```
autoencoder.fit(images, images, epochs=20)
```
The output is as follows:
Figure 5.44: Training the model
Calculate and store the output of the encoding stage for the first five samples:
```
encoder_output = Model(input_layer, encoded).predict(images[:5])
```
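Reshape the encoder output for visualization. The encoding stage produces a 14 x 14 x 16 array per sample, which is flattened here to 196 x 16 (14 x 14 = 196 positions by 16 filters):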
```
encoder_output = encoder_output.reshape((-1, 14 * 14, 16))
```
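
The reshape above displays all 16 filters side by side as a tall 196 x 16 strip. If a single 14 x 14 image per sample is preferred, one possible alternative (an assumption of this sketch, not a step in the original solution) is to average across the filter axis. The random `fake_encoded` array stands in for the real encoder output:

```python
import numpy as np

# Stand-in for the encoder output of 5 samples: shape (5, 14, 14, 16)
rng = np.random.default_rng(0)
fake_encoded = rng.random((5, 14, 14, 16))

# Reshape used in the activity: one 196 x 16 array per sample
as_in_activity = fake_encoded.reshape((-1, 14 * 14, 16))

# Alternative view: average over the 16 filters to get one 14 x 14 image per sample
mean_over_filters = fake_encoded.mean(axis=-1)

print(as_in_activity.shape)     # (5, 196, 16)
print(mean_over_filters.shape)  # (5, 14, 14)
```

Either array can be passed sample by sample to `plt.imshow`, since each entry is two-dimensional.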
Get the output of the decoder for the first five images:
```
decoder_output = autoencoder.predict(images[:5])
```

Reshape the decoder output to 28 x 28 in size:
```
decoder_output = decoder_output.reshape((-1, 28, 28))
```

Reshape the original images back to 28 x 28 in size:
```
images = images.reshape((-1, 28, 28))
```

Plot the original image, the encoder output, and the decoder output:
```
plt.figure(figsize=(10, 7))
for i in range(5):
    plt.subplot(3, 5, i + 1)
    plt.imshow(images[i], cmap='gray')
    plt.axis('off')

    plt.subplot(3, 5, i + 6)
    plt.imshow(encoder_output[i], cmap='gray')
    plt.axis('off')

    plt.subplot(3, 5, i + 11)
    plt.imshow(decoder_output[i], cmap='gray')
    plt.axis('off')
```

The output is as follows:

Figure 5.45: The original image, the encoder output, and the decoder

At the end of this activity, you will have developed an autoencoder comprising convolutional layers within the neural network. Note the improvements made in the decoder representations. This architecture has a significant performance benefit over fully connected neural network layers and is extremely useful for working with image-based datasets and generating artificial data samples.