[box type="note" align="" class="" width=""]This article has been extracted from the book Principles of Data Science, authored by Sinan Ozdemir. With a unique approach that bridges the gap between mathematics and computer science, the book takes you through the entire data science pipeline, beginning with cleaning and preparing data and moving on to effective data mining strategies and techniques that help you get to grips with machine learning.[/box]
In this article, we’re going to learn how to create a neural network whose goal will be to classify images.
TensorFlow is an open-source machine learning module that is used primarily for its simplified deep learning and neural network abilities. I would like to take some time to introduce the module and solve a few quick problems using it.
Let’s begin with some imports:
from sklearn import datasets, metrics
import tensorflow as tf
import numpy as np
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split
%matplotlib inline
Loading our iris dataset:
# Our data set of iris flowers
iris = datasets.load_iris()
# Load datasets and split them for training and testing
X_train, X_test, y_train, y_test = train_test_split(iris.data,
iris.target)
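With no extra arguments, train_test_split shuffles the rows and holds out a quarter of them for testing. The sketch below mimics that behaviour in plain Python to show the resulting sizes. Note that `simple_split` is a hypothetical helper written for illustration, not scikit-learn's implementation, and it assumes the default test fraction of 0.25, rounded up.

```python
import math
import random


def simple_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle paired rows/labels and hold out a fraction for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(len(rows) * test_fraction)  # size of the held-out set
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(seq, idx):
        return [seq[i] for i in idx]

    return (pick(rows, train_idx), pick(rows, test_idx),
            pick(labels, train_idx), pick(labels, test_idx))


# Stand-in data with the same number of rows as iris (150 flowers).
rows = [[float(i)] for i in range(150)]
labels = [i % 3 for i in range(150)]
X_tr, X_te, y_tr, y_te = simple_split(rows, labels)
print(len(X_tr), len(X_te))
```

Under these assumptions, 150 flowers split into 112 for training and 38 for testing.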
Creating the Neural Network:
# Specify that all features have real-value data
feature_columns = [tf.contrib.layers.real_valued_column("",
dimension=4)]
optimizer = tf.train.GradientDescentOptimizer(learning_rate=.1)
# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
hidden_units=[10, 20, 10],
optimizer=optimizer,
n_classes=3)
# Fit model.
classifier.fit(x=X_train,
y=y_train,
steps=2000)
Notice that our code really hasn't changed from the last segment. We still have our feature_columns from before, but now we introduce, instead of a linear classifier, a DNNClassifier, which stands for Deep Neural Network Classifier.
This is TensorFlow's syntax for implementing a neural network. Let's take a closer look:
tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
hidden_units=[10, 20, 10],
optimizer=optimizer,
n_classes=3)
We see that we are passing in the same feature_columns, n_classes, and optimizer, but notice the new parameter called hidden_units. This list gives the number of nodes in each layer between the input layer and the output layer. All in all, this neural network will have five layers: an input layer with 4 nodes (one per feature), three hidden layers with 10, 20, and 10 nodes, and an output layer with 3 nodes (one per class).
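The five layers follow directly from the arguments: the input size comes from dimension=4, the middle layers from hidden_units, and the output size from n_classes. As a rough sketch (the variable names here are illustrative and not part of TensorFlow), the layer sizes and the shape of the weight matrix between each pair of neighbouring layers can be listed like this, assuming every layer is fully connected to the next:

```python
n_features = 4             # matches dimension=4 in feature_columns
hidden_units = [10, 20, 10]
n_classes = 3

# Input layer, then the hidden layers, then the output layer.
layer_sizes = [n_features] + hidden_units + [n_classes]

# One (inputs, outputs) weight matrix between each pair of adjacent layers.
weight_shapes = list(zip(layer_sizes[:-1], layer_sizes[1:]))

print(layer_sizes)
print(weight_shapes)
```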
Now that we've trained our model, let's evaluate it on our test set:
# Evaluate accuracy.
accuracy_score = classifier.evaluate(x=X_test,
y=y_test)["accuracy"]
print('Accuracy: {0:f}'.format(accuracy_score))
Accuracy: 0.921053
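The reported accuracy is simply the fraction of test flowers that the network labelled correctly. Assuming the default split held out 38 of the 150 flowers (an assumption, since the split is random and its size is not printed here), a score of 0.921053 corresponds to 35 correct predictions. The labels below are made up purely to reproduce that ratio:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Hypothetical labels: 38 test flowers, 3 of them misclassified.
y_true = [0] * 13 + [1] * 13 + [2] * 12
y_pred = list(y_true)
y_pred[0], y_pred[13], y_pred[26] = 1, 2, 1

print('Accuracy: {0:f}'.format(accuracy(y_true, y_pred)))
```

In other words, under that assumption only three flowers were misclassified.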
Hmm, our neural network didn't do so well on this dataset, but perhaps it is because the network is a bit too complicated for such a simple dataset. Let's introduce a new dataset that has a bit more to it…
The MNIST dataset consists of over 50,000 handwritten digits (0-9), and the goal is to recognize each handwritten digit and output which digit was written. TensorFlow has a built-in mechanism for downloading and loading these images.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=False)
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
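Notice that one of our inputs for downloading mnist is called one_hot. This parameter controls whether the dataset's target variable (which is the digit itself) is brought in as a single number or as a set of dummy variables.
For example, if the first digit were a 7, the target would either be the single number 7 or the dummy-variable vector [0, 0, 0, 0, 0, 0, 0, 1, 0, 0], which has a 1 in position 7 and a 0 everywhere else.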
We will encode our target the former way, as this is what our tensorflow neural network and our sklearn logistic regression will expect.
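To make the two encodings concrete, here is a minimal sketch of the dummy-variable (one-hot) form and how to get back from it to the single number. The helper names `one_hot` and `from_one_hot` are made up for illustration and are not part of TensorFlow.

```python
def one_hot(digit, n_classes=10):
    """Return a list with a 1 at position `digit` and 0 everywhere else."""
    vec = [0] * n_classes
    vec[digit] = 1
    return vec


def from_one_hot(vec):
    """Recover the single-number label from a one-hot list."""
    return vec.index(1)


encoded = one_hot(7)
print(encoded)
print(from_one_hot(encoded))
```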
The dataset is split up already into a training and test set, so let's create new variables to hold them:
x_mnist = mnist.train.images
y_mnist = mnist.train.labels.astype(int)
For the y_mnist variable, I specifically cast every target as an integer (by default they come in as floats) because otherwise tensorflow would throw an error at us.
Out of curiosity, let's take a look at a single image:
import matplotlib.pyplot as plt
plt.imshow(x_mnist[10].reshape(28, 28))
And hopefully our target variable at index 10 matches as well:
y_mnist[10]
0
Excellent! Let's now take a peek at how big our dataset is:
x_mnist.shape
(55000, 784)
y_mnist.shape
(55000,)
Our training set, then, contains 55,000 images and 55,000 matching target values.
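Each row has 784 values because a 28 x 28 pixel image has been flattened into one long vector (28 * 28 = 784). The sketch below, in plain Python with a hypothetical helper `unflatten`, shows the row-by-row layout that the reshape(28, 28) call used earlier relies on: the pixel at (row, col) sits at position row * 28 + col of the flat vector.

```python
def unflatten(flat, n_rows=28, n_cols=28):
    """Cut a flat list of pixels into n_rows rows of n_cols pixels each."""
    if len(flat) != n_rows * n_cols:
        raise ValueError("expected %d values, got %d"
                         % (n_rows * n_cols, len(flat)))
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


# Stand-in "image" whose pixel values are just their flat positions.
flat_image = list(range(784))
image = unflatten(flat_image)

print(len(image), len(image[0]))
print(image[3][5] == flat_image[3 * 28 + 5])
```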
Let's fit a deep neural network to our images and see if it will be able to pick up on the patterns in our inputs:
# Specify that all features have real-value data
feature_columns = [tf.contrib.layers.real_valued_column("",
dimension=784)]
optimizer = tf.train.GradientDescentOptimizer(learning_rate=.1)
# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
hidden_units=[10, 20, 10],
optimizer=optimizer,
n_classes=10)
# Fit model.
classifier.fit(x=x_mnist,
y=y_mnist,
steps=1000)
# Warning this is veryyyyyyyy slow
This code is very similar to our previous segment using DNNClassifier; however, look how in our first line of code, I have changed the number of columns to be 784 while in the classifier itself, I changed the number of output classes to be 10. These are manual inputs that tensorflow must be given to work.
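The optimizer passed to the classifier, GradientDescentOptimizer(learning_rate=.1), repeatedly nudges the weights in the direction that lowers the training error. As a toy illustration (a single made-up weight and a made-up loss, not the network's actual loss), the same update rule with the same learning rate looks like this:

```python
def loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


learning_rate = 0.1
w = 0.0                      # arbitrary starting weight
history = []
for step in range(50):
    w = w - learning_rate * gradient(w)   # move against the gradient
    history.append(loss(w))

print(round(w, 3), history[0], history[-1])
```

Each training step performs the same kind of small update, only over all of the network's weights at once.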
Fitting the network runs very slowly, because on every step it adjusts its weights little by little in order to get the best possible performance on our training set. Of course, we know that the ultimate test here is testing our network on an unknown test set, which is also given to us from tensorflow:
x_mnist_test = mnist.test.images
y_mnist_test = mnist.test.labels.astype(int)
x_mnist_test.shape
(10000, 784)
y_mnist_test.shape
(10000,)
So we have 10,000 images to test on; let's see how our network was able to adapt to the dataset:
# Evaluate accuracy.
accuracy_score = classifier.evaluate(x=x_mnist_test,
y=y_mnist_test)["accuracy"]
print('Accuracy: {0:f}'.format(accuracy_score))
Accuracy: 0.920600
Not bad, 92% accuracy on our dataset. Let's take a second and compare this performance to a standard sklearn logistic regression now:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
logreg = LogisticRegression()
logreg.fit(x_mnist, y_mnist)
# Warning this is slow
# predict on the held-out test set, so the score is not inflated by overfitting
y_predicted = logreg.predict(x_mnist_test)
# get our accuracy score
accuracy = accuracy_score(y_mnist_test, y_predicted)
accuracy
0.91969
Success! Our neural network performed slightly better than the standard logistic regression (about 92.06% versus 91.97%). This is likely because the network is attempting to find relationships between the pixels themselves and using these relationships to map them to what digit we are writing down. Logistic regression, by contrast, weighs each input separately in a single linear combination, and therefore has a tough time capturing relationships between inputs.
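A classic toy case of such a relationship is XOR, where the label depends on how two inputs combine rather than on either one alone. A single weighted sum followed by a threshold cannot reproduce it, while one hidden layer can. The sketch below wires up a tiny network by hand (the weights are picked manually for illustration, not learned, and a step activation stands in for whatever activation the classifier actually uses), then searches a small grid of weights to show that no single linear unit on that grid gets all four cases right:

```python
import itertools


def step(z):
    """Threshold activation: fire only when the input is positive."""
    return 1 if z > 0 else 0


def tiny_network(x1, x2):
    """Two hidden units (OR and AND) combined into XOR."""
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    return step(h_or - h_and - 0.5)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
network_outputs = [tiny_network(x1, x2) for x1, x2 in inputs]

# Try every single linear unit with weights and bias in {-4.0, -3.5, ..., 4.0}.
grid = [i / 2 for i in range(-8, 9)]
linear_solves_xor = any(
    all(step(w1 * x1 + w2 * x2 + b) == (x1 ^ x2) for x1, x2 in inputs)
    for w1, w2, b in itertools.product(grid, repeat=3)
)

print(network_outputs, linear_solves_xor)
```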
There are ways of making our neural network learn differently:
# A wider network
feature_columns = [tf.contrib.layers.real_valued_column("",
dimension=784)]
optimizer = tf.train.GradientDescentOptimizer(learning_rate=.1)
# Build a DNN with a single, wide hidden layer of 1500 units.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
hidden_units=[1500],
optimizer=optimizer,
n_classes=10)
# Fit model.
classifier.fit(x=x_mnist,
y=y_mnist,
steps=100)
# Warning this is veryyyyyyyy slow
# Evaluate accuracy.
accuracy_score = classifier.evaluate(x=x_mnist_test,
y=y_mnist_test)["accuracy"]
print('Accuracy: {0:f}'.format(accuracy_score))
Accuracy: 0.898400
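One way to see how different the two MNIST architectures are is to count their trainable weights. The sketch below assumes plain fully connected layers with one bias per node (an assumption about the classifier's internals) and uses a hypothetical helper, `count_parameters`:

```python
def count_parameters(n_inputs, hidden_units, n_classes):
    """Weights plus biases for a stack of fully connected layers."""
    sizes = [n_inputs] + list(hidden_units) + [n_classes]
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(sizes[:-1], sizes[1:]))


narrow = count_parameters(784, [10, 20, 10], 10)   # the first MNIST network
wide = count_parameters(784, [1500], 10)           # the wider network

print(narrow, wide)
```

Under these assumptions the wider network has over a hundred times as many weights to adjust, yet it was given only 100 training steps, which may be part of why its accuracy is lower here.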
There you go! You’ve now learned how to build a neural net in TensorFlow! If you liked this tutorial and would like to learn more, head over and grab a copy of Principles of Data Science.
If you want to take things a bit further and learn how to classify Irises using multi-layer perceptrons, head over here.