You're reading from Deep Learning with Theano Perform large-scale numerical and scientific computations efficiently

Product type Paperback

Published in Jul 2017

Publisher Packt

ISBN-13 9781786465825

Length 300 pages

Edition 1st Edition

Tools

Theano

Concepts

Deep Learning

Author (1):

Christopher Bourez

View More author details

Table of Contents (15) Chapters

Preface

1. Theano Basics

2. Classifying Handwritten Digits with a Feedforward Network FREE CHAPTER

3. Encoding Word into Vector

4. Generating Text with a Recurrent Neural Net

5. Analyzing Sentiment with a Bidirectional LSTM

6. Locating with Spatial Transformer Networks

7. Classifying Images with Residual Networks

8. Translating and Explaining with Encoding – decoding Networks

9. Selecting Relevant Inputs or Memories with the Mechanism of Attention

10. Predicting Times Sequences with Advanced RNN

11. Learning from the Environment with Reinforcement

12. Learning Features with Unsupervised Generative Networks

13. Extending Deep Learning with Theano

Index

Graphs and symbolic computing

Let's take back the simple addition example and present different ways to display the same information:

>>> x = T.matrix('x')

>>> y = T.matrix('y')

>>> z = x + y

>>> z

Elemwise{add,no_inplace}.0

>>> theano.pp(z)

'(x + y)

>>> theano.printing.pprint(z)

'(x + y)'

>>> theano.printing.debugprint(z)
Elemwise{add,no_inplace} [id A] ''   
 |x [id B]
 |y [id C]

Here, the debugprint function prints the pre-compilation graph, the unoptimized graph. In this case, it is composed of two variable nodes, x and y, and an apply node, the elementwise addition, with the no_inplace option. The inplace option will be used in the optimized graph to save memory and re-use the memory of the input to store the result of the operation.

If the graphviz and pydot libraries have been installed, the pydotprint command outputs a PNG image of the graph:

>>> theano.printing.pydotprint(z)
The output file is available at ~/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/theano.pydotprint.gpu.png.

You might have noticed that the z.eval command takes while to execute the first time. The reason for this delay is the time required to optimize the mathematical expression and compile the code for the CPU or GPU before being evaluated.

The compiled expression can be obtained explicitly and used as a function that behaves as a traditional Python function:

>>> addition = theano.function([x, y], [z])

>>> addition([[1, 2], [1, 3]], [[1, 0], [3, 4]])
[array([[ 2.,  2.],
       [ 4.,  7.]], dtype=float32)]

The first argument in the function creation is a list of variables representing the input nodes of the graph. The second argument is the array of output variables. To print the post compilation graph, use this command:

>>> theano.printing.debugprint(addition)
HostFromGpu(gpuarray) [id A] ''   3
 |GpuElemwise{Add}[(0, 0)]<gpuarray> [id B] ''   2
   |GpuFromHost<None> [id C] ''   1
   | |x [id D]
   |GpuFromHost<None> [id E] ''   0
     |y [id F]

>>> theano.printing.pydotprint(addition)

The output file is available at ~/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/theano.pydotprint.gpu.png:

This case has been printed while using the GPU. During compilation, each operation has chosen the available GPU implementation. The main program still runs on CPU, where the data resides, but a GpuFromHost instruction performs a data transfer from the CPU to the GPU for input, while the opposite operation, HostFromGpu, fetches the result for the main program to display it:

Theano performs some mathematical optimizations, such as grouping elementwise operations, adding a new value to the previous addition:

>>> z= z * x

>>> theano.printing.debugprint(theano.function([x,y],z))
HostFromGpu(gpuarray) [id A] ''   3
 |GpuElemwise{Composite{((i0 + i1) * i0)}}[(0, 0)]<gpuarray> [id B] ''   2
   |GpuFromHost<None> [id C] ''   1
   | |x [id D]
   |GpuFromHost<None> [id E] ''   0
     |y [id F]

The number of nodes in the graph has not increased: two additions have been merged into one node. Such optimizations make it more tricky to debug, so we'll show you at the end of this chapter how to disable optimizations for debugging.

Lastly, let's see a bit more about setting the initial value with NumPy:

>>> theano.config.floatX
'float32'

>>> x = T.matrix()

>>> x
<TensorType(float32, matrix)>

>>> y = T.matrix()

>>> addition = theano.function([x, y], [x+y])

>>> addition(numpy.ones((2,2)),numpy.zeros((2,2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/theano/compile/function_module.py", line 786, in __call__
    allow_downcast=s.allow_downcast)

  File "/usr/local/lib/python2.7/site-packages/theano/tensor/type.py", line 139, in filter
    raise TypeError(err_msg, data)
TypeError: ('Bad input argument to theano function with name "<stdin>:1"  at index 0(0-based)', 'TensorType(float32, matrix) cannot store a value of dtype float64 without risking loss of precision. If you do not mind this loss, you can: 1) explicitly cast your data to float32, or 2) set "allow_input_downcast=True" when calling "function".', array([[ 1.,  1.],
       [ 1.,  1.]]))

Executing the function on the NumPy arrays throws an error related to loss of precision, since the NumPy arrays here have float64 and int64 dtypes, but x and y are float32. There are multiple solutions to this; the first is to create the NumPy arrays with the right dtype:

>>> import numpy

>>> addition(numpy.ones((2,2), dtype=theano.config.floatX),numpy.zeros((2,2), dtype=theano.config.floatX))
[array([[ 1.,  1.],
        [ 1.,  1.]], dtype=float32)]

Alternatively, cast the NumPy arrays (in particular for numpy.diag, which does not allow us to choose the dtype directly):

>>> addition(numpy.ones((2,2)).astype(theano.config.floatX),numpy.diag((2,3)).astype(theano.config.floatX))
[array([[ 3.,  1.],
        [ 1.,  4.]], dtype=float32)]

Or we could allow downcasting:

>>> addition = theano.function([x, y], [x+y],allow_input_downcast=True)

>>> addition(numpy.ones((2,2)),numpy.zeros((2,2)))
[array([[ 1.,  1.],
        [ 1.,  1.]], dtype=float32)]