Graphs and symbolic computing
Let's return to the simple addition example and look at different ways to display the same information:
>>> import theano
>>> import theano.tensor as T
>>> x = T.matrix('x')
>>> y = T.matrix('y')
>>> z = x + y
>>> z
Elemwise{add,no_inplace}.0
>>> theano.pp(z)
'(x + y)'
>>> theano.printing.pprint(z)
'(x + y)'
>>> theano.printing.debugprint(z)
Elemwise{add,no_inplace} [id A] ''
 |x [id B]
 |y [id C]
Here, the debugprint function prints the pre-compilation graph, that is, the unoptimized graph. In this case, it is composed of two variable nodes, x and y, and an apply node, the elementwise addition, with the no_inplace option. The inplace option will be used in the optimized graph to save memory by reusing the memory of the input to store the result of the operation.
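To make the idea of variable nodes and apply nodes more concrete, here is a minimal sketch in plain Python. It is not Theano's implementation: the Node, Variable, and Apply classes and their methods are hypothetical names chosen for illustration, and only addition is supported. It shows that writing x + y records a graph rather than computing a value, and that the graph can then be printed or evaluated later.

```python
# Toy illustration only: these classes are hypothetical, not Theano's API.

class Node:
    """Base class: adding two nodes builds a new graph node."""

    def __add__(self, other):
        # No arithmetic happens here; we only record the operation.
        return Apply("add", [self, other])


class Variable(Node):
    """A named symbolic input that holds no value until evaluation."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, env):
        # Look up the concrete value supplied for this variable.
        return env[self.name]

    def tree(self, depth=0):
        return [" " * depth + "|" + self.name]


class Apply(Node):
    """An operation applied to a list of input nodes."""

    def __init__(self, op, inputs):
        self.op = op
        self.inputs = inputs

    def evaluate(self, env):
        # Only "add" exists in this sketch.
        left, right = (node.evaluate(env) for node in self.inputs)
        return left + right

    def tree(self, depth=0):
        prefix = " " * depth + ("|" if depth else "")
        lines = [prefix + self.op]
        for node in self.inputs:
            lines.extend(node.tree(depth + 1))
        return lines


x = Variable("x")
y = Variable("y")
z = x + y                            # builds the graph, computes nothing
print("\n".join(z.tree()))           # one apply node with two variable children
print(z.evaluate({"x": 1, "y": 2}))  # values are supplied only at evaluation time
```

Keeping the graph separate from the values is what lets a library inspect and rewrite an expression before running it.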
If the graphviz and pydot libraries have been installed, the pydotprint command outputs a PNG image of the graph:
>>> theano.printing.pydotprint(z)
The output file is available at ~/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/theano.pydotprint.gpu.png.
You might have noticed that the first call to z.eval takes a while to execute. This delay is the time required to optimize the mathematical expression and compile the code for the CPU or GPU before it is evaluated.
The compiled expression can be obtained explicitly and used like an ordinary Python function:
>>> addition = theano.function([x, y], [z])
>>> addition([[1, 2], [1, 3]], [[1, 0], [3, 4]])
[array([[ 2., 2.],
       [ 4., 7.]], dtype=float32)]
The first argument of theano.function is a list of variables representing the input nodes of the graph. The second argument is the list of output variables. To print the post-compilation graph, use this command:
>>> theano.printing.debugprint(addition)
HostFromGpu(gpuarray) [id A] ''   3
 |GpuElemwise{Add}[(0, 0)]<gpuarray> [id B] ''   2
   |GpuFromHost<None> [id C] ''   1
   | |x [id D]
   |GpuFromHost<None> [id E] ''   0
     |y [id F]
>>> theano.printing.pydotprint(addition)
The output file is available at ~/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/theano.pydotprint.gpu.png
This graph was printed while using the GPU. During compilation, each operation was replaced by its GPU implementation where one is available. The main program still runs on the CPU, where the data resides, but a GpuFromHost instruction transfers the input data from the CPU to the GPU, while the opposite operation, HostFromGpu, fetches the result so that the main program can display it.
Theano performs some mathematical optimizations, such as grouping elementwise operations. To see this, let's multiply the result of the previous addition by x:
>>> z = z * x
>>> theano.printing.debugprint(theano.function([x, y], z))
HostFromGpu(gpuarray) [id A] ''   3
 |GpuElemwise{Composite{((i0 + i1) * i0)}}[(0, 0)]<gpuarray> [id B] ''   2
   |GpuFromHost<None> [id C] ''   1
   | |x [id D]
   |GpuFromHost<None> [id E] ''   0
     |y [id F]
The number of nodes in the graph has not increased: the addition and the multiplication have been merged into one composite elementwise node. Such optimizations make debugging trickier, so we'll show you at the end of this chapter how to disable optimizations for debugging.
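The effect of merging elementwise operations can be sketched in plain Python, with lists standing in for tensors. This is only an analogy under that assumption, not how Theano generates code, and both function names are hypothetical. The unfused version makes two passes and stores an intermediate result, while the fused version computes (i0 + i1) * i0 for each element in a single pass.

```python
# Toy illustration only: plain Python lists stand in for tensors.

def add_then_multiply_unfused(x, y):
    """Two elementwise passes with an intermediate list."""
    tmp = [a + b for a, b in zip(x, y)]        # first pass: x + y
    return [t * a for t, a in zip(tmp, x)]     # second pass: tmp * x


def add_then_multiply_fused(x, y):
    """One elementwise pass computing (i0 + i1) * i0 per element."""
    return [(i0 + i1) * i0 for i0, i1 in zip(x, y)]


x_values = [1.0, 2.0, 3.0]
y_values = [4.0, 5.0, 6.0]

# Both versions give the same result; the fused one avoids the temporary list.
print(add_then_multiply_unfused(x_values, y_values))
print(add_then_multiply_fused(x_values, y_values))
```

Because the fused form no longer has a separate addition step, there is no intermediate value to inspect, which is one reason optimized graphs are harder to debug.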
Lastly, let's see a bit more about setting the initial value with NumPy:
>>> import numpy
>>> theano.config.floatX
'float32'
>>> x = T.matrix()
>>> x
<TensorType(float32, matrix)>
>>> y = T.matrix()
>>> addition = theano.function([x, y], [x+y])
>>> addition(numpy.ones((2,2)),numpy.zeros((2,2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/theano/compile/function_module.py", line 786, in __call__
    allow_downcast=s.allow_downcast)
  File "/usr/local/lib/python2.7/site-packages/theano/tensor/type.py", line 139, in filter
    raise TypeError(err_msg, data)
TypeError: ('Bad input argument to theano function with name "<stdin>:1" at index 0(0-based)', 'TensorType(float32, matrix) cannot store a value of dtype float64 without risking loss of precision. If you do not mind this loss, you can: 1) explicitly cast your data to float32, or 2) set "allow_input_downcast=True" when calling "function".', array([[ 1., 1.], [ 1., 1.]]))
Executing the function on the NumPy arrays throws an error related to loss of precision, since NumPy creates float64 arrays by default (or int64 arrays from integer data), whereas x and y are float32. There are multiple solutions to this; the first is to create the NumPy arrays with the right dtype:
>>> import numpy
>>> addition(numpy.ones((2,2), dtype=theano.config.floatX),numpy.zeros((2,2), dtype=theano.config.floatX))
[array([[ 1., 1.],
       [ 1., 1.]], dtype=float32)]
Alternatively, cast the NumPy arrays (in particular for numpy.diag, which does not allow us to choose the dtype directly):
>>> addition(numpy.ones((2,2)).astype(theano.config.floatX),numpy.diag((2,3)).astype(theano.config.floatX))
[array([[ 3., 1.],
       [ 1., 4.]], dtype=float32)]
Or we could allow downcasting:
>>> addition = theano.function([x, y], [x+y],allow_input_downcast=True)
>>> addition(numpy.ones((2,2)),numpy.zeros((2,2)))
[array([[ 1., 1.],
       [ 1., 1.]], dtype=float32)]
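The loss of precision that the error message warns about can be seen without Theano or NumPy. The sketch below uses only Python's standard struct module, and to_float32 is a hypothetical helper name. It rounds a 64-bit Python float to 32 bits and back: a value such as 1.0 survives unchanged, while 0.1 does not. The error is about the risk of such a loss for a float64 input in general, even though the ones and zeros used above would themselves be stored exactly.

```python
import struct


def to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    # Packing with the "f" format stores the value in 4 bytes (float32);
    # unpacking converts those 4 bytes back to a Python float.
    return struct.unpack("f", struct.pack("f", value))[0]


exact = 1.0     # exactly representable in both widths
inexact = 0.1   # has no exact binary representation

print(to_float32(exact) == exact)          # the value is unchanged
print(to_float32(inexact) == inexact)      # the value has been rounded
print(abs(to_float32(inexact) - inexact))  # size of the rounding error
```

Allowing downcasting accepts this rounding silently, so casting explicitly is the safer choice when the input precision matters.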