Theano Op in Python for the GPU
Let's take a look at what happens when we run this operator in a graph with Theano
configured to use the GPU:
>>> y = mult4plus5op(2 * x) + 4 * x
>>> f = theano.function([x], y)
>>> theano.printing.debugprint(f)
HostFromGpu(gpuarray) [id A] ''   6
 |GpuElemwise{Composite{(i0 + (i1 * i2))}}[(0, 0)]<gpuarray> [id B] ''   5
   |GpuFromHost<None> [id C] ''   4
   | |AXPBOp{a=4, b=5} [id D] ''   3
   |   |HostFromGpu(gpuarray) [id E] ''   2
   |     |GpuElemwise{mul,no_inplace} [id F] ''   1
   |       |GpuArrayConstant{[[ 2.]]} [id G]
   |       |GpuFromHost<None> [id H] ''   0
   |         |<TensorType(float32, matrix)> [id I]
   |GpuArrayConstant{[[ 4.]]} [id J]
   |GpuFromHost<None> [id H] ''   0
Since we have defined only a CPU implementation of the new operator in Python, while the rest of the graph runs on the GPU, the data is transferred...
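To make the round trip visible, here is a minimal sketch that models the graph printed above as plain Python dictionaries and counts its transfer nodes. The node() helper, the dictionary layout, and count_transfers() are hypothetical names invented for this illustration. They are not part of Theano and do not reflect its internal graph representation. Op names are abbreviated from the debugprint output.

```python
from collections import Counter

def node(op, *inputs):
    # Hypothetical stand-in for a graph node: an op name plus its inputs.
    return {"op": op, "inputs": list(inputs)}

# Rebuild the printed graph bottom-up; the ids refer to the debugprint output.
x = node("<TensorType(float32, matrix)>")                        # id I
h = node("GpuFromHost", x)                                       # id H
f = node("GpuElemwise{mul}", node("GpuArrayConstant{2.}"), h)    # id F
e = node("HostFromGpu", f)                                       # id E
d = node("AXPBOp{a=4, b=5}", e)                                  # id D (CPU only)
c = node("GpuFromHost", d)                                       # id C
b = node("GpuElemwise{Composite}", c,
         node("GpuArrayConstant{4.}"), h)                        # id B
a = node("HostFromGpu", b)                                       # id A

TRANSFER_OPS = ("HostFromGpu", "GpuFromHost")

def count_transfers(root):
    # Walk the graph once, visiting shared nodes (such as id H) only one time.
    counts = Counter()
    seen = set()
    stack = [root]
    while stack:
        current = stack.pop()
        if id(current) in seen:
            continue
        seen.add(id(current))
        if current["op"] in TRANSFER_OPS:
            counts[current["op"]] += 1
        stack.extend(current["inputs"])
    return counts

transfer_counts = count_transfers(a)
print(dict(transfer_counts))
```

In this model, AXPBOp (id D) takes its input from a HostFromGpu node (id E) and feeds a GpuFromHost node (id C). Those two transfers appear only because the op has no GPU implementation. The other two (id H and id A) move the function's input to the device and its output back to the host.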