Theano Op in C for CPU
Another inefficiency arises from the fact that the Python implementation of an operator adds significant overhead each time the computation is performed, that is, for each instance of the operator in the graph. Unlike the rest of the graph, the Python code is not compiled to C by Theano, and the overhead occurs when the C implementation is wrapped in Python and data is exchanged.
To remedy this, you can write C code directly; it is incorporated into the code generated for the rest of the graph and compiled together with it.
When implementing an operator directly in C, NumPy is the underlying library to manage arrays, with the NumPy C-API extending the Python C-API. The Python class defining the new C operator does not have to implement the perform() method; instead, it returns the C code to incorporate from the c_code(), c_support_code(), and c_support_code_apply() methods:
def c_code_cache_version(self):
    return (6, 0)

def c_support_code(self):
    c_support_code = ""...
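As a rough illustration of the shape of these methods, the following sketch builds the C snippets for a hypothetical operator that doubles a float64 vector. It is not the operator from this section: the class name ScaleSketch, the helper scale_value, and the per-node constant factor_<name> are made up for the example. The class deliberately does not derive from Theano's Op, so the snippet runs without Theano installed and only shows how the C code is assembled as Python strings. The method signatures are assumed to follow Theano's usual convention for C operators; a real operator would also derive from Op and define make_node().

```python
class ScaleSketch(object):
    """Hypothetical stand-in for an Op subclass; only the C hooks are shown."""

    def c_code_cache_version(self):
        # Change this tuple whenever the generated C code changes,
        # so that a previously cached compiled module is not reused.
        return (1, 0)

    def c_support_code(self):
        # Helper code shared by every instance of the operator.
        return """
        static double scale_value(double v, double factor) {
            return v * factor;
        }
        """

    def c_support_code_apply(self, node, name):
        # Helper code specific to one node of the graph;
        # `name` is assumed to be unique per node.
        return """
        static const double factor_%(name)s = 2.0;
        """ % dict(name=name)

    def c_code(self, node, name, inputs, outputs, sub):
        # `inputs` and `outputs` hold the names of the C variables
        # (PyArrayObject pointers); sub['fail'] is the error-exit code.
        x, = inputs
        z, = outputs
        return """
        // Allocate an output with the same length as the input.
        Py_XDECREF(%(z)s);
        %(z)s = (PyArrayObject*)PyArray_EMPTY(
            1, PyArray_DIMS(%(x)s), NPY_DOUBLE, 0);
        if (!%(z)s) {
            %(fail)s;
        }
        for (npy_intp i = 0; i < PyArray_DIMS(%(x)s)[0]; ++i) {
            double v = *(double*)PyArray_GETPTR1(%(x)s, i);
            *(double*)PyArray_GETPTR1(%(z)s, i) =
                scale_value(v, factor_%(name)s);
        }
        """ % dict(x=x, z=z, name=name, fail=sub['fail'])


# Render the C body with placeholder variable names to inspect it.
sketch = ScaleSketch()
print(sketch.c_code(None, "node0", ["x_in"], ["z_out"], {"fail": "goto fail"}))
```

The placeholder names passed in the last line are arbitrary; in practice the framework supplies the variable names and the failure code when it generates the module.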