In this article by Philip Herron, the author of the book Learning Cython Programming - Second Edition, we see how Cython is much more than just a programming language. Its origin can be traced to Sage, the mathematics software package, where it was used to increase the performance of mathematical computations, such as those involving matrices. More generally, I tend to consider Cython as an alternative to SWIG for generating really good Python bindings to native code.
Language bindings have been around for years, and SWIG was one of the first and best tools to generate bindings for a multitude of languages. Cython generates bindings for Python only, and this single-purpose approach means it generates the best Python bindings you can get short of writing them all by hand; attempt the latter only if you're a Python core developer.
For me, taking control of legacy software by generating language bindings is a great way to reuse any software package. Consider a legacy application written in C/C++; adding advanced modern features like a web server for a dashboard or a message bus is not a trivial thing to do. More importantly, Python comes with thousands of packages that have been developed, tested, and used by people for a long time, and can do exactly that. Wouldn't it be great to take advantage of all of this code? With Cython, we can do exactly this, and I will demonstrate approaches with plenty of example code along the way.
This article is dedicated to the core concepts of using Cython, including compilation, and is meant to serve as both an introduction and a solid reference.
In this article, we will cover:
Since Cython is a programming language, we must install its compiler, which is aptly also named Cython.
There are many different ways to install Cython. The preferred one would be to use pip:
$ pip install Cython
This should work on both Linux and Mac. Alternatively, you can use your Linux distribution's package manager to install Cython:
$ yum install cython # will work on Fedora and CentOS
$ apt-get install cython # will work on Debian-based systems
On Windows, although a plethora of options is available, following this wiki is the safest way to stay up to date:
http://wiki.cython.org/InstallingOnWindows
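Whichever route you take, it can help to confirm that the Python interpreter you plan to use actually sees the installation. The following is a minimal sketch, not part of the book's code; the helper name cython_version is hypothetical, and it assumes Python 3.8 or later (for importlib.metadata) and that your installer registered package metadata, which pip does:

```python
from importlib import metadata


def cython_version():
    """Return the installed Cython version string, or None if it is not found.

    Hypothetical helper for illustration; it only looks at package metadata
    and does not try to run the compiler.
    """
    try:
        return metadata.version("Cython")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = cython_version()
    if version is None:
        print("Cython is not installed for this interpreter")
    else:
        print("Cython", version, "is installed")
```

If this reports that Cython is missing even though the install command succeeded, the package was most likely installed for a different Python interpreter.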
There is an emacs mode available for Cython. Although the syntax is nearly the same as Python's, there are differences that cause problems if you simply use Python mode. You can grab cython-mode.el from the Cython source code (inside the Tools directory). The preferred way of installing packages in emacs is to use a package repository such as MELPA.
To add the package repository to emacs, open your ~/.emacs configuration file and add the following code:
(when (>= emacs-major-version 24)
  (require 'package)
  (add-to-list
   'package-archives
   '("melpa" . "http://melpa.org/packages/")
   t)
  (package-initialize))
Once you have added this and reloaded your configuration, you can install the Cython mode by running the following:
'M-x package-install RET cython-mode'
Once this is installed, you can activate the mode by adding this into your emacs config file:
(require 'cython-mode)
You can always activate the mode manually at any time with the following:
'M-x cython-mode RET'
Throughout this book, I intend to show real examples that are easy to digest to help you get a feel for the different things you can achieve with Cython. To access and download the code used, please clone the following repository:
$ git clone git://github.com/redbrain/cython-book.git
As you will see when running the Hello World program, Cython generates native Python modules. Therefore, to run any Cython code, you reference it via a module import in Python. Let's build the module:
$ cd cython-book/chapter1/helloworld
$ make
You should now have created helloworld.so! This is a Cython module with the same name as the Cython source code file. While in the same directory as the shared object module, you can invoke this code by running the corresponding Python import:
$ python
Python 2.7.3 (default, Aug 1 2012, 05:16:07)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import helloworld
Hello World from cython!
As you can see from opening helloworld.pyx, it looks just like a normal Python Hello World application; but as previously stated, Cython generates modules. These modules need a name so that they can be correctly imported by the Python runtime. The Cython compiler simply uses the name of the source code file. It then requires us to compile this to a shared object of the same name.
Overall, Cython source code files have the .pyx, .pxd, and .pxi extensions. For now, all we care about are the .pyx files; the others are for cimports and includes respectively within a .pyx module file.
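The naming rule can be sketched in a few lines of Python. This is only an illustration, assuming the standard CPython import machinery; the helper names module_name and importable_filenames are hypothetical. It derives the module name from a .pyx path and lists the file names the interpreter would accept for a compiled extension of that name:

```python
import os
from importlib.machinery import EXTENSION_SUFFIXES


def module_name(pyx_path):
    """Return the module name Cython derives from a source file path."""
    return os.path.splitext(os.path.basename(pyx_path))[0]


def importable_filenames(pyx_path):
    """List the shared-object file names this interpreter would import.

    EXTENSION_SUFFIXES holds the suffixes the running interpreter accepts
    for extension modules; the exact entries vary by platform and version.
    """
    name = module_name(pyx_path)
    return [name + suffix for suffix in EXTENSION_SUFFIXES]


if __name__ == "__main__":
    for filename in importable_filenames("helloworld.pyx"):
        print(filename)
```

If the shared object is built under any other base name, the import fails, because the module's initialization function is named after the source file.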
The following screenshot depicts the compilation flow required to have a callable native python module:
I wrote a basic makefile so that you can simply run make to compile these examples. Here's the code to do this manually:
$ cython helloworld.pyx
$ gcc -g -O2 -fpic `python-config --cflags` -c helloworld.c -o helloworld.o # clang accepts the same flags
$ gcc -shared -o helloworld.so helloworld.o `python-config --libs`
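To make the three stages explicit, here is a small Python sketch that assembles the same command lines without running them. It is an assumption-laden illustration, not the book's makefile: the function name build_commands is hypothetical, the include directory comes from sysconfig rather than python-config, and the link step leaves out whatever libraries python-config --libs would print on your system:

```python
import sysconfig


def build_commands(stem, cc="gcc"):
    """Return the translate, compile, and link commands for stem.pyx.

    Hypothetical helper: the commands are returned as argument lists and
    are not executed. Add the output of `python-config --libs` to the
    last command if your platform needs it.
    """
    include_dir = sysconfig.get_paths()["include"]  # where Python.h lives
    return [
        # 1. Cython translates the .pyx source into C
        ["cython", stem + ".pyx"],
        # 2. The C compiler builds position-independent object code
        [cc, "-g", "-O2", "-fpic", "-I" + include_dir,
         "-c", stem + ".c", "-o", stem + ".o"],
        # 3. The object file is linked into a shared object
        [cc, "-shared", "-o", stem + ".so", stem + ".o"],
    ]


if __name__ == "__main__":
    for command in build_commands("helloworld"):
        print(" ".join(command))
```

Each list could be passed to subprocess.run to perform the build, but printing the commands first makes it easy to compare them with what the makefile does.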
You can compile this using Python distutils and cythonize. Open setup.py:
from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize("helloworld.pyx")
)
Using the cythonize function as part of the ext_modules section will build any specified Cython source into an installable Python module. This will compile helloworld.pyx into the same shared library. This is the usual Python practice for distributing native modules with distutils.
We should be careful when talking about Python and Cython for clarity, since the syntax is so similar. Let's wrap a simple AddFunction in C and make it callable from Python.
Firstly, open a file called AddFunction.c, and write a simple function into it:
#include <stdio.h>

int AddFunction(int a, int b) {
    printf("look we are within your c code!\n");
    return a + b;
}
This is the C code we will call, which is just a simple function to add two integers. Now, let's get Python to call it. Open a file called AddFunction.h, wherein we will declare our prototype:
#ifndef __ADDFUNCTION_H__
#define __ADDFUNCTION_H__
extern int AddFunction (int, int);
#endif //__ADDFUNCTION_H__
We need this so that Cython can see the prototype for the function we want to call. In practice, you will already have your headers in your own project with your prototypes and declarations already available.
Open a file called PyAddFunction.pyx, and insert the following code into it:
cdef extern from "AddFunction.h":
    cdef int AddFunction(int, int)
Here, we have to declare the code we want to call. The cdef keyword signifies that this declaration refers to C code that will be linked in. Now, we need a Python entry point:
def Add(a, b):
    return AddFunction(a, b)
This Add is a Python callable inside a PyAddFunction module. Again, I have provided a handy makefile to produce the module:
$ cd cython-book/chapter1/ownmodule
$ make
cython -2 PyAddFunction.pyx
gcc -g -O2 -fpic -c PyAddFunction.c -o PyAddFunction.o `python-config --includes`
gcc -g -O2 -fpic -c AddFunction.c -o AddFunction.o
gcc -g -O2 -shared -o PyAddFunction.so AddFunction.o PyAddFunction.o `python-config --libs`
Notice that AddFunction.c is compiled into the same PyAddFunction.so shared object. Now, let's call this AddFunction and check to see if C can add numbers correctly:
$ python
>>> from PyAddFunction import Add
>>> Add(1,2)
look we are within your c code!
3
Notice the output of the printf call inside AddFunction.c::AddFunction and that the final result is printed correctly. Therefore, we know that control reached the C code and that the calculation was done in C and not inside the Python runtime. This hints at what is possible. Python is often criticized for being slow in some circumstances. With this technique, Python code can hand work to native code that runs outside the Python runtime, in an unsafe context without the runtime's checks, which is much faster.
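From the caller's side, the compiled module behaves like any other Python module, so ordinary import patterns apply. As a hedged sketch (this pattern is not from the book's repository), the following tries the native PyAddFunction module first and falls back to a plain Python definition when the shared object has not been built:

```python
try:
    # Native version: available only after the module has been compiled
    # and PyAddFunction.so is on the import path.
    from PyAddFunction import Add
    USING_NATIVE = True
except ImportError:
    USING_NATIVE = False

    def Add(a, b):
        """Pure Python stand-in with the same signature as the native Add."""
        return a + b


if __name__ == "__main__":
    print("native module loaded:", USING_NATIVE)
    print(Add(1, 2))
```

The fallback keeps scripts runnable on machines without a compiler, at the cost of the speed the native version provides.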
Notice that we had to declare a prototype inside the Cython source code PyAddFunction.pyx:
cdef extern from "AddFunction.h":
    cdef int AddFunction(int, int)
This lets the compiler know that there is a function called AddFunction, and that it takes two ints and returns an int. This is all the information the compiler needs, besides the host and target operating system's calling convention, in order to call this function safely. Then, we created the Python entry point, which is a Python callable that takes two parameters:
def Add(a, b):
    return AddFunction(a, b)
Inside this entry point, we simply return the result of calling the native AddFunction, passing the two Python objects as parameters. This is what makes Cython so powerful. Here, the Cython compiler must inspect the function call and generate code to safely convert these Python objects to native C integers. This conversion becomes difficult once precision and potential overflow are taken into account, and it is a major reason to use Cython, since it handles all of this so well. Also, remember that this function returns an integer, so Cython also generates code to convert the C integer return value into a valid Python object.
Overall, we installed the Cython compiler, ran the Hello World example, and took into consideration that we need to compile all code into native shared objects. We also saw how to wrap native C code to be callable from Python, and how type conversion works for parameters passed into C code and for return values coming back to Python.
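The kind of check this conversion involves can be modeled in plain Python. The sketch below is only an approximation of the idea, not the C code Cython actually emits: the helper name to_c_int is hypothetical, it uses operator.index as a simplified stand-in for Cython's own type handling (which may accept more object types), and it reads the width of a C int from ctypes:

```python
import ctypes
import operator

# Range of a C int on this platform, derived from its size in bytes.
_INT_BITS = 8 * ctypes.sizeof(ctypes.c_int)
INT_MIN = -(2 ** (_INT_BITS - 1))
INT_MAX = 2 ** (_INT_BITS - 1) - 1


def to_c_int(obj):
    """Model converting a Python object to a C int.

    Raises TypeError for objects that are not integer-like and
    OverflowError for values that do not fit in a C int.
    """
    value = operator.index(obj)  # simplified assumption about accepted types
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError("value too large to convert to int")
    return value


if __name__ == "__main__":
    print(to_c_int(3))
    try:
        to_c_int(INT_MAX + 1)
    except OverflowError as exc:
        print("rejected:", exc)
```

Python integers have arbitrary precision while a C int does not, which is why a range check of this kind has to happen before the value is handed to AddFunction.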