In the previous sections, we went through the two different ways to write microservices: asynchronous and synchronous. Whichever technique you use, the speed of Python directly impacts the performance of your microservice.
Of course, everyone knows Python is slower than Java or Go, but execution speed is not always the top priority. A microservice is often a thin layer of code that spends most of its life waiting for network responses from other services. Its raw speed usually matters less than how long your SQL queries take to return from your Postgres server, because those queries will represent most of the time spent building the response.
But wanting an application that's as fast as possible is legitimate.
One controversial topic in the Python community around speeding up the language is how the Global Interpreter Lock (GIL) mutex can ruin performance, because multi-threaded applications cannot run bytecode on several cores at once.
The GIL has good reasons to exist. It protects non-thread-safe parts of the CPython interpreter, and similar locks exist in other languages, such as Ruby. All attempts to remove it so far have failed to produce a faster CPython implementation.
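You can see the effect of the GIL on CPU-bound code directly. In the sketch below (the prime-counting task is just an illustration, not from the text), splitting the work across threads still produces the correct result, but on CPython the threads take turns executing bytecode rather than running in parallel:

```python
# A sketch showing that CPU-bound work does not get faster with threads
# under the GIL: both versions return the same result, but the threaded
# one still executes bytecode on one core at a time.
from concurrent.futures import ThreadPoolExecutor


def count_primes(start, stop):
    """Naive CPU-bound task: count primes in [start, stop)."""
    found = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            found += 1
    return found


# Sequential run over the whole range
sequential = count_primes(2, 20000)

# Same work split across four threads -- correct, but no parallel
# speedup on CPython, because the GIL serializes bytecode execution
chunks = [(2, 5000), (5000, 10000), (10000, 15000), (15000, 20000)]
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = sum(pool.map(lambda args: count_primes(*args), chunks))

print(sequential, threaded)
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` sidesteps the GIL by using separate processes, at the cost of inter-process communication overhead.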
Larry Hastings is working on a GIL-free CPython project called Gilectomy (https://github.com/larryhastings/gilectomy). Its minimal goal is to come up with a GIL-free implementation that can run a single-threaded application as fast as CPython. As of this writing, the implementation is still slower than CPython, but it is interesting to follow this work and see whether it reaches speed parity one day. That would make a GIL-free CPython very appealing.
For microservices, besides preventing the use of multiple cores in the same process, the GIL slightly degrades performance under high load, because of the system call overhead introduced by the mutex.
However, all the scrutiny around the GIL has been beneficial: work has been done over the past years to reduce GIL contention in the interpreter, and in some areas, Python's performance has improved a lot.
Bear in mind that even if the core team removed the GIL, Python would still be an interpreted, garbage-collected language, and would still pay performance penalties for those properties.
Python provides the dis module if you are interested in seeing how the interpreter decomposes a function. In the following example, the interpreter decomposes a simple function that yields incremented values from a sequence into no less than fourteen bytecode operations:
>>> def myfunc(data):
...     for value in data:
...         yield value + 1
...
>>> import dis
>>> dis.dis(myfunc)
  2           0 SETUP_LOOP              23 (to 26)
              3 LOAD_FAST                0 (data)
              6 GET_ITER
        >>    7 FOR_ITER                15 (to 25)
             10 STORE_FAST               1 (value)

  3          13 LOAD_FAST                1 (value)
             16 LOAD_CONST               1 (1)
             19 BINARY_ADD
             20 YIELD_VALUE
             21 POP_TOP
             22 JUMP_ABSOLUTE            7
        >>   25 POP_BLOCK
        >>   26 LOAD_CONST               0 (None)
             29 RETURN_VALUE
A similar function written in a statically compiled language will dramatically reduce the number of operations required to produce the same result. There are ways to speed up Python execution, though.
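Before reaching for any of them, it is worth measuring where the time actually goes. A quick sketch using the standard timeit module (the numbers printed are illustrative and will vary by machine and interpreter):

```python
# Measure the cost of the pure-Python loop before trying to optimize it.
import timeit


def myfunc(data):
    for value in data:
        yield value + 1


# Time consuming the generator over a small sequence, 1000 times.
elapsed = timeit.timeit("list(myfunc(range(1000)))",
                        globals={"myfunc": myfunc}, number=1000)
print(f"{elapsed:.3f} seconds for 1000 runs")
```

Running the same measurement under CPython and PyPy is a cheap way to estimate what switching interpreters would buy you for your own hot paths.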
One is to rewrite part of your code as compiled code by building C extensions, or by using a statically typed superset of the language like Cython (http://cython.org/), but that makes your code more complicated.
Another solution, which is the most promising one, is simply to run your application using the PyPy interpreter (http://pypy.org/).
PyPy implements a Just-In-Time (JIT) compiler, which replaces, at runtime, pieces of Python code with machine code that the CPU can execute directly. The whole trick for the JIT is to detect, in real time and ahead of the execution, when and how to do it.
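A JIT shines on hot, type-stable loops. The function below (a made-up checksum, purely illustrative) runs unchanged on CPython and PyPy; under PyPy, once the loop has run enough times to be detected as hot, its body is compiled to machine code:

```python
def checksum(data):
    # A tight, type-stable loop: a good JIT candidate, since the types
    # of `total` and `byte` never change across iterations.
    total = 0
    for byte in data:
        total = (total * 31 + byte) % (2 ** 32)
    return total


print(checksum(b"hello world"))
```

The same property that helps the JIT helps readers too: loops whose variables keep a stable type are easier to reason about, whichever interpreter runs them.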
Even if PyPy is always a few Python versions behind CPython, it has reached a point where you can use it in production, and its performance can be quite amazing. In one of our projects at Mozilla that needed fast execution, the PyPy version was almost as fast as the Go version, and we decided to use Python there instead.
The PyPy Speed Center website is a great place to look at how PyPy compares to CPython (http://speed.pypy.org/).
However, if your program uses C extensions, you will need to recompile them for PyPy, and that can be a problem, particularly if other developers maintain some of the extensions you are using.
But if you build your microservice with a standard set of libraries, chances are that it will work out of the box with the PyPy interpreter, so that's worth a try.
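If code needs to know which interpreter it is running under, for instance to log it or to skip an extension-dependent code path during a PyPy trial run, the standard platform module reports it (a small sketch):

```python
# Report the running interpreter -- useful when trialing PyPy alongside
# CPython in the same codebase or CI pipeline.
import platform
import sys

impl = platform.python_implementation()  # "CPython", "PyPy", ...
print(impl, sys.version_info[:3])
```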
In any case, for most projects, the benefits of Python and its ecosystem largely surpass the performance issues described in this section, because the overhead in a microservice is rarely a problem. And if performance is a problem, the microservice approach allows you to rewrite performance-critical components without affecting the rest of the system.