Improving performance
Much can be said about performance optimization, but truthfully, if you have read the entire book up to this point, you know most of the Python-specific techniques to write fast code. The most important factor in application performance will always be the choice of algorithms, and by extension, the data structures. Searching for an item within a list is almost always a worse idea than searching for an item in a dict or set.
Using the right algorithm
Within any application, choosing the right algorithm is by far the most important factor in performance, which is why I am repeating the point here with an illustration of what a bad choice costs:
In [1]: a = list(range(1000000))

In [2]: b = dict.fromkeys(range(1000000))

In [3]: %timeit 'x' in a
10 loops, best of 3: 20.5 ms per loop

In [4]: %timeit 'x' in b
10000000 loops, best of 3: 41.6 ns per loop
Checking whether an item is within a list is an O(n) operation, whereas checking whether an item is within a dict is, on average, an O(1) operation. A huge difference...
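For readers working outside IPython, the same comparison can be sketched with the standard library's timeit module. This is only an illustration: the names as_list, as_set, as_dict, missing, and time_lookup are made up for this example, and the absolute timings will vary by machine and Python version.

```python
import timeit

N = 1_000_000

# The same values stored in three different containers.
as_list = list(range(N))
as_set = set(as_list)
as_dict = dict.fromkeys(as_list)

# A value that is never present, so the list search has to scan
# every element before giving up (the worst case for a list).
missing = 'x'


def time_lookup(container, repeat=3, number=10):
    """Return the best observed time in seconds for one membership test."""
    timer = timeit.Timer(lambda: missing in container)
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == '__main__':
    for name, container in [('list', as_list),
                            ('set', as_set),
                            ('dict', as_dict)]:
        seconds = time_lookup(container)
        print(f'{name:>4}: {seconds * 1e6:12.3f} microseconds per lookup')
```

The lambda adds a small fixed overhead to every measurement, so the set and dict numbers will look somewhat slower than the %timeit output above. The gap between the list and the hash-based containers should still span several orders of magnitude.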