You're reading from Modern Python Standard Library Cookbook Over 100 recipes to fully leverage the features of the standard library in Python

Product type Paperback

Published in Aug 2018

Publisher Packt

ISBN-13 9781788830829

Length 366 pages

Edition 1st Edition

Languages

Python

Concepts

Programming Language

Author (1):

Alessandro Molina

View More author details

Ordered dictionaries

One of the most surprising aspects of Python dictionaries for new users is that their order is unpredictable and can change from environment to environment. So, the order of keys you expected on your system might be totally different on your friend's computer.

This frequently causes unexpected failures during tests; if a continuous integration system is involved, the ordering of dictionary keys on the system running the tests can be different from the ordering on your system, which might lead to random failures.

Suppose you have a snippet of code that generates an HTML tag with some attributes:

>>> attrs = dict(style="background-color:red", id="header")
>>> '<span {}>'.format(' '.join('%s="%s"' % a for a in attrs.items()))
'<span id="header" style="background-color:red">'

It might surprise you that on some systems you end up with this:

'<span id="header" style="background-color:red">'

While on others, the result might be this:

'<span style="background-color:red" id="header">'

So, if you expect to be able to compare the resulting string to check whether your function did the right thing when generating this tag, you might be disappointed.

How to do it...

Keys ordering is a very convenient feature and in some cases, it's actually necessary, so the Python standard library comes to help and provides the collections.OrderedDict container.

In the case of collections.OrderedDict, the keys are always in the order they were inserted in:

>>> attrs = OrderedDict([('id', 'header'), ('style', 'background-color:red')])
>>> '<span {}>'.format(' '.join('%s="%s"' % a for a in attrs.items()))
'<span id="header" style="background-color:red">'

How it works...

OrderedDict stores both a mapping of the keys to their values and a list of keys that is used to preserve the order of them.

So whenever your look for a key, the lookup goes through the mapping, but whenever you want to list the keys or iterate over the container, you go through the list of keys to ensure they are processed in the order they were inserted in.

The main problem when using OrderedDict is that Python on versions before 3.6 didn't guarantee any specific order of keyword arguments:

>>> attrs = OrderedDict(id="header", style="background-color:red")

This would have again introduced a totally random order of keys even though OrderedDict was used. Not because OrderedDict didn't preserve the order of those keys, but because it would have received them in a random order.

Thanks to PEP 468, the order of arguments is now guaranteed in Python 3.6 and newer versions (the order of dictionaries is still not, remember; so far it's just by chance that they are ordered). So if you are using Python 3.6 or newer, our previous example would work as expected, but if you are on older versions of Python, you would end up with a random order.

Thankfully, this is an issue that is easily solved. Like standard dictionaries, OrderedDict supports any iterable as the source of its content. As long as the iterable provides a key and a value, it can be used to build OrderedDict.

So by providing the keys and values in a tuple, we can provide them at construction time and preserve the order in any Python version:

>>> OrderedDict((('id', 'header'), ('style', 'background-color:red')))
OrderedDict([('id', 'header'), ('style', 'background-color:red')])

There's more...

Python 3.6 introduced a guarantee of preserving the order of dictionary keys as a side effect of some changes to dictionaries, but it was considered an internal implementation detail and not a language guarantee. Since Python 3.7, it became an official feature of the language so it's actually safe to rely on dictionary ordering if you are using Python 3.6 or newer.