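One of the most surprising aspects of Python dictionaries for new users is that, before Python 3.7, their order was not guaranteed and could change from environment to environment. Since Python 3.7, dictionaries preserve insertion order, but code that has to run on older interpreters, or that receives dictionaries built in a different order, still cannot rely on key order. So, the order of keys you expected on your system might be totally different on your friend's computer.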
This frequently causes unexpected test failures: when a continuous integration system is involved, the ordering of dictionary keys on the machine running the tests can differ from the ordering on yours, which leads to intermittent failures that are hard to reproduce.
Suppose you have a snippet of code that generates an HTML tag with some attributes:
>>> attrs = dict(style="background-color:red", id="header")
>>> '<span {}>'.format(' '.join('%s="%s"' % a for a in attrs.items()))
'<span id="header" style="background-color:red">'
It might surprise you that on some systems you end up with this:
'<span id="header" style="background-color:red">'
While on others, the result might be this:
'<span style="background-color:red" id="header">'
So, if you expect to be able to compare the resulting string to check whether your function did the right thing when generating this tag, you might be disappointed.
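One possible way to make such a comparison reliable is to fix the attribute order before building the string. The following is a minimal sketch, not necessarily the approach taken later in this text: `render_span` is a hypothetical helper name, and sorting attributes by name is an assumption chosen only because it is simple and deterministic.

```python
def render_span(attrs):
    """Hypothetical helper: render a <span> tag with attributes sorted by name."""
    # sorted() fixes the attribute order, so the result no longer depends
    # on the order in which the dictionary happens to yield its items.
    rendered = ' '.join('%s="%s"' % item for item in sorted(attrs.items()))
    return '<span {}>'.format(rendered)


# The same attributes, inserted in two different orders.
first = dict(style="background-color:red", id="header")
second = dict(id="header", style="background-color:red")

expected = '<span id="header" style="background-color:red">'
assert render_span(first) == expected
assert render_span(second) == expected
```

With the order pinned down this way, a test can compare the generated tag against a literal string on any interpreter, at the cost of no longer reflecting the order in which the caller supplied the attributes.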