You're reading from Modern Python Standard Library Cookbook Over 100 recipes to fully leverage the features of the standard library in Python

Product type Paperback

Published in Aug 2018

Publisher Packt

ISBN-13 9781788830829

Length 366 pages

Edition 1st Edition

Languages

Python

Concepts

Programming Language

Author (1):

Alessandro Molina

View More author details

MultiDict

If you have ever need to provide a reverse mapping, you have probably discovered that Python lacks a way to store more than a value for each key in a dictionary. This is a very common need, and most languages provide some form of multimap container.

Python tends to prefer having a single way of doing things, and as storing multiple values for the key means just storing a list of values for a key, it doesn't provide a specialized container.

The issue with storing a list of values is that to be able to append to values to our dictionary, the list must already exist.

How to do it...

Proceed with the following steps for this recipe:

As we already know, defaultdict will create a default value by calling the provided callable for every missing key. We can provide the list constructor as a callable:

>>> from collections import defaultdict
>>> rd = defaultdict(list)

So, we insert keys into our multimap by using rd[k].append(v) instead of the usual rd[k] = v:

>>> for name, num in [('ichi', 1), ('one', 1), ('uno', 1), ('un', 1)]:
...   rd[num].append(name)
...
>>> rd
defaultdict(<class 'list'>, {1: ['ichi', 'one', 'uno', 'un']})

How it works...

MultiDict works by storing a list for each key. Whenever a key is accessed, the list containing all the values for that key is retrieved.

In the case of missing keys, an empty list will be provided so that values can be added for that key.

This works because every time defaultdict faces a missing key, it will insert it with a value generated by calling list. And calling list will actually provide an empty list. So, doing rd[v] will always provide a list, empty or not, depending on whether v was an already existing key or not. Once we have our list, adding a new value is just a matter of appending it.

There's more...

Dictionaries in Python are associative containers where keys are unique. A key can appear a single time and has exactly one value.

If we want to support multiple values per key, we can actually solve the need by saving list as the value of our key. This list can then contain all the values we want to keep around for that key:

>>> rd = {1: ['one', 'uno', 'un', 'ichi'],
...       2: ['two', 'due', 'deux', 'ni'],
...       3: ['three', 'tre', 'trois', 'san']}
>>> rd[2]
['two', 'due', 'deux', 'ni']

If we want to add a new translation to 2 (Spanish, for example), we would just have to append the entry:

>>> rd[2].append('dos')
>>> rd[2]
['two', 'due', 'deux', 'ni', 'dos']

The problem arises when we want to introduce a new key:

>>> rd[4].append('four')
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
KeyError: 4

For key 4, no list exists, so there is nowhere we can append it. So, our snippet to automatically reverse the mapping can't be easily adapted to handle multiple values, as it would fail with key errors the first time it tries to insert a value:

>>> rd = {}
>>> for k,v in d.items():
...     rd[v].append(k)
Traceback (most recent call last):
    File "<stdin>", line 2, in <module>
KeyError: 1

Checking for every single entry, whether it's already in the dictionary or not, and acting accordingly is not very convenient. While we can rely on the setdefault method of dictionaries to hide that check, we can get a far more elegant solution by using collections.defaultdict.