Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Modern Python Standard Library Cookbook
Modern Python Standard Library Cookbook

Modern Python Standard Library Cookbook: Over 100 recipes to fully leverage the features of the standard library in Python

eBook
$38.99 $43.99
Paperback
$54.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with Print?

Product feature icon Instant access to your digital copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Redeem a companion digital copy on all Print orders
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
OR
Modal Close icon
Payment Processing...
tick Completed

Shipping Address

Billing Address

Shipping Methods
Table of content icon View table of contents Preview book icon Preview Book

Modern Python Standard Library Cookbook

Containers and Data Structures

In this chapter, we will cover the following recipes:

  • Counting frequencies—count occurrences of any hashable value
  • Dictionary with fallback—have a fallback value for any missing key
  • Unpacking multiplekeyword arguments—how to use ** more than once
  • Ordered dictionariesmaintaining order of keys in a dictionary
  • MultiDictdictionary with multiple values per key
  • Prioritizing entriesefficiently get the top of sorted entries
  • Bunchdictionaries that behave like objects
  • Enumerationshandle a known set of states

Introduction

Python has a very easy and flexible set of built-in containers. As a Python developer, there is little you can't achieve with a dict or a list. The convenience of Python dictionaries and lists is such that developers often forget that those have limits. Like any data structure, they are optimized and designed for specific use cases and might be inefficient in some conditions, or even unable to handle them.

Ever tried to put a key in a dictionary twice? Well you can't, because Python dictionaries are designed as hash tables with unique keys, but the MultiDict recipe will show you how to do that. Ever tried to grab the lowest/highest values out of a list without traversing it whole? The list itself can't, but in the Prioritized entries recipe, we will see how to achieve that.

The limits of standard Python containers are well known to Python experts. For that reason, the standard library has grown over the years to overcome those limits, and frequently there are patterns so common that their name is widely recognized, even though they are not formally defined.

Counting frequencies

A very common need in many kinds of programs is to count the occurrences of a value or of an event, which means counting frequency. Be it the need to count words in text, count likes on a blog post, or track scores for players of a video game, in the end counting frequency means counting how many we have of a specific value.

The most obvious solution for such a need would be to keep around counters for the things we need to count. If there are two, three, or four, maybe we can just track them in some dedicated variables, but if there are hundreds, it's certainly not feasible to keep around such a large amount of variables and we will quickly end up with a solution based on a container to collect all those counters.

How to do it...

Here are the steps for this recipe:

  1. Suppose we want to track the frequency of words in text; the standard library comes to our rescue and provides us with a very good way to track counts and frequencies, which is through the dedicated collections.Counter object.
  2. The collections.Counter object not only keeps track of frequencies, but provides some dedicated methods to retrieve the most common entries, entries that appear at last once and quickly count any iterable.
  3. Any iterable you provide to the Counter is "counted" for its frequency of values:
>>> txt = "This is a vast world you can't traverse world in a day"
>>>
>>> from collections import Counter
>>> counts = Counter(txt.split())
  1. The result would be exactly what we expect, a dictionary with the frequencies of the words in our phrase:
Counter({'a': 2, 'world': 2, "can't": 1, 'day': 1, 'traverse': 1, 
         'is': 1, 'vast': 1, 'in': 1, 'you': 1, 'This': 1})
  1. Then, we can easily query for the most frequent words:
>>> counts.most_common(2)
[('world', 2), ('a', 2)]

  1. Get the frequency of a specific word:
>>> counts['world']
2

Or, get back the total number of occurrences:

>>> sum(counts.values())
12
  1. And we can even apply some set operations on counters, such as joining them, subtracting them, or checking for intersections:
>>> Counter(["hello", "world"]) + Counter(["hello", "you"])
Counter({'hello': 2, 'you': 1, 'world': 1})
>>> Counter(["hello", "world"]) & Counter(["hello", "you"])
Counter({'hello': 1})

How it works...

Our counting code relies on the fact that Counter is just a special kind of dictionary, and that dictionaries can be built by providing an iterable. Each entry in the iterable will be added to the dictionary.

In the case of a counter, adding an element means incrementing its count; for every "word" in our list, we add that word multiple times (one every time it appears in the list), so its value in the Counter continues to get incremented every time the word is encountered.

There's more...

Relying on Counter is actually not the only way to track frequencies; we already know that Counter is a special kind of dictionary, so reproducing the Counter behavior should be quite straightforward.

Probably every one of us came up with a dictionary in this form:

counts = dict(hello=0, world=0, nice=0, day=0)

Whenever we face a new occurrence of hello, world, nice, or day, we increment the associated value in the dictionary and call it a day:

for word in 'hello world this is a very nice day'.split():
    if word in counts:
        counts[word] += 1

By relying on dict.get, we can also easily adapt it to count any word, not just those we could foresee:

for word in 'hello world this is a very nice day'.split():
    counts[word] = counts.get(word, 0) + 1

But the standard library actually provides a very flexible tool that we can use to improve this code even further, collections.defaultdict.

defaultdict is a plain dictionary that won't throw KeyError for any missing value, but will call a function we can provide to generate the missing value.

So, something such as defaultdict(int) will create a dictionary that provides 0 for any key that it doesn't have, which is very convenient for our counting purpose:

from collections import defaultdict

counts = defaultdict(int)
for word in 'hello world this is a very nice day'.split():
    counts[word] += 1

The result will be exactly what we expect:

defaultdict(<class 'int'>, {'day': 1, 'is': 1, 'a': 1, 'very': 1, 'world': 1, 'this': 1, 'nice': 1, 'hello': 1})

As for each word, the first time we face it, we will call int to get the starting value and then add 1 to it. As int gives 0 when called without any argument, that achieves what we want.

While this roughly solves our problem, it's far from being a complete solution for counting—we track frequencies, but on everything else, we are on our own. What if we want to know the most frequent entry in our bag of words?

The convenience of Counter is based on the set of additional features specialized for counting that it provides; it's not just a dictionary with a default numeric value, it's a class specialized in keeping track of frequencies and providing convenient ways to access them.

Dictionary with fallback

When working with configuration values, it's common to look them up in multiple places—maybe we load them from a configuration file—but we can override them with an environment variable or a command-line option, and in case the option is not provided, we can have a default value.

This can easily lead to long chains of if statements like these:

value = command_line_options.get('optname')
if value is None:
    value = os.environ.get('optname')
if value is None:
    value = config_file_options.get('optname')
if value is None:
    value = 'default-value'

This is annoying, and while for a single value it might be just annoying, it will tend to grow into a huge, confusing list of conditions as more options get added.

Command-line options are a very frequent use case, but the problem is related to chained scopes resolution. Variables in Python are resolved by looking at locals(); if they are not found, the interpreter looks at globals(), and if they are not yet found, it looks for built-ins.

How to do it...

For this step, you need to go through the following steps:

  1. The alternative for chaining default values of dict.getinstead of using multiple if instances, probably wouldn't improve much the code and if we want to add one additional scope, we would have to add it in every single place where we are looking up the values.
  2. collections.ChainMap is a very convenient solution to this problem; we can provide a list of mapping containers and it will look for a key through them all.

 

  1. Our previous example involving multiple different if instances can be converted to something like this:
import os
from collections import ChainMap

options = ChainMap(command_line_options, os.environ, config_file_options)
value = options.get('optname', 'default-value')
  1. We can also get rid of the last .get call by combining ChainMap with defaultdict. In this case, we can use defaultdict to provide a default value for every key:
import os
from collections import ChainMap, defaultdict

options = ChainMap(command_line_options, os.environ, config_file_options,
                   defaultdict(lambda: 'default-value'))
value = options['optname']
value2 = options['other-option']
  1. Print value and value2 will result in the following:
optvalue
default-value

optname will be retrieved from the command_line_options containing it, while other-option will end up being resolved by defaultdict.

How it works...

The ChainMap class receives multiple dictionaries as arguments; whenever a key is requested to ChainMap, it's actually going through the provided dictionaries one by one to check whether the key is available in any of them. Once the key is found, it is returned, as if it was a key owned by ChainMap itself.

The default value for options that are not provided is implemented by having defaultdict as the last dictionary provided to ChainMap. Whenever a key is not found in any of the previous dictionaries, it gets looked up in defaultdict, which uses the provided factory function to return a default value for all keys.

There's more...

Another great feature of ChainMap is that it allows updating too, but instead of updating the dictionary where it found the key, it always updates the first dictionary. The result is the same, as on next lookup of that key, we would have the first dictionary override any other value for that key (as it's the first place where the key is checked). The advantage is that if we provide an empty dictionary as the first mapping provided to ChainMap, we can change those values without touching the original container:

>>> population=dict(italy=60, japan=127, uk=65)
>>> changes = dict()
>>> editablepop = ChainMap(changes, population)

>>> print(editablepop['japan'])
127
>>> editablepop['japan'] += 1
>>> print(editablepop['japan'])
128

But even though we changed the population of Japan to 128 million, the original population didn't change:

>>> print(population['japan'])
127

And we can even use changes to find out which values were changed and which values were not:

>>> print(changes.keys()) 
dict_keys(['japan'])
>>> print(population.keys() - changes.keys())
{'italy', 'uk'}

It's important to know, by the way, that if the object contained in the dictionary is mutable and we directly mutate it, there is little ChainMap can do to avoid mutating the original object. So if, instead of numbers, we store lists in the dictionaries, we will be mutating the original dictionary whenever we append values to the dictionary:

>>> citizens = dict(torino=['Alessandro'], amsterdam=['Bert'], raleigh=['Joseph']) 
>>> changes = dict()
>>> editablecits = ChainMap(changes, citizens)
>>> editablecits['torino'].append('Simone')
>>> print(editablecits['torino']) ['Alessandro', 'Simone']
>>> print(changes)
{}
>>> print(citizens)
{'amsterdam': ['Bert'],
'torino': ['Alessandro', 'Simone'],
'raleigh': ['Joseph']}

Unpacking multiple keyword arguments

Frequently, you ended up in a situation where you had to provide arguments to a function from a dictionary. If you've ever faced that need, you probably also ended up in a case where you had to take the arguments from multiple dictionaries.

Generally, Python functions accept arguments from a dictionary through unpacking (the ** syntax), but so far, it hasn't been possible to use unpacking twice in the same call, nor was there an easy way to merge two dictionaries.

How to do it...

The steps for this recipe are:

  1. Given a function, f, we want to pass the arguments from two dictionaries, d1 and d2 as follows:
>>> def f(a, b, c, d):
... print (a, b, c, d)
...
>>> d1 = dict(a=5, b=6)
>>> d2 = dict(b=7, c=8, d=9)
  1. collections.ChainMap can help us achieve what we want; it can cope with duplicated entries and works with any Python version:
>>> f(**ChainMap(d1, d2))
5 6 8 9
  1. In Python 3.5 and newer versions, you can also create a new dictionary by combining multiple dictionaries through the literal syntax, and then pass the resulting dictionary as the argument of the function:
>>> f(**{**d1, **d2})
5 7 8 9
  1. In this case, the duplicated entries are accepted too, but are handled in reverse order of priority to ChainMap (so right to left). Notice how b has a value of 7, instead of the 6 it had with ChainMap, due to the reversed order of priorities.

This syntax might be harder to read due to the amount of unpacking operators involved, and with ChainMap it is probably more explicit what's happening for a reader.

How it works...

As we already know from the previous recipe, ChainMap looks up keys in all the provided dictionaries, so it's like the sum of all the dictionaries. The unpacking operator (**) works by inviting all keys to the container and then providing an argument for each key.

As ChainMap has keys resulting from the sum of all the provided dictionaries keys, it will provide the keys contained in all the dictionaries to the unpacking operator, thus allowing us to provide keyword arguments from multiple dictionaries.

There's more...

Since Python 3.5 through PEP 448, it's now possible to unpack multiple mappings to provide keyword arguments:

>>> def f(a, b, c, d):
...     print (a, b, c, d)
...
>>> d1 = dict(a=5, b=6)
>>> d2 = dict(c=7, d=8)
>>> f(**d1, **d2)
5 6 7 8

This solution is very convenient, but has two limits:

  • It's only available in Python 3.5+
  • It chokes on duplicated arguments

If you don't know where the mappings/dictionaries you are unpacking come from, it's easy to end up with the issue of duplicated arguments:

>>> d1 = dict(a=5, b=6)
>>> d2 = dict(b=7, c=8, d=9)
>>> f(**d1, **d2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: f() got multiple values for keyword argument 'b'

In the previous example, the b key is declared in both d1 and d2, and that causes the function to complain that it received duplicate arguments.

Ordered dictionaries

One of the most surprising aspects of Python dictionaries for new users is that their order is unpredictable and can change from environment to environment. So, the order of keys you expected on your system might be totally different on your friend's computer.

This frequently causes unexpected failures during tests; if a continuous integration system is involved, the ordering of dictionary keys on the system running the tests can be different from the ordering on your system, which might lead to random failures.

Suppose you have a snippet of code that generates an HTML tag with some attributes:

>>> attrs = dict(style="background-color:red", id="header")
>>> '<span {}>'.format(' '.join('%s="%s"' % a for a in attrs.items()))
'<span id="header" style="background-color:red">'

It might surprise you that on some systems you end up with this:

'<span id="header" style="background-color:red">'

While on others, the result might be this:

'<span style="background-color:red" id="header">'

So, if you expect to be able to compare the resulting string to check whether your function did the right thing when generating this tag, you might be disappointed.

How to do it...

Keys ordering is a very convenient feature and in some cases, it's actually necessary, so the Python standard library comes to help and provides the collections.OrderedDict container.

In the case of collections.OrderedDict, the keys are always in the order they were inserted in:

>>> attrs = OrderedDict([('id', 'header'), ('style', 'background-color:red')])
>>> '<span {}>'.format(' '.join('%s="%s"' % a for a in attrs.items()))
'<span id="header" style="background-color:red">'

How it works...

OrderedDict stores both a mapping of the keys to their values and a list of keys that is used to preserve the order of them.

So whenever your look for a key, the lookup goes through the mapping, but whenever you want to list the keys or iterate over the container, you go through the list of keys to ensure they are processed in the order they were inserted in.

The main problem when using OrderedDict is that Python on versions before 3.6 didn't guarantee any specific order of keyword arguments:

>>> attrs = OrderedDict(id="header", style="background-color:red")

This would have again introduced a totally random order of keys even though OrderedDict was used. Not because OrderedDict didn't preserve the order of those keys, but because it would have received them in a random order.

Thanks to PEP 468, the order of arguments is now guaranteed in Python 3.6 and newer versions (the order of dictionaries is still not, remember; so far it's just by chance that they are ordered). So if you are using Python 3.6 or newer, our previous example would work as expected, but if you are on older versions of Python, you would end up with a random order.

Thankfully, this is an issue that is easily solved. Like standard dictionaries, OrderedDict supports any iterable as the source of its content. As long as the iterable provides a key and a value, it can be used to build OrderedDict.

So by providing the keys and values in a tuple, we can provide them at construction time and preserve the order in any Python version:

>>> OrderedDict((('id', 'header'), ('style', 'background-color:red')))
OrderedDict([('id', 'header'), ('style', 'background-color:red')])

There's more...

Python 3.6 introduced a guarantee of preserving the order of dictionary keys as a side effect of some changes to dictionaries, but it was considered an internal implementation detail and not a language guarantee. Since Python 3.7, it became an official feature of the language so it's actually safe to rely on dictionary ordering if you are using Python 3.6 or newer.

MultiDict

If you have ever need to provide a reverse mapping, you have probably discovered that Python lacks a way to store more than a value for each key in a dictionary. This is a very common need, and most languages provide some form of multimap container.

Python tends to prefer having a single way of doing things, and as storing multiple values for the key means just storing a list of values for a key, it doesn't provide a specialized container.

The issue with storing a list of values is that to be able to append to values to our dictionary, the list must already exist.

How to do it...

Proceed with the following steps for this recipe:

  1. As we already know, defaultdict will create a default value by calling the provided callable for every missing key. We can provide the list constructor as a callable:
>>> from collections import defaultdict
>>> rd = defaultdict(list)
  1. So, we insert keys into our multimap by using rd[k].append(v) instead of the usual rd[k] = v:
>>> for name, num in [('ichi', 1), ('one', 1), ('uno', 1), ('un', 1)]:
...   rd[num].append(name)
...
>>> rd
defaultdict(<class 'list'>, {1: ['ichi', 'one', 'uno', 'un']})

How it works...

MultiDict works by storing a list for each key. Whenever a key is accessed, the list containing all the values for that key is retrieved.

In the case of missing keys, an empty list will be provided so that values can be added for that key.

This works because every time defaultdict faces a missing key, it will insert it with a value generated by calling list. And calling list will actually provide an empty list. So, doing rd[v] will always provide a list, empty or not, depending on whether v was an already existing key or not. Once we have our list, adding a new value is just a matter of appending it.

There's more...

Dictionaries in Python are associative containers where keys are unique. A key can appear a single time and has exactly one value.

If we want to support multiple values per key, we can actually solve the need by saving list as the value of our key. This list can then contain all the values we want to keep around for that key:

>>> rd = {1: ['one', 'uno', 'un', 'ichi'],
...       2: ['two', 'due', 'deux', 'ni'],
...       3: ['three', 'tre', 'trois', 'san']}
>>> rd[2]
['two', 'due', 'deux', 'ni']

If we want to add a new translation to 2 (Spanish, for example), we would just have to append the entry:

>>> rd[2].append('dos')
>>> rd[2]
['two', 'due', 'deux', 'ni', 'dos']

The problem arises when we want to introduce a new key:

>>> rd[4].append('four')
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
KeyError: 4

For key 4, no list exists, so there is nowhere we can append it. So, our snippet to automatically reverse the mapping can't be easily adapted to handle multiple values, as it would fail with key errors the first time it tries to insert a value:

>>> rd = {}
>>> for k,v in d.items():
...     rd[v].append(k)
Traceback (most recent call last):
    File "<stdin>", line 2, in <module>
KeyError: 1

Checking for every single entry, whether it's already in the dictionary or not, and acting accordingly is not very convenient. While we can rely on the setdefault method of dictionaries to hide that check, we can get a far more elegant solution by using collections.defaultdict.

Prioritizing entries

Picking the first/top entry of a set of values is a pretty frequent need; this usually means defining one value that has priority over the other and involves sorting.

But sorting can be expensive and re-sorting every time you add an entry to your values is certainly not a very convenient way to pick the first entry out of a set of values with some kind of priority.

How to do it...

Heaps are a perfect match for everything that has priorities, such as a priority queue:

import time
import heapq

class PriorityQueue:
    def __init__(self):
        self._q = []

    def add(self, value, priority=0):
        heapq.heappush(self._q, (priority, time.time(), value))

    def pop(self):
        return heapq.heappop(self._q)[-1]

Then, our PriorityQueue can be used to retrieve entries given a priority:

>>> def f1(): print('hello')
>>> def f2(): print('world')
>>>
>>> pq = PriorityQueue()
>>> pq.add(f2, priority=1)
>>> pq.add(f1, priority=0)
>>> pq.pop()()
hello
>>> pq.pop()()
world

How it works...

PriorityQueue works by storing everything in an heap. Heaps are particularly efficient at retrieving the top/first element of a sorted set without having to actually sort the whole set.

Our priority queue stores all the values in a three-element tuple: priority, time.time(), and value.

The first entry of our tuple is priority (lower is better). In the example, we recorded f1 with a better priority than f2, which ensures than when we use heap.heappop to fetch tasks to process, we get f1 and then f2, so that we end up with the hello world message and not world hello.

The second entry, timestamp, is used to ensure that tasks that have the same priority are processed in their insertion order. The oldest task will be served first as it will have the smallest timestamp.

Then, we have the value itself, which is the function we want call for our task.

There's more...

A very common approach to sorting is to keep a list of entries in a tuple, where the first element is key for which we are sorting and the second element is the value itself.

For a scoreboard, we can keep each player's name and how many points they got:

scores = [(123, 'Alessandro'),
          (143, 'Chris'),
          (192, 'Mark']

Storing those values in tuples works because comparing two tuples is performed by comparing each element of the first tuple with the element in the same index position in the other tuple:

>>> (10, 'B') > (10, 'A')
True
>>> (11, 'A') > (10, 'B')
True

It's very easy to understand what's going on if you think about strings. 'BB' > 'BB' is the same as ('B', 'B') > ('B', 'A'); in the end, a string is just a list of characters.

We can use this property to sort our scores and retrieve the winner of a competition:

>>> scores = sorted(scores)
>>> scores[-1]
(192, 'Mark')

The major problem with this approach is that every time we add an entry to our list, we have to sort it again, or our scoreboard would became meaningless:

>>> scores.append((137, 'Rick'))
>>> scores[-1]
(137, 'Rick')
>>> scores = sorted(scores)
>>> scores[-1]
(192, 'Mark')

This is very inconvenient because it's easy to miss re-sorting somewhere if we have multiple places appending to the list, and sorting the whole list every time can be expensive.

The Python standard library offers a data structure that is a perfect match when we're interested in finding out the winner of a competition.

In the heapq module, we have a fully working implementation of a heap data structure, a particular kind of tree where each parent is smaller than its children. This provides us with a tree that has a very interesting property: the root element is always the smallest one.

And being implemented on top of a list, it means that l[0] is always the smallest element in a heap:

>>> import heapq
>>> l = []
>>> heapq.heappush(l, (192, 'Mark'))
>>> heapq.heappush(l, (123, 'Alessandro'))
>>> heapq.heappush(l, (137, 'Rick'))
>>> heapq.heappush(l, (143, 'Chris'))
>>> l[0]
(123, 'Alessandro')

You might have noticed, by the way, that the heap finds the loser of our tournament, not the winner, and we were interested in finding the best player, with the highest value.

This is a minor problem we can easily solve by storing all scores as negative numbers. If we store each score as * -1, the head of the heap will always be the winner:

>>> l = []
>>> heapq.heappush(l, (-143, 'Chris'))
>>> heapq.heappush(l, (-137, 'Rick'))
>>> heapq.heappush(l, (-123, 'Alessandro'))
>>> heapq.heappush(l, (-192, 'Mark'))
>>> l[0]
(-192, 'Mark')

Bunch

Python is very good at shapeshifting objects. Each instance can have its own attributes and it's absolutely legal to add/remove the attributes of an object at runtime.

Once in a while, our code needs to deal with data of unknown shapes. For example, in the case of a user-submitted data, we might not know which fields the user is providing; maybe some of our users have a first name, some have a surname, and some have one or more middle name fields.

If we are not processing this data ourselves, but are just providing it to some other function, we really don't care about the shape of the data; as long as our objects have those attributes, we are fine.

A very common case is when working with protocols, if you are an HTTP server, you might want to provide to the application running behind you a request object. This object has a few known attributes, such as host and path, and it might have some optional attributes, such as a query string or a content type. But, it can also have any attribute the client provided, as HTTP is pretty flexible regarding headers, and our clients could have provided an x-totally-custom-header that we might have to expose to our code.

When representing this kind of data, Python developers often tend to look at dictionaries. In the end, Python objects themselves are built on top of dictionaries and they fit the need to map arbitrary values to names.

So, we will probably end up with something like the following:

>>> request = dict(host='www.example.org', path='/index.html')

A side effect of this approach is pretty clear once we have to pass this object around, especially to third-party code. Functions usually work with objects, and while they don't require a specific kind of object as duck-typing is the standard in Python, they will expect certain attributes to be there.

Another very common example is when writing tests, Python being a duck-typed language, it's absolutely reasonable to want to provide a fake object instead of providing a real instance of the object, especially when we need to simulate the values of some properties (as declared with @property), so we don't want or can't afford to create real instances of the object.

In such cases, using a dictionary is not viable as it will only provide access to its values through the request['path'] syntax and not through request.path, as probably expected by the functions we are providing our object to.

Also, the more we end up accessing this value, the more it's clear that the syntax using dot notation conveys the feeling of an entity that collaborates to the intent of the code, while a dictionary conveys the feeling of plain data.

As soon as we remember that Python objects can change shape at any time, we might be tempted to try creating an object instead of a dictionary. Unfortunately, we won't be able to provide the attributes at initialization time:

>>> request = object(host='www.example.org', path='/index.html')
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
TypeError: object() takes no parameters

Things don't improve much if we try to assign those attributes after the object is built:

>>> request = object()
>>> request.host = 'www.example.org'
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'host'

How to do it...

With a little effort, we can create a class that leverages dictionaries to contain any attribute we want and allow access both as a dictionary and through properties:

>>> class Bunch(dict):
...    def __getattribute__(self, key):
...        try: 
...            return self[key]
...        except KeyError:
...            raise AttributeError(key)
...    
...    def __setattr__(self, key, value): 
...        self[key] = value
...
>>> b = Bunch(a=5)
>>> b.a
5
>>> b['a']
5

How it works...

The Bunch class inherits dict, mostly as a way to provide a context where values can be stored, then most of the work is done by __getattribute__ and __setattr__. So, for any attribute that is retrieved or set on the object, they will just retrieve or set a key in self (remember we inherited from dict, so self is in fact a dictionary).

This allows the Bunch class to store and retrieve any value as an attribute of the object. The convenient feature is that it can behave both as an object and as a dict in most contexts.

For example, it is possible to find out all the values that it contains, like any other dictionary:

>>> b.items()
dict_items([('a', 5)])

It is also able to access those as attributes:

>>> b.c = 7
>>> b.c
7
>>> b.items()
dict_items([('a', 5), ('c', 7)])

There's more...

Our bunch implementation is not yet complete, as it will fail any test for class name (it's always named Bunch) and any test for inheritance, thus failing at faking other objects.

The first step is to make Bunch able to shapeshift not only its properties, but also its name. This can be achieved by creating a new class dynamically every time we create Bunch. The class will inherit from Bunch and will do nothing apart from providing a new name:

>>> class BunchBase(dict):
...    def __getattribute__(self, key):
...        try: 
...            return self[key]
...        except KeyError:
...            raise AttributeError(key)
...    
...    def __setattr__(self, key, value): 
...        self[key] = value
...
>>> def Bunch(_classname="Bunch", **attrs):
...     return type(_classname, (BunchBase, ), {})(**attrs)
>>>

The Bunch function moved from being the class itself to being a factory that will create objects that all act as Bunch, but can have different classes. Each Bunch will be a subclass of BunchBase, where the _classname name can be provided when Bunch is created:

>>> b = Bunch("Request", path="/index.html", host="www.example.org")
>>> print(b)
{'path': '/index.html', 'host': 'www.example.org'}
>>> print(b.path)
/index.html
>>> print(b.host)
www.example.org

This will allow us to create as many kinds of Bunch objects as we want, and each will have its own custom type:

>>> print(b.__class__)
<class '__main__.Request'>

The next step is to make our Bunch actually look like any other type that it has to impersonate. That is needed for the case where we want to use Bunch in place of another object. As Bunch can have any kind of attribute, it can take the place of any kind of object, but to be able to, it has to pass type checks for custom types.

We need to go back to our Bunch factory and make the Bunch objects not only have a custom class name, but also appear to be inherited from a custom parent.

To better understand what's going on, we will declare an example Person type; this type will be the one our Bunch objects will try to fake:

class Person(object):
    def __init__(name, surname):
        self.name = name
        self.surname = surname

    @property
    def fullname(self):
        return '{} {}'.format(self.name, self.surname)

Specifically, we are going to print Hello Your Name through a custom print function that only works for Person:

def hello(p):
    if not isinstance(p, Person):
        raise ValueError("Sorry, can only greet people")
    print("Hello {}".format(p.fullname))

We want to change our Bunch factory to accept the class and create a new type out of it:

def Bunch(_classname="Bunch", _parent=None, **attrs):
    parents = (_parent, ) if parent else tuple()
    return type(_classname, (BunchBase, ) + parents, {})(**attrs)

Now, our Bunch objects will appear as instances of a class named what we wanted, and will always appear as a subclass of _parent:

>>> p = Bunch("Person", Person, fullname='Alessandro Molina')
>>> hello(p)
Hello Alessandro Molina

Bunch can be a very convenient pattern; in both its complete and simplified versions, it is widely used in many frameworks with various implementations that all achieve pretty much the same result.

The showcased implementation is interesting because it gives us a clear idea of what's going on. There are ways to implement Bunch that are very smart, but might make it hard to guess what's going on and customize it.

Another possible way to implement the Bunch pattern is by patching the __dict__ class, which contains all the attributes of the class:

class Bunch(dict):
    def __init__(self, **kwds):
        super().__init__(**kwds)
        self.__dict__ = self

In this form, whenever Bunch is created, it will populate its values as a dict (by calling super().__init__, which is the dict initialization) and then, once all the attributes provided are stored in dict, it swaps the __dict__ object, which is the dictionary that contains all object attributes, with self. This makes the dict that was just populated with all the values also the dict that contains all the attributes of the object.

Our previous implementation worked by replacing the way we looked for attributes, while this implementation replaces the place where we look for attributes.

Enumerations

Enumeration is a common way to store values that can only represent a few states. Each symbolic name is bound to a specific value, usually numeric, that represents the states the enumeration can have.

Enumerations are very common in other programming languages, but until recently, Python didn't have any explicit support for enumerations.

How to do it...

Typically, enumerations are implemented by mapping symbolic names to numeric values; this is allowed in Python through enum.IntEnum:

>>> from enum import IntEnum
>>> 
>>> class RequestType(IntEnum):
...     POST = 1
...     GET = 2
>>>
>>> request_type = RequestType.POST
>>> print(request_type)
RequestType.POST

How it works...

IntEnum is an integer, apart from the fact that all possible values are created when the class is defined. IntEnum inherits from int, so its values are real integers.

During the RequestType definition, all the possible values for enum are declared within the class body and the values are verified against duplicates by the metaclass.

Also, enum provides support for a special value, auto, which means just put in a value, I don't care. As you usually only care whether it's POST or GET, you usually don't care whether POST is 1 or 2.

Last but not least, enumerations cannot be subclassed if they define at least one possible value.

There's more...

IntEnum values behave like int in most cases, which is usually convenient, but they can cause problems if the developer doesn't pay attention to the type.

For example, a function might unexpectedly perform the wrong thing if another enumeration or an integer value is provided, instead of the proper enumeration value:

>>> def do_request(kind):
...    if kind == RequestType.POST:
...        print('POST')
...    else:
...        print('OTHER')

As an example, invoking do_request with RequestType.POST or 1 will do exactly the same thing:

>>> do_request(RequestType.POST)
POST
>>> do_request(1)
POST

When we want to avoid treating our enumerations as numbers, we can use enum.Enum, which provides enumerated values that are not considered plain numbers:

>>> from enum import Enum
>>> 
>>> class RequestType(Enum):
...     POST = 1
...     GET = 2
>>>
>>> do_request(RequestType.POST)
POST
>>> do_request(1)
OTHER

So generally, if you need a simple set of enumerated values or possible states that rely on enumEnum is safer, but if you need a set of numeric values that rely on enumIntEnum will ensure that they behave like numbers.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Strategic recipes for effective application development in Python
  • Techniques to create GUIs and implement security through cryptography
  • Best practices for developing readily scalable, production-ready applications

Description

The Python 3 Standard Library is a vast array of modules that you can use for developing various kinds of applications. It contains an exhaustive list of libraries, and this book will help you choose the best one to address specific programming problems in Python. The Modern Python Standard Library Cookbook begins with recipes on containers and data structures and guides you in performing effective text management in Python. You will find Python recipes for command-line operations, networking, filesystems and directories, and concurrent execution. You will learn about Python security essentials in Python and get to grips with various development tools for debugging, benchmarking, inspection, error reporting, and tracing. The book includes recipes to help you create graphical user interfaces for your application. You will learn to work with multimedia components and perform mathematical operations on date and time. The recipes will also show you how to deploy different searching and sorting algorithms on your data. By the end of the book, you will have acquired the skills needed to write clean code in Python and develop applications that meet your needs.

Who is this book for?

If you are a developer who wants to write highly responsive, manageable, scalable, and resilient code in Python, this book is for you. Prior programming knowledge in Python will help you make the most out of the book.

What you will learn

  • Store multiple values per key in associative containers
  • Create interactive character-based user interfaces
  • Work with native time and display data for your time zone
  • Read/write SGML family languages, both as a SAX and DOM parser to meet file sizes and other requirements
  • Group equivalent items using itertools and sorted features together
  • Use partials to create unary functions out of multi-argument functions
  • Implement hashing algorithms to store passwords in a safe way
Estimated delivery fee Deliver to United States

Economy delivery 10 - 13 business days

Free $6.95

Premium delivery 6 - 9 business days

$21.95
(Includes tracking information)

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Aug 31, 2018
Length: 366 pages
Edition : 1st
Language : English
ISBN-13 : 9781788830829
Category :
Languages :

What do you get with Print?

Product feature icon Instant access to your digital copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Redeem a companion digital copy on all Print orders
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
OR
Modal Close icon
Payment Processing...
tick Completed

Shipping Address

Billing Address

Shipping Methods
Estimated delivery fee Deliver to United States

Economy delivery 10 - 13 business days

Free $6.95

Premium delivery 6 - 9 business days

$21.95
(Includes tracking information)

Product Details

Publication date : Aug 31, 2018
Length: 366 pages
Edition : 1st
Language : English
ISBN-13 : 9781788830829
Category :
Languages :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 142.97
Clean Code in Python
$48.99
Modern Python Standard Library Cookbook
$54.99
Python Automation Cookbook
$38.99
Total $ 142.97 Stars icon

Table of Contents

15 Chapters
Containers and Data Structures Chevron down icon Chevron up icon
Text Management Chevron down icon Chevron up icon
Command Line Chevron down icon Chevron up icon
Filesystem and Directories Chevron down icon Chevron up icon
Date and Time Chevron down icon Chevron up icon
Read/Write Data Chevron down icon Chevron up icon
Algorithms Chevron down icon Chevron up icon
Cryptography Chevron down icon Chevron up icon
Concurrency Chevron down icon Chevron up icon
Networking Chevron down icon Chevron up icon
Web Development Chevron down icon Chevron up icon
Multimedia Chevron down icon Chevron up icon
Graphical User Interfaces Chevron down icon Chevron up icon
Development Tools Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.4
(7 Ratings)
5 star 85.7%
4 star 0%
3 star 0%
2 star 0%
1 star 14.3%
Filter icon Filter
Top Reviews

Filter reviews by




gsamn Aug 29, 2019
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Very knowledgeable writer. This book should be twice size or a book set of 3 with more break down. It's a good format
Amazon Verified review Amazon
Jason Crowe Jul 07, 2019
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Great book for realizing the power of the python standard library. Probably not a great choice for beginners.
Amazon Verified review Amazon
Vlad Bezden Apr 17, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
It's a great book. I can only compare this book to "Effective Python: 90 Specific Ways to Write Better Python". I thought I know Python well. Well, after reading this book, I realized that I was mistaken. This book covers the hidden gems of Python standard library. In many cases, I was using third party libraries instead of using just regular standard libraries.This book also follows and teaches the proper Pythonic way of writing code.This book covers Python 3.6, and I wish Alessandro Molina updated this book with new Python libraries.
Amazon Verified review Amazon
Tom Bolt Mar 30, 2019
Full star icon Full star icon Full star icon Full star icon Full star icon 5
There's so much power available in the standard library I had no idea, I really like the idea of using the tools that are delivered before jumping right into another outside module. I enjoy the "recipes" because it conveys the idea and discussion in a somewhat small space so you can pace out the lessons and learn a lot in a small period of time. Not a beginners book however, you should have a good foundation prior to starting this. It presents really well the iPhone and iPad, not so much on Kindle's Web Reader. I think the price is too high for an ebook IMO.
Amazon Verified review Amazon
Jay Nov 29, 2018
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This is a good book by Alessandro Molina. He have used his 15+ years of experience to design and layout all the chapters of the book carefully.The book covers recipes and wide aspects of the language from simple day to day stuffs like containers, data structures, file system, date & time, read &write to some advanced topics like cryptography, concurrency, networking etc.The 300+ pages of the book, is relevant to intermediate to advanced python developers and covers over 100 recipes to fully leverage the features of the standard library in Python.It have lot of code snippets and all the recipes are explained well. The chapter layouts are designed in a way that make finding a topic easy for the readers.Each recipe have 'How it works' section that gives an insight of how that particular recipe works. Most of the recipes have 'There is more' section as well that dives deep in the topic for curious readers.It enables you to use the powerful features of python's standard library in most effective way.I keep this book on my rack of Effective python (by Brett Slatkin) and Python Cookbook shelf. :)Over all this is a good read.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is the digital copy I get with my Print order? Chevron down icon Chevron up icon

When you buy any Print edition of our Books, you can redeem (for free) the eBook edition of the Print Book you’ve purchased. This gives you instant access to your book when you make an order via PDF, EPUB or our online Reader experience.

What is the delivery time and cost of print book? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela
What is custom duty/charge? Chevron down icon Chevron up icon

Customs duty are charges levied on goods when they cross international borders. It is a tax that is imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order? Chevron down icon Chevron up icon

The orders shipped to the countries that are listed under EU27 will not bear custom charges. They are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea:

A custom duty or localized taxes may be applicable on the shipment and would be charged by the recipient country outside of the EU27 which should be paid by the customer and these duties are not included in the shipping charges been charged on the order.

How do I know my custom duty charges? Chevron down icon Chevron up icon

The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.

For example:

  • If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
  • Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.
How can I cancel my order? Chevron down icon Chevron up icon

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction id. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you then when you receive it, you can contact us at customercare@packt.com using the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except for the cases described in our Return Policy (i.e. Packt Publishing agrees to replace your printed book because it arrives damaged or material defect in book), Packt Publishing will not accept returns.

What is your returns and refunds policy? Chevron down icon Chevron up icon

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact Customer Relations Team on customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
  2. Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
  3. You will have a choice of replacement or refund of the problem items.(damaged, defective or incorrect)
  4. Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged, with book material defect, contact our Customer Relation Team on customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner which is on a print-on-demand basis.

What tax is charged? Chevron down icon Chevron up icon

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use? Chevron down icon Chevron up icon

You can pay with the following card types:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal
What is the delivery time and cost of print books? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela