Comprehensions and assignment expressions
We will see comprehension expressions many times throughout the book. This is because they're usually a more concise way of writing code, and in general, code written this way tends to be easier to read. I say in general, because sometimes if we need to do some transformations on the data we're collecting, using a comprehension might lead to some more complicated code. In these cases, writing a simple for
loop should be preferred instead.
There is, however, one last resort we could apply to try to salvage the situation: assignment expressions. In this section, we discuss these alternatives.
The use of comprehensions is recommended to create data structures in a single instruction, instead of multiple operations. For example, if we wanted to create a list with calculations over some numbers in it, instead of writing it like this:
numbers = []
for i in range(10):
numbers.append(run_calculation(i))
We would create the list directly:
numbers = [run_calculation(i) for i in range(10)]
Code written in this form usually performs better because it uses a single Python operation, instead of calling list.append
repeatedly. If you are curious about the internals or differences between different versions of the code, you can check out the dis
module, and call it with these examples.
Let's see the example of a function that will take some strings that represent resources on a cloud computing environment (for example ARNs), and returns the set with the account IDs found on them. Something like this would be the most naïve way of writing such a function:
from typing import Iterable, Set
def collect_account_ids_from_arns(arns: Iterable[str]) -> Set[str]:
"""Given several ARNs in the form
arn:partition:service:region:account-id:resource-id
Collect the unique account IDs found on those strings, and return them.
"""
collected_account_ids = set()
for arn in arns:
matched = re.match(ARN_REGEX, arn)
if matched is not None:
account_id = matched.groupdict()["account_id"]
collected_account_ids.add(account_id)
return collected_account_ids
Clearly the code has many lines, and it's doing something relatively simple. A reader of this code might get confused by these multiple statements, and perhaps inadvertently make a mistake when working with that code. If we could simplify it, that would be better. We can achieve the same functionality in fewer lines by using a few comprehension expressions in a way that resembles functional programming:
def collect_account_ids_from_arns(arns):
matched_arns = filter(None, (re.match(ARN_REGEX, arn) for arn in arns))
return {m.groupdict()["account_id"] for m in matched_arns}
The first line of the function seems similar to applying map
and filter
: first, we apply the result of trying to match the regular expression to all the strings provided, and then we filter those that aren't None
. The result is an iterator that we will later use to extract the account ID in a set comprehension expression.
The previous function should be more maintainable than our first example, but still requires two statements. Before Python 3.8, it wasn't possible to achieve a more compact version. But with the introduction of assignment expressions in PEP-572 (https://www.python.org/dev/peps/pep-0572/), we can rewrite this in a single statement:
def collect_account_ids_from_arns(arns: Iterable[str]) -> Set[str]:
return {
matched.groupdict()["account_id"]
for arn in arns
if (matched := re.match(ARN_REGEX, arn)) is not None
}
Note the syntax on the third line inside the comprehension. This sets a temporary identifier inside the scope, which is the result of applying the regular expression to the string, and it can be reused in more parts within the same scope.
In this particular example, it's arguable if the third example is better than the second one (but there should be no doubts that both of them are better than the first one!). I believe this last example to be more expressive because it has fewer indirections in the code, and everything that the reader needs to know on how the values are being collected belongs to the same scope.
Keep in mind that a more compact code does not always mean better code. If to write a one-liner, we have to create a convoluted expression, then it's not worth it, and we would be better off with the naïve approach. This is related to the keep it simple principle that we'll discuss in the next chapter.
Take into consideration the readability of the comprehension expressions, and don't force your code to be a one-liner, if this one won't be actually easier to understand.
Another good reason for using assignment expressions in general (not just in comprehensions) is the performance considerations. If we have to use a function as part of our transformation logic, we don't want to call that more than is necessary. Assigning the result of the function to a temporary identifier (as it's done by assignment expressions in new scopes) would be a good optimization technique that, at the same time, keeps the code more readable.
Evaluate the performance improvements that can be made by using assignment expressions.
In the next section, we'll review another idiomatic feature of Python: properties
. Moreover, we'll discuss the different ways of exposing or hiding data in Python objects.