Context managers
Context managers are a distinctively useful feature that Python provides. The reason why they are so useful is that they correctly respond to a pattern. There are recurrent situations in which we want to run some code that has preconditions and postconditions, meaning that we want to run things before and after a certain main action, respectively. Context managers are great tools to use in those situations.
Most of the time, we see context managers around resource management. For example, when we open files, we want to make sure that they are closed after processing (so we do not leak file descriptors). Similarly, if we open a connection to a service (or even a socket), we want to be sure to close it, and the same goes for temporary files, and so on.
In all of these cases, you would normally have to remember to free all of the resources that were allocated, and that is just thinking about the best case. But what about exceptions and error handling? Given that handling all possible combinations and execution paths of our program makes it harder to debug, the most common way of addressing this issue is to put the cleanup code in a finally block so that we are sure we do not miss it. For example, a very simple case would look like the following:
fd = open(filename)
try:
    process_file(fd)
finally:
    fd.close()
Nonetheless, there is a much more elegant and Pythonic way of achieving the same thing:
with open(filename) as fd:
    process_file(fd)
The with statement (PEP-343) enters the context manager. In this case, the object returned by the open function implements the context manager protocol, which means that the file will be automatically closed when the block is finished, even if an exception occurred.
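As a quick check of that claim, the following sketch writes a throwaway file (its name and contents are made up for the illustration), raises an exception inside the with block, and then inspects the closed attribute of the file object.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
# The file and its contents are illustrative only.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("some data\n")
    filename = tmp.name

try:
    with open(filename) as fd:
        first_line = fd.readline()
        raise RuntimeError("processing failed")
except RuntimeError:
    pass

# The with statement closed the file even though the block raised.
file_was_closed = fd.closed

os.remove(filename)
```

The name fd is still bound after the block, which is why we can query it, but the underlying file is no longer usable.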
Context managers consist of two magic methods: __enter__ and __exit__. On the first line of the context manager, the with statement will call the first method, __enter__, and whatever this method returns will be assigned to the variable labeled after as. This is optional: we don't really need to return anything specific from the __enter__ method, and even if we do, there is still no strict reason to assign it to a variable if it is not required.
After this line is executed, the code enters a new context, where any other Python code can be run. After the last statement on that block is finished, the context will be exited, meaning that Python will call the __exit__ method of the original context manager object we first invoked.
If there is an exception or error inside the context manager block, the __exit__ method will still be called, which makes it convenient for safely managing cleanup. In fact, this method receives the exception that was triggered in the block in case we want to handle it in a custom fashion.
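The order of these calls can be made visible with a small sketch. The Tracer class and the events list are hypothetical names introduced only for this illustration; they record when each method runs, first for a block that finishes normally and then for one that raises.

```python
events = []


class Tracer:
    """Hypothetical context manager that records when each method runs."""

    def __enter__(self):
        events.append("enter")
        # Whatever is returned here is bound to the name after "as".
        return "value from __enter__"

    def __exit__(self, exc_type, exc_value, exc_traceback):
        events.append("exit")
        # Implicitly returns None, so any exception keeps propagating.


with Tracer() as bound:
    events.append(("body", bound))

try:
    with Tracer():
        raise ValueError("boom")
except ValueError:
    events.append("propagated")
```

In both runs "exit" is recorded, and in the second one the exception still reaches the surrounding try block afterwards.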
Despite the fact that context managers are very often found when dealing with resources (like the example we mentioned with files, connections, and so on), this is not the sole application they have. We can implement our own context managers in order to handle the particular logic we need.
Context managers are a good way of separating concerns and isolating parts of the code that should be kept independent, because if we mix them, then the logic will become harder to maintain.
As an example, consider a situation where we want to run a backup of our database with a script. The caveat is that the backup is offline, which means that we can only do it while the database is not running, and for this we have to stop it. After running the backup, we want to make sure that we start the process again, regardless of how the process of the backup itself went.
Now, the first approach would be to create a huge monolithic function that tries to do everything in the same place, stop the service, perform the backup task, handle exceptions and all possible edge cases, and then try to restart the service again. You can imagine such a function, and for that reason, I will spare you the details, and instead come up directly with a possible way of tackling this issue with context managers:
# run() is assumed to be a helper, defined elsewhere, that executes a shell command.
def stop_database():
    run("systemctl stop postgresql.service")
def start_database():
    run("systemctl start postgresql.service")
class DBHandler:
    def __enter__(self):
        stop_database()
        return self
    def __exit__(self, exc_type, ex_value, ex_traceback):
        start_database()
def db_backup():
    run("pg_dump database")
def main():
    with DBHandler():
        db_backup()
In this example, we don't need the result of the context manager inside the block, and that's why we can consider that, at least for this particular case, the return value of __enter__ is irrelevant. This is something to take into consideration when designing context managers: what do we need once the block is started? As a general rule, it is good practice (although not mandatory) to always return something from __enter__.
In this block, we only run the task for the backup, independently from the maintenance tasks, as we saw previously. We also mentioned that even if the backup task has an error, __exit__ will still be called.
Notice the signature of the __exit__ method. It receives the values for the exception that was raised in the block. If there was no exception in the block, they are all None.
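To see both behaviors without touching a real database, here is a minimal sketch in which run is replaced by a stand-in that only records the command (an assumption made for this illustration, since the real helper is not shown), and __exit__ additionally stores the arguments it receives.

```python
commands = []
received = []


def run(command):
    # Stand-in for the real helper: record the command instead of executing it.
    commands.append(command)


class DBHandler:
    def __enter__(self):
        run("systemctl stop postgresql.service")
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # Keep what Python passed in, then restart the service regardless.
        received.append((exc_type, exc_value, exc_traceback))
        run("systemctl start postgresql.service")


# First run: the backup succeeds.
with DBHandler():
    run("pg_dump database")

# Second run: a hypothetical failure in the middle of the backup.
try:
    with DBHandler():
        raise RuntimeError("backup failed")
except RuntimeError:
    pass
```

After the first run, received holds three None values. After the second, it holds the exception class, the exception instance, and a traceback object, and the start command was still issued.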
The return value of __exit__ is something to consider. Normally, we would want to leave the method as it is, without returning anything in particular. If this method returns True, it means that the exception that was potentially raised will not propagate to the caller and will stop there. Sometimes, this is the desired effect, maybe even depending on the type of exception that was raised, but in general, it is not a good idea to swallow the exception. Remember: errors should never pass silently.
Keep in mind not to accidentally return True from __exit__. If you do, make sure that this is exactly what you want, and that there is a good reason for it.
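When swallowing is really what we want, it is safer to make it depend on the exception type. The sketch below uses a hypothetical IgnoreMissingKey class (not part of the original example) that suppresses KeyError and lets everything else propagate.

```python
class IgnoreMissingKey:
    """Hypothetical context manager that swallows KeyError and nothing else."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # True suppresses the exception; False lets it propagate.
        # issubclass is only evaluated when an exception was actually raised.
        return exc_type is not None and issubclass(exc_type, KeyError)


settings = {"host": "localhost"}

with IgnoreMissingKey():
    port = settings["port"]  # raises KeyError, which is suppressed
reached_after_block = True

value_error_propagated = False
try:
    with IgnoreMissingKey():
        int("not a number")  # raises ValueError, which is not suppressed
except ValueError:
    value_error_propagated = True
```

Note that the assignment to port never happens: suppressing the exception resumes execution after the with block, not at the line that failed.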
Implementing context managers
In general, we can implement context managers like the one in the previous example. All we need is just a class that implements the __enter__ and __exit__ magic methods, and then that object will be able to support the context manager protocol. While this is the most common way for context managers to be implemented, it is not the only one.
In this section, we will see not only different (sometimes more compact) ways of implementing context managers, but also how to take full advantage of them by using the standard library, in particular with the contextlib module.
The contextlib module contains a lot of helper functions and objects to either implement context managers or use ones already provided that can help us write more compact code.
Let's start by looking at the contextmanager decorator.
When the contextlib.contextmanager decorator is applied to a function, it converts the code on that function into a context manager. The function in question has to be a particular kind of function called a generator function, which will separate the statements into what is going to be on the __enter__ and __exit__ magic methods, respectively.
If, at this point, you are not familiar with decorators and generators, this is not a problem because the examples we will be looking at will be self-contained, and the recipe or idiom can be applied and understood regardless. These topics are discussed in detail in Chapter 7, Generators, Iterators, and Asynchronous Programming.
The equivalent code of the previous example can be rewritten with the contextmanager decorator like this:
import contextlib
@contextlib.contextmanager
def db_handler():
    try:
        stop_database()
        yield
    finally:
        start_database()
with db_handler():
    db_backup()
Here, we define the generator function and apply the @contextlib.contextmanager decorator to it. The function contains a yield statement, which makes it a generator function. Again, details on generators are not relevant in this case. All we need to know is that when this decorator is applied, everything before the yield statement will be run as if it were part of the __enter__ method. Then, the yielded value is going to be the result of the context manager evaluation (what __enter__ would return), and what would be assigned to the variable if we chose to assign it, as in as x. In this case, nothing is yielded (which means the yielded value will be None, implicitly), but if we wanted to, we could yield a value that we might want to use inside the context manager block.
At that point, the generator function is suspended, and the context manager is entered, where, again, we run the backup code for our database. After this completes, the execution resumes, so we can consider that every line that comes after the yield statement will be part of the __exit__ logic.
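To illustrate a generator-based context manager that does yield something, here is a minimal sketch. The managed_resource function and the dictionary it yields are hypothetical stand-ins for a real resource such as a connection.

```python
import contextlib

log = []


@contextlib.contextmanager
def managed_resource(name):
    # Hypothetical resource: a plain dict stands in for a connection or handle.
    resource = {"name": name, "open": True}
    log.append("setup")  # runs as if it were in __enter__
    try:
        yield resource  # this value is bound to the name after "as"
    finally:
        resource["open"] = False  # runs as if it were in __exit__
        log.append("teardown")


with managed_resource("db") as res:
    log.append("body")
    was_open_inside = res["open"]

try:
    with managed_resource("db") as failed:
        raise RuntimeError("boom")
except RuntimeError:
    pass
```

An exception raised in the block is re-raised inside the generator at the yield, so it is the try/finally around the yield that guarantees the teardown runs in the failing case as well.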
Writing context managers like this has the advantage that it is easier to refactor existing functions, reuse code, and in general is a good idea when we need a context manager that doesn't belong to any particular object (otherwise, you'd be creating a "fake" class for no real purpose, in the object-oriented sense).
Adding the extra magic methods would make another object of our domain more coupled, with more responsibilities, and supporting something that it probably shouldn't. When we just need a context manager function, without preserving many states, and completely isolated and independent from the rest of our classes, this is probably a good way to go.
There are, however, more ways in which we can implement context managers, and once again, the answer is in the contextlib module from the standard library.
Another helper we could use is contextlib.ContextDecorator. This is a base class that provides the logic for applying a decorator to a function that will make it run inside the context manager. The logic for the context manager itself has to be provided by implementing the aforementioned magic methods. The result is a class that works as a decorator for functions, or that can be mixed into the class hierarchy of other classes to make them behave as context managers.
In order to use it, we have to extend this class and implement the logic on the required methods:
class dbhandler_decorator(contextlib.ContextDecorator):
    def __enter__(self):
        stop_database()
        return self
    def __exit__(self, exc_type, ex_value, ex_traceback):
        start_database()
@dbhandler_decorator()
def offline_backup():
    run("pg_dump database")
Do you notice something different from the previous examples? There is no with statement. We just have to call the function, and offline_backup() will automatically run inside a context manager. This is the logic that the base class provides to use it as a decorator that wraps the original function so that it runs inside a context manager.
The only downside of this approach is that, by the way the objects work, they are completely independent (which is a good trait): the decorator doesn't know anything about the function it is decorating, and vice versa. This, however good, means that the offline_backup function cannot access the decorator object, should this be needed. However, nothing is stopping us from still using this context manager with a with statement inside the function to access the object.
This can be done in the following form:
def offline_backup():
    with dbhandler_decorator() as handler:
        ...
Being a decorator, this also has the advantage that the logic is defined only once, and we can reuse it as many times as we want by simply applying the decorators to other functions that require the same invariant logic.
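The reuse can be sketched with a variant that records calls instead of stopping a service. The recorded_context class and the two task functions are hypothetical names used only here.

```python
import contextlib

calls = []


class recorded_context(contextlib.ContextDecorator):
    """Hypothetical stand-in for dbhandler_decorator that records calls."""

    def __enter__(self):
        calls.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        calls.append("exit")


@recorded_context()
def first_task():
    calls.append("first_task")
    return "first done"


@recorded_context()
def second_task():
    calls.append("second_task")
    return "second done"


first_result = first_task()
second_result = second_task()
```

Each call is wrapped by its own enter and exit pair, and the return value of the decorated function is passed through to the caller unchanged.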
Let's explore one last feature of contextlib, to see what we can expect from context managers and get an idea of the sort of thing we could use them for.
In this library, we can find contextlib.suppress, which is a utility to avoid certain exceptions in situations where we know it is safe to ignore them. It's similar to running that same code in a try/except block and passing on the exception or just logging it, but the difference is that using suppress makes it more explicit that those exceptions are controlled as part of our logic.
For example, consider the following code:
import contextlib
with contextlib.suppress(DataConversionException):
    parse_data(input_json_or_dict)
Here, the presence of the exception means that the input data is already in the expected format, so there is no need for conversion, hence making it safe to ignore it.
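A runnable version of this idea might look like the following sketch. The definitions of DataConversionException and parse_data are not shown in the original, so the ones below are assumptions made purely for illustration: the parser raises when the input is already a dictionary and decodes JSON text otherwise.

```python
import contextlib
import json


class DataConversionException(Exception):
    """Hypothetical exception; its real definition is not shown."""


def parse_data(data):
    # Hypothetical parser that signals already-converted input by raising.
    if isinstance(data, dict):
        raise DataConversionException("input is already a dict")
    return json.loads(data)


results = []
for raw in ('{"a": 1}', {"a": 1}):
    with contextlib.suppress(DataConversionException):
        # For the dict input, parse_data raises and the append is skipped.
        results.append(parse_data(raw))

# Only the listed exception types are suppressed; others still propagate.
unrelated_propagated = False
try:
    with contextlib.suppress(DataConversionException):
        raise TypeError("unrelated problem")
except TypeError:
    unrelated_propagated = True
```

As with any suppressed exception, execution continues after the with block, so whatever remained in the block after the failing call is skipped.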
Context managers are quite a peculiar feature that differentiates Python. Therefore, using context managers can be considered idiomatic. In the next section, we explore another interesting trait of Python that will help us write more concise code: comprehensions and assignment expressions.