Working with QuerySets and managers
Now that we have a fully functional administration site to manage blog posts, it is a good time to learn how to read and write content to the database programmatically.
The Django object-relational mapper (ORM) is a powerful database abstraction API that lets you create, retrieve, update, and delete objects easily. An ORM allows you to generate SQL queries using the object-oriented paradigm of Python. You can think of it as a way to interact with your database in pythonic fashion instead of writing raw SQL queries.
The ORM maps your models to database tables and provides you with a simple pythonic interface to interact with your database. The ORM generates SQL queries and maps the results to model objects. The Django ORM is compatible with MySQL, PostgreSQL, SQLite, Oracle, and MariaDB.
Remember that you can define the database of your project in the DATABASES
setting of your project’s settings.py
file. Django can work with multiple databases at a time, and you can program database routers to create custom data routing schemes.
Once you have created your data models, Django gives you a free API to interact with them. You can find the data model reference of the official documentation at https://docs.djangoproject.com/en/4.1/ref/models/.
The Django ORM is based on QuerySets. A QuerySet is a collection of database queries to retrieve objects from your database. You can apply filters to QuerySets to narrow down the query results based on given parameters.
Creating objects
Run the following command in the shell prompt to open the Python shell:
python manage.py shell
Then, type the following lines:
>>> from django.contrib.auth.models import User
>>> from blog.models import Post
>>> user = User.objects.get(username='admin')
>>> post = Post(title='Another post',
... slug='another-post',
... body='Post body.',
... author=user)
>>> post.save()
Let’s analyze what this code does.
First, we are retrieving the user
object with the username admin
:
user = User.objects.get(username='admin')
The get()
method allows you to retrieve a single object from the database. Note that this method expects a result that matches the query. If no results are returned by the database, this method will raise a DoesNotExist
exception, and if the database returns more than one result, it will raise a MultipleObjectsReturned
exception. Both exceptions are attributes of the model class that the query is being performed on.
Then, we are creating a Post
instance with a custom title, slug, and body, and set the user that we previously retrieved as the author of the post:
post = Post(title='Another post', slug='another-post', body='Post body.', author=user)
This object is in memory and not persisted to the database; we created a Python object that can be used during runtime but that is not saved into the database.
Finally, we are saving the Post
object to the database using the save()
method:
post.save()
The preceding action performs an INSERT
SQL statement behind the scenes.
We created an object in memory first and then persisted it to the database. You can also create the object and persist it into the database in a single operation using the create()
method, as follows:
Post.objects.create(title='One more post',
slug='one-more-post',
body='Post body.',
author=user)
Updating objects
Now, change the title of the post to something different and save the object again:
>>> post.title = 'New title'
>>> post.save()
This time, the save()
method performs an UPDATE
SQL statement.
The changes you make to a model object are not persisted to the database until you call the save()
method.
Retrieving objects
You already know how to retrieve a single object from the database using the get()
method. We accessed this method using Post.objects.get()
. Each Django model has at least one manager, and the default manager is called objects
. You get a QuerySet
object using your model manager.
To retrieve all objects from a table, we use the all()
method on the default objects manager, like this:
>>> all_posts = Post.objects.all()
This is how we create a QuerySet
that returns all objects in the database. Note that this QuerySet
has not been executed yet. Django QuerySets
are lazy, which means they are only evaluated when they are forced to. This behavior makes QuerySets
very efficient. If you don’t assign the QuerySet
to a variable but instead write it directly on the Python shell, the SQL statement of the QuerySet
is executed because you are forcing it to generate output:
>>> Post.objects.all()
<QuerySet [<Post: Who was Django Reinhardt?>, <Post: New title>]>
Using the filter() method
To filter a QuerySet
, you can use the filter()
method of the manager. For example, you can retrieve all posts published in the year 2022 using the following QuerySet
:
>>> Post.objects.filter(publish__year=2022)
You can also filter by multiple fields. For example, you can retrieve all posts published in 2022 by the author with the username admin
:
>>> Post.objects.filter(publish__year=2022, author__username='admin')
This equates to building the same QuerySet
chaining multiple filters:
>>> Post.objects.filter(publish__year=2022) \
>>> .filter(author__username='admin')
Queries with field lookup methods are built using two underscores, for example, publish__year
, but the same notation is also used for accessing fields of related models, such as author__username
.
Using exclude()
You can exclude certain results from your QuerySet
using the exclude()
method of the manager. For example, you can retrieve all posts published in 2022 whose titles don’t start with Why
:
>>> Post.objects.filter(publish__year=2022) \
>>> .exclude(title__startswith='Why')
Using order_by()
You can order results by different fields using the order_by()
method of the manager. For example, you can retrieve all objects ordered by their title
, as follows:
>>> Post.objects.order_by('title')
Ascending order is implied. You can indicate descending order with a negative sign prefix, like this:
>>> Post.objects.order_by('-title')
Deleting objects
If you want to delete an object, you can do it from the object instance using the delete()
method:
>>> post = Post.objects.get(id=1)
>>> post.delete()
Note that deleting objects will also delete any dependent relationships for ForeignKey
objects defined with on_delete
set to CASCADE
.
When QuerySets are evaluated
Creating a QuerySet doesn’t involve any database activity until it is evaluated. QuerySets usually return another unevaluated QuerySet. You can concatenate as many filters as you like to a QuerySet, and you will not hit the database until the QuerySet is evaluated. When a QuerySet is evaluated, it translates into an SQL query to the database.
QuerySets are only evaluated in the following cases:
- The first time you iterate over them
- When you slice them, for instance,
Post.objects.all()[:3]
- When you pickle or cache them
- When you call
repr()
orlen()
on them - When you explicitly call
list()
on them - When you test them in a statement, such as
bool()
,or
,and
, orif
Creating model managers
The default manager for every model is the objects
manager. This manager retrieves all the objects in the database. However, we can define custom managers for models.
Let’s create a custom manager to retrieve all posts that have a PUBLISHED
status.
There are two ways to add or customize managers for your models: you can add extra manager methods to an existing manager or create a new manager by modifying the initial QuerySet
that the manager returns. The first method provides you with a QuerySet
notation like Post.objects.my_manager()
, and the latter provides you with a QuerySet
notation like Post.my_manager.all()
.
We will choose the second method to implement a manager that will allow us to retrieve posts using the notation Post.published.all()
.
Edit the models.py
file of your blog
application to add the custom manager as follows. The new lines are highlighted in bold:
class PublishedManager(models.Manager):
def get_queryset(self):
return super().get_queryset()\
.filter(status=Post.Status.PUBLISHED)
class Post(models.Model):
# model fields
# ...
objects = models.Manager() # The default manager.
published = PublishedManager() # Our custom manager.
class Meta:
ordering = ['-publish']
def __str__(self):
return self.title
The first manager declared in a model becomes the default manager. You can use the Meta
attribute default_manager_name
to specify a different default manager. If no manager is defined in the model, Django automatically creates the objects
default manager for it. If you declare any managers for your model, but you want to keep the objects
manager as well, you have to add it explicitly to your model. In the preceding code, we have added the default objects
manager and the published
custom manager to the Post
model.
The get_queryset()
method of a manager returns the QuerySet
that will be executed. We have overridden this method to build a custom QuerySet
that filters posts by their status and returns a successive QuerySet
that only includes posts with the PUBLISHED
status.
We have now defined a custom manager for the Post
model. Let’s test it!
Start the development server again with the following command in the shell prompt:
python manage.py shell
Now, you can import the Post
model and retrieve all published posts whose title starts with Who
, executing the following QuerySet
:
>>> from blog.models import Post
>>> Post.published.filter(title__startswith='Who')
To obtain results for this QuerySet
, make sure to set the status
field to PUBLISHED
in the Post
object whose title
starts with the string Who.