Working with QuerySets and managers
Now that we have a fully functional administration site to manage blog posts, it is a good time to learn how to read and write content to the database programmatically.
The Django object-relational mapper (ORM) is a powerful database abstraction API that lets you create, retrieve, update, and delete objects easily. An ORM allows you to generate SQL queries using the object-oriented paradigm of Python. You can think of it as a way to interact with your database in a Pythonic fashion instead of writing raw SQL queries.
The ORM maps your models to database tables and provides you with a simple Pythonic interface to interact with your database. The ORM generates SQL queries and maps the results to model objects. The Django ORM is compatible with MySQL, PostgreSQL, SQLite, Oracle, and MariaDB.
Remember that you can define the database of your project in the DATABASES setting of your project’s settings.py file. Django can work with multiple databases at a time, and you can program database routers to create custom data routing schemes.
Once you have created your data models, Django gives you a free API to interact with them. You can find the model API reference of the official documentation at https://docs.djangoproject.com/en/5.0/ref/models/.
The Django ORM is based on QuerySets. A QuerySet represents a database query that retrieves a collection of objects from your database. You can apply filters to QuerySets to narrow down the query results based on given parameters. A QuerySet equates to a SELECT SQL statement, and the filters are limiting SQL clauses such as WHERE or LIMIT.
Next, you are going to learn how to build and execute QuerySets.
Creating objects
Run the following command in the shell prompt to open the Python shell:
python manage.py shell
Then, type the following lines:
>>> from django.contrib.auth.models import User
>>> from blog.models import Post
>>> user = User.objects.get(username='admin')
>>> post = Post(title='Another post',
... slug='another-post',
... body='Post body.',
... author=user)
>>> post.save()
Let’s analyze what this code does.
First, we retrieve the user object with the username admin:
>>> user = User.objects.get(username='admin')
The get() method allows us to retrieve a single object from the database. This method executes a SELECT SQL statement behind the scenes. Note that this method expects exactly one result that matches the query. If no results are returned by the database, this method will raise a DoesNotExist exception, and if the database returns more than one result, it will raise a MultipleObjectsReturned exception. Both exceptions are attributes of the model class that the query is being performed on.
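The one-result contract of get() can be pictured with a small sketch. It uses Python's built-in sqlite3 module instead of Django, and the two exception classes and the get_user() helper are hypothetical stand-ins defined only for this illustration; the auth_user table is reduced to two columns.

```python
import sqlite3


class DoesNotExist(Exception):
    """Stand-in for the exception raised when no row matches."""


class MultipleObjectsReturned(Exception):
    """Stand-in for the exception raised when several rows match."""


def get_user(conn, username):
    """Return exactly one (id, username) row, or raise."""
    rows = conn.execute(
        "SELECT id, username FROM auth_user WHERE username = ?", (username,)
    ).fetchall()
    if not rows:
        raise DoesNotExist(f"no user named {username!r}")
    if len(rows) > 1:
        raise MultipleObjectsReturned(f"{len(rows)} users named {username!r}")
    return rows[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO auth_user (username) VALUES (?)",
    [("admin",), ("editor",), ("editor",)],  # duplicate name on purpose
)

admin = get_user(conn, "admin")  # exactly one match, so a row is returned
```

In Django itself, you would catch these as User.DoesNotExist and User.MultipleObjectsReturned, since both exceptions are attributes of the model class.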
Then, we create a Post instance with a custom title, slug, and body, and set the user that we previously retrieved as the author of the post:
>>> post = Post(title='Another post', slug='another-post', body='Post body.', author=user)
This object is in memory and not persisted to the database; we created a Python object that can be used during runtime but is not saved into the database.
Finally, we save the Post object in the database using the save() method:
>>> post.save()
This action performs an INSERT SQL statement behind the scenes.
We created an object in memory first and then persisted it to the database. However, you can create the object and persist it to the database in a single operation using the create() method, as follows:
>>> Post.objects.create(title='One more post',
... slug='one-more-post',
... body='Post body.',
... author=user)
In certain situations, you might need to fetch an object from the database or create it if it’s absent. The get_or_create() method facilitates this by either retrieving an object or creating it if not found. This method returns a tuple with the object retrieved and a Boolean indicating whether a new object was created. The following code attempts to retrieve a User object with the username user2, and if it doesn’t exist, it will create one:
>>> user, created = User.objects.get_or_create(username='user2')
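To make the (object, created) return value concrete, here is a minimal sketch of the same idea written against Python's built-in sqlite3 module rather than Django. The get_or_create_user() helper and the two-column auth_user table are assumptions made for this illustration only.

```python
import sqlite3


def get_or_create_user(conn, username):
    """Return ((id, username), created), in the spirit of get_or_create()."""
    row = conn.execute(
        "SELECT id, username FROM auth_user WHERE username = ?", (username,)
    ).fetchone()
    if row is not None:
        return row, False  # found: nothing was created
    cursor = conn.execute(
        "INSERT INTO auth_user (username) VALUES (?)", (username,)
    )
    return (cursor.lastrowid, username), True  # absent: created now


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT UNIQUE)"
)

first, created_first = get_or_create_user(conn, "user2")    # creates the row
second, created_second = get_or_create_user(conn, "user2")  # finds the row
```

The second call returns the same row with the Boolean set to False, which is how calling code can tell a fresh object from an existing one.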
Updating objects
Now, change the title of the previous Post object to something different and save the object again:
>>> post.title = 'New title'
>>> post.save()
This time, the save() method performs an UPDATE SQL statement.
The changes you make to a model object are not persisted to the database until you call the save() method.
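The difference between the first and the second save() call can be sketched as follows. This is a simplified illustration using Python's built-in sqlite3 module, not Django's implementation (whose decision logic is more involved); PostRecord is a hypothetical class and blog_post is reduced to two columns.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class PostRecord:
    """Hypothetical in-memory object mirroring one row of blog_post."""
    title: str
    id: Optional[int] = None  # None until the row has been inserted

    def save(self, conn):
        if self.id is None:
            cursor = conn.execute(
                "INSERT INTO blog_post (title) VALUES (?)", (self.title,)
            )
            self.id = cursor.lastrowid
            return "INSERT"
        conn.execute(
            "UPDATE blog_post SET title = ? WHERE id = ?", (self.title, self.id)
        )
        return "UPDATE"


def stored_title(conn, post_id):
    row = conn.execute(
        "SELECT title FROM blog_post WHERE id = ?", (post_id,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_post (id INTEGER PRIMARY KEY, title TEXT)")

post = PostRecord(title="Another post")
first_statement = post.save(conn)             # no id yet, so a row is inserted
post.title = "New title"                      # changes only the Python object
title_before_save = stored_title(conn, post.id)
second_statement = post.save(conn)            # id is set, so the row is updated
title_after_save = stored_title(conn, post.id)
```

Between the attribute assignment and the second save(), the database still holds the old title, which mirrors the point that changes are not persisted until save() is called.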
Retrieving objects
You already know how to retrieve a single object from the database using the get() method. We accessed this method using User.objects.get(). Each Django model has at least one manager, and the default manager is called objects. You get a QuerySet object using your model manager.
To retrieve all objects from a table, we use the all() method on the default objects manager, like this:
>>> all_posts = Post.objects.all()
This is how we create a QuerySet that returns all objects in the database. Note that this QuerySet has not been executed yet. Django QuerySets are lazy, which means they are only evaluated when they are forced to. This behavior makes QuerySets very efficient. If you don’t assign the QuerySet to a variable but, instead, write it directly on the Python shell, the SQL statement of the QuerySet is executed because you are forcing it to generate output:
>>> Post.objects.all()
<QuerySet [<Post: Who was Django Reinhardt?>, <Post: New title>]>
Filtering objects
To filter a QuerySet, you can use the filter() method of the manager. This method allows you to specify the content of a SQL WHERE clause by using field lookups.
For example, you can use the following to filter Post objects by their title:
>>> Post.objects.filter(title='Who was Django Reinhardt?')
This QuerySet will return all posts with the exact title Who was Django Reinhardt?. Let’s review the SQL statement generated with this QuerySet. Run the following code in the shell:
>>> posts = Post.objects.filter(title='Who was Django Reinhardt?')
>>> print(posts.query)
By printing the query attribute of the QuerySet, we can get the SQL produced by it:
SELECT "blog_post"."id", "blog_post"."title", "blog_post"."slug", "blog_post"."author_id", "blog_post"."body", "blog_post"."publish", "blog_post"."created", "blog_post"."updated", "blog_post"."status" FROM "blog_post" WHERE "blog_post"."title" = Who was Django Reinhardt? ORDER BY "blog_post"."publish" DESC
The generated WHERE clause performs an exact match on the title column. The ORDER BY clause specifies the default order defined in the ordering attribute of the Post model’s Meta options since we haven’t provided any specific ordering in the QuerySet. You will learn about ordering in a bit. Note that the query attribute is not part of the QuerySet public API.
Using field lookups
The previous QuerySet example consists of a filter lookup with an exact match. The QuerySet interface provides you with multiple lookup types. Two underscores are used to define the lookup type, with the format field__lookup. For example, the following lookup produces an exact match:
>>> Post.objects.filter(id__exact=1)
When no specific lookup type is provided, the lookup type is assumed to be exact. The following lookup is equivalent to the previous one:
>>> Post.objects.filter(id=1)
Let’s take a look at other common lookup types. You can generate a case-insensitive lookup with iexact:
>>> Post.objects.filter(title__iexact='who was django reinhardt?')
You can also filter objects using a containment test. The contains lookup translates to a SQL lookup using the LIKE operator:
>>> Post.objects.filter(title__contains='Django')
The equivalent SQL clause is WHERE title LIKE '%Django%'. A case-insensitive version is also available, named icontains:
>>> Post.objects.filter(title__icontains='django')
You can check for a given iterable (often a list, tuple, or another QuerySet object) with the in lookup. The following example retrieves posts with an id that is 1 or 3:
>>> Post.objects.filter(id__in=[1, 3])
The following example shows the greater than (gt) lookup:
>>> Post.objects.filter(id__gt=3)
The equivalent SQL clause is WHERE id > 3.
This example shows the greater than or equal to lookup:
>>> Post.objects.filter(id__gte=3)
This one shows the less than lookup:
>>> Post.objects.filter(id__lt=3)
This shows the less than or equal to lookup:
>>> Post.objects.filter(id__lte=3)
A case-sensitive or case-insensitive starts-with lookup can be performed with the startswith and istartswith lookup types, respectively:
>>> Post.objects.filter(title__istartswith='who')
A case-sensitive or case-insensitive ends-with lookup can be performed with the endswith and iendswith lookup types, respectively:
>>> Post.objects.filter(title__iendswith='reinhardt?')
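To see how the field__lookup format decomposes, the following pure-Python sketch applies the lookup types covered so far to plain dictionaries. It is only an illustration of the naming convention, not how Django builds SQL; the LOOKUPS table, matches(), and filter_records() are hypothetical names, and related-field lookups are not handled.

```python
import operator

# Each lookup name maps to a test taking (field_value, lookup_value).
LOOKUPS = {
    "exact": operator.eq,
    "iexact": lambda a, b: a.lower() == b.lower(),
    "contains": lambda a, b: b in a,
    "icontains": lambda a, b: b.lower() in a.lower(),
    "in": lambda a, b: a in b,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "startswith": lambda a, b: a.startswith(b),
    "istartswith": lambda a, b: a.lower().startswith(b.lower()),
    "endswith": lambda a, b: a.endswith(b),
    "iendswith": lambda a, b: a.lower().endswith(b.lower()),
}


def matches(record, **lookups):
    """Return True if the record satisfies every field__lookup=value pair."""
    for key, expected in lookups.items():
        field, _, lookup = key.partition("__")
        test = LOOKUPS[lookup or "exact"]  # no suffix means an exact match
        if not test(record[field], expected):
            return False
    return True


def filter_records(records, **lookups):
    return [record for record in records if matches(record, **lookups)]


posts = [
    {"id": 1, "title": "Who was Django Reinhardt?"},
    {"id": 2, "title": "New title"},
    {"id": 3, "title": "One more post"},
]

same_result = filter_records(posts, id=1) == filter_records(posts, id__exact=1)
django_ids = [p["id"] for p in filter_records(posts, title__icontains="django")]
```

Here same_result is True because a missing lookup suffix falls back to exact, and django_ids contains only the id of the first post.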
There are also different lookup types for date lookups. An exact date lookup can be performed as follows:
>>> from datetime import date
>>> Post.objects.filter(publish__date=date(2024, 1, 31))
This shows how to filter a DateField or DateTimeField field by year:
>>> Post.objects.filter(publish__year=2024)
You can also filter by month:
>>> Post.objects.filter(publish__month=1)
And you can filter by day:
>>> Post.objects.filter(publish__day=1)
You can chain additional lookups to date, year, month, and day. For example, here is a lookup for a value greater than a given date:
>>> Post.objects.filter(publish__date__gt=date(2024, 1, 1))
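As a rough mental model, each date lookup compares one component of the stored value. The sketch below applies the same comparisons to a list of datetime values in plain Python; the sample values are invented for the illustration, and time zone handling is deliberately left out.

```python
from datetime import date, datetime

# Hypothetical publish timestamps standing in for the publish column.
publish_times = [
    datetime(2023, 12, 31, 23, 30),
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 31, 18, 45),
    datetime(2024, 2, 1, 8, 15),
]

# publish__date=date(2024, 1, 31): compare only the date part
on_jan_31 = [p for p in publish_times if p.date() == date(2024, 1, 31)]
# publish__year=2024
in_2024 = [p for p in publish_times if p.year == 2024]
# publish__month=1
in_january = [p for p in publish_times if p.month == 1]
# publish__day=1
on_first_day = [p for p in publish_times if p.day == 1]
# publish__date__gt=date(2024, 1, 1): chain gt onto the date part
after_jan_1 = [p for p in publish_times if p.date() > date(2024, 1, 1)]
```

Note that month and day lookups ignore the year, so on_first_day matches the first day of any month.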
To look up related object fields, you also use the two-underscores notation. For example, to retrieve the posts written by the user with the admin username, use the following:
>>> Post.objects.filter(author__username='admin')
You can also chain additional lookups for the related fields. For example, to retrieve posts written by any user with a username that starts with ad, use the following:
>>> Post.objects.filter(author__username__startswith='ad')
You can also filter by multiple fields. For example, the following QuerySet retrieves all posts published in 2024 by the author with the username admin:
>>> Post.objects.filter(publish__year=2024, author__username='admin')
Chaining filters
The result of a filtered QuerySet is another QuerySet object. This allows you to chain QuerySets together. You can build an equivalent QuerySet to the previous one by chaining multiple filters:
>>> Post.objects.filter(publish__year=2024) \
... .filter(author__username='admin')
Excluding objects
You can exclude certain results from your QuerySet by using the exclude() method of the manager. For example, you can retrieve all posts published in 2024 whose titles don’t start with Why:
>>> Post.objects.filter(publish__year=2024) \
... .exclude(title__startswith='Why')
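In SQL terms, combined or chained filters add conditions joined with AND, and exclude() adds a negated condition. The sketch below shows this with Python's built-in sqlite3 module; the blog_post table is simplified (the author is stored as a plain text column rather than a foreign key), and the exact SQL that Django generates will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blog_post ("
    "id INTEGER PRIMARY KEY, title TEXT, publish TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO blog_post (title, publish, author) VALUES (?, ?, ?)",
    [
        ("Who was Django Reinhardt?", "2024-01-01", "admin"),
        ("Why Django?", "2024-02-10", "admin"),
        ("Another post", "2024-03-05", "user2"),
        ("Old post", "2023-06-01", "admin"),
    ],
)

# Condition standing in for publish__year=2024 on ISO-formatted dates.
IN_2024 = "publish BETWEEN '2024-01-01' AND '2024-12-31'"


def titles(where):
    sql = f"SELECT title FROM blog_post WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# filter(publish__year=2024, author__username='admin'), or the chained form
in_2024_by_admin = titles(f"{IN_2024} AND author = 'admin'")

# filter(publish__year=2024).exclude(title__startswith='Why')
# substr() keeps the prefix test case-sensitive (SQLite's LIKE is not).
in_2024_not_why = titles(f"{IN_2024} AND NOT (substr(title, 1, 3) = 'Why')")
```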
Ordering objects
The default order is defined in the ordering option of the model’s Meta. You can override the default ordering using the order_by() method of the manager. For example, you can retrieve all objects ordered by their title, as follows:
>>> Post.objects.order_by('title')
Ascending order is implied. You can indicate descending order with a negative sign prefix, like this:
>>> Post.objects.order_by('-title')
You can order by multiple fields. The following example orders objects by author first and then title:
>>> Post.objects.order_by('author', 'title')
To order randomly, use the string '?', as follows:
>>> Post.objects.order_by('?')
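The ordering options above correspond to SQL ORDER BY clauses. The following sketch runs the equivalent clauses with Python's built-in sqlite3 module on a simplified blog_post table; the sample rows and the titles() helper are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blog_post (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER)"
)
conn.executemany(
    "INSERT INTO blog_post (title, author_id) VALUES (?, ?)",
    [
        ("Who was Django Reinhardt?", 1),
        ("New title", 2),
        ("One more post", 1),
    ],
)


def titles(order_by):
    sql = f"SELECT title FROM blog_post ORDER BY {order_by}"
    return [row[0] for row in conn.execute(sql)]


by_title = titles("title")                        # order_by('title')
by_title_desc = titles("title DESC")              # order_by('-title')
by_author_then_title = titles("author_id, title") # order_by('author', 'title')
shuffled = titles("RANDOM()")                     # order_by('?') on SQLite
```

Random ordering returns the same rows in an unpredictable order, so only the set of titles is stable between runs.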
Limiting QuerySets
You can limit a QuerySet to a certain number of results by using a subset of Python’s array-slicing syntax. For example, the following QuerySet limits the results to 5 objects:
>>> Post.objects.all()[:5]
This translates to a SQL LIMIT 5 clause. Note that negative indexing is not supported.
You can also skip results by providing a start index:
>>> Post.objects.all()[3:6]
The preceding translates to a SQL LIMIT 3 OFFSET 3 clause, to return the fourth through sixth objects.
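The arithmetic behind a slice is worth spelling out: for [start:stop], the offset is start and the limit is stop - start. Here is a minimal sketch using Python's built-in sqlite3 module; limit_offset() and fetch_ids() are hypothetical helpers, and the table holds eight invented rows.

```python
import sqlite3


def limit_offset(start, stop):
    """Build the LIMIT/OFFSET clause equivalent to the slice [start:stop]."""
    start = 0 if start is None else start
    if start < 0 or stop < 0:
        raise ValueError("negative indexing is not supported")
    clause = f"LIMIT {max(stop - start, 0)}"
    if start:
        clause += f" OFFSET {start}"
    return clause


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_post (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO blog_post (title) VALUES (?)",
    [(f"Post {n}",) for n in range(1, 9)],
)


def fetch_ids(start, stop):
    sql = f"SELECT id FROM blog_post ORDER BY id {limit_offset(start, stop)}"
    return [row[0] for row in conn.execute(sql)]


first_five = fetch_ids(None, 5)    # [:5]  -> LIMIT 5
fourth_to_sixth = fetch_ids(3, 6)  # [3:6] -> LIMIT 3 OFFSET 3
```

The result of fetch_ids(3, 6) matches what the Python slice [3:6] returns on the full list of ids.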
To retrieve a single object, you can use an index instead of a slice. For example, use the following to retrieve the first object of posts in random order:
>>> Post.objects.order_by('?')[0]
Counting objects
The count() method counts the total number of objects matching the QuerySet and returns an integer. This method translates to a SELECT COUNT(*) SQL statement. The following example returns the total number of posts with an id lower than 3:
>>> Post.objects.filter(id__lt=3).count()
2
Checking if an object exists
The exists() method allows you to check if a QuerySet contains any results. This method returns True if the QuerySet contains any items and False otherwise. For example, you can check if there are any posts with a title that starts with Why using the following QuerySet:
>>> Post.objects.filter(title__startswith='Why').exists()
False
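Both count() and exists() let the database do the work instead of loading objects into Python. The sketch below shows one way to express them in SQL with Python's built-in sqlite3 module; the exists() helper and the LIMIT 1 query are an illustrative assumption about how such a check can be written, not a transcript of the SQL Django emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_post (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO blog_post (title) VALUES (?)",
    [("Who was Django Reinhardt?",), ("New title",), ("One more post",)],
)

# Equivalent of filter(id__lt=3).count(): a single integer comes back.
count_below_3 = conn.execute(
    "SELECT COUNT(*) FROM blog_post WHERE id < 3"
).fetchone()[0]


def exists(where, params=()):
    """Return True if at least one row matches; stops at the first hit."""
    row = conn.execute(
        f"SELECT 1 FROM blog_post WHERE {where} LIMIT 1", params
    ).fetchone()
    return row is not None


starts_with_why = exists("substr(title, 1, 3) = ?", ("Why",))
starts_with_who = exists("substr(title, 1, 3) = ?", ("Who",))
```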
Deleting objects
If you want to delete an object, you can do it from an object instance using the delete() method, as follows:
>>> post = Post.objects.get(id=1)
>>> post.delete()
Note that deleting objects will also delete any dependent objects that reference them through ForeignKey fields defined with on_delete set to CASCADE.
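The effect of a cascading delete can be observed with a database-level ON DELETE CASCADE constraint, as in the following sketch with Python's built-in sqlite3 module. Django emulates this behavior itself rather than relying on the constraint, so treat the schema below as a simplified analogy; the tables and rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute(
    "CREATE TABLE blog_post ("
    "id INTEGER PRIMARY KEY, title TEXT, "
    "author_id INTEGER NOT NULL REFERENCES auth_user(id) ON DELETE CASCADE)"
)
conn.executemany(
    "INSERT INTO auth_user (username) VALUES (?)", [("admin",), ("user2",)]
)
conn.executemany(
    "INSERT INTO blog_post (title, author_id) VALUES (?, ?)",
    [("Another post", 1), ("One more post", 1), ("Guest post", 2)],
)

# Deleting the referenced user also removes the posts that point to it.
conn.execute("DELETE FROM auth_user WHERE username = 'admin'")

remaining_titles = [
    row[0] for row in conn.execute("SELECT title FROM blog_post ORDER BY id")
]
```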
Complex lookups with Q objects
Field lookups using filter() are joined with a SQL AND operator. For example, filter(field1='foo', field2='bar') will retrieve objects where field1 is foo and field2 is bar. If you need to build more complex queries, such as queries with OR statements, you can use Q objects.
A Q object allows you to encapsulate a collection of field lookups. You can compose statements by combining Q objects with the & (and), | (or), and ^ (xor) operators.
For example, the following code retrieves posts with a title that starts with the string who or why (case-insensitive):
>>> from django.db.models import Q
>>> starts_who = Q(title__istartswith='who')
>>> starts_why = Q(title__istartswith='why')
>>> Post.objects.filter(starts_who | starts_why)
In this case, we use the | operator to build an OR statement.
You can read more about Q objects at https://docs.djangoproject.com/en/5.0/topics/db/queries/#complex-lookups-with-q-objects.
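The way conditions compose through the &, |, and ^ operators can be illustrated with a toy class that overloads the same operators. This is a pure-Python sketch evaluated against dictionaries, not Django's Q implementation; Condition and the two factory functions are hypothetical names.

```python
class Condition:
    """Toy composable predicate; stands in for the idea behind Q objects."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __call__(self, record):
        return bool(self.predicate(record))

    def __and__(self, other):
        return Condition(lambda r: self(r) and other(r))

    def __or__(self, other):
        return Condition(lambda r: self(r) or other(r))

    def __xor__(self, other):
        return Condition(lambda r: self(r) != other(r))


def title_istartswith(prefix):
    return Condition(lambda p: p["title"].lower().startswith(prefix.lower()))


def title_icontains(text):
    return Condition(lambda p: text.lower() in p["title"].lower())


posts = [
    {"title": "Who was Django Reinhardt?"},
    {"title": "Why Django?"},
    {"title": "New title"},
]

starts_who = title_istartswith("who")
starts_why = title_istartswith("why")
mentions_django = title_icontains("django")


def select(condition):
    return [p["title"] for p in posts if condition(p)]


who_or_why = select(starts_who | starts_why)
who_and_django = select(starts_who & mentions_django)
who_xor_django = select(starts_who ^ mentions_django)
```

The xor case keeps only the posts that satisfy exactly one of the two conditions, which here is the post that mentions Django without starting with who.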
When QuerySets are evaluated
Creating a QuerySet doesn’t involve any database activity until it is evaluated. QuerySets will usually return another unevaluated QuerySet. You can concatenate as many filters as you like to a QuerySet, and you will not hit the database until the QuerySet is evaluated. When a QuerySet is evaluated, it translates into a SQL query to the database.
QuerySets are only evaluated in the following cases:
- The first time you iterate over them
- When you slice them with a step, for instance, Post.objects.all()[:10:2]; a plain slice such as Post.objects.all()[:3] usually returns another unevaluated QuerySet
- When you pickle or cache them
- When you call repr() or len() on them
- When you explicitly call list() on them
- When you test them in a statement, such as bool(), or, and, or if
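The following pure-Python sketch illustrates the idea of lazy evaluation described above: building and chaining filters costs nothing, and the data source is only hit when a result is actually needed. FakeDatabase and LazyQuery are hypothetical classes written for this illustration and are far simpler than Django's QuerySet.

```python
class FakeDatabase:
    """In-memory stand-in for a database that counts how often it is hit."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.hits = 0

    def run(self, predicates):
        self.hits += 1
        return [r for r in self.rows if all(p(r) for p in predicates)]


class LazyQuery:
    """Collects filters and defers execution until results are needed."""

    def __init__(self, db, predicates=()):
        self._db = db
        self._predicates = tuple(predicates)
        self._cache = None

    def filter(self, predicate):
        # Returns a new, still unevaluated query.
        return LazyQuery(self._db, self._predicates + (predicate,))

    def _fetch(self):
        if self._cache is None:
            self._cache = self._db.run(self._predicates)
        return self._cache

    def __iter__(self):
        return iter(self._fetch())

    def __len__(self):
        return len(self._fetch())

    def __bool__(self):
        return bool(self._fetch())

    def __repr__(self):
        return f"<LazyQuery {self._fetch()!r}>"


db = FakeDatabase(["Who was Django Reinhardt?", "New title", "Why Django?"])
query = (
    LazyQuery(db)
    .filter(lambda title: "Django" in title)
    .filter(lambda title: title.startswith("Who"))
)
hits_after_building = db.hits  # nothing has been executed yet

titles = list(query)           # forces evaluation
hits_after_list = db.hits

count = len(query)             # served from the cached results
hits_after_len = db.hits
```

In this sketch, the results are cached after the first evaluation, so the later len() call does not hit the data source again.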
More on QuerySets
You will use QuerySets in all the project examples featured in this book. You will learn how to generate aggregates over QuerySets in the Retrieving posts by similarity section of Chapter 3, Extending Your Blog Application.
You will learn how to optimize QuerySets in the Optimizing QuerySets that involve related objects section in Chapter 7, Tracking User Actions.
The QuerySet API reference is located at https://docs.djangoproject.com/en/5.0/ref/models/querysets/.
You can read more about making queries with the Django ORM at https://docs.djangoproject.com/en/5.0/topics/db/queries/.
Creating model managers
The default manager for every model is the objects manager. This manager retrieves all the objects in the database. However, we can define custom managers for models.
Let’s create a custom manager to retrieve all posts that have a PUBLISHED status.
There are two ways to add or customize managers for your models: you can add extra manager methods to an existing manager or create a new manager by modifying the initial QuerySet that the manager returns. The first method provides you with a QuerySet notation like Post.objects.my_manager(), and the latter provides you with a QuerySet notation like Post.my_manager.all().
We will choose the second method to implement a manager that will allow us to retrieve posts using the notation Post.published.all().
Edit the models.py file of your blog application to add the custom manager, as follows. The new code is the PublishedManager class and the objects and published attributes of the Post model:
class PublishedManager(models.Manager):
    def get_queryset(self):
        return (
            super().get_queryset().filter(status=Post.Status.PUBLISHED)
        )


class Post(models.Model):
    # model fields
    # ...
    objects = models.Manager()  # The default manager.
    published = PublishedManager()  # Our custom manager.

    class Meta:
        ordering = ['-publish']
        indexes = [
            models.Index(fields=['-publish']),
        ]

    def __str__(self):
        return self.title
The first manager declared in a model becomes the default manager. You can use the Meta attribute default_manager_name to specify a different default manager. If no manager is defined in the model, Django automatically creates the objects default manager for it. If you declare any managers for your model but you want to keep the objects manager as well, you have to add it explicitly to your model. In the preceding code, we have added the default objects manager and the published custom manager to the Post model.
The get_queryset() method of a manager returns the QuerySet that will be executed. We have overridden this method to build a custom QuerySet that filters posts by their status and returns only the posts with the PUBLISHED status.
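The role of get_queryset() as the starting point for every other manager method can be shown with a toy version that works on an in-memory list. This is a pure-Python sketch, not Django code: the Manager base class, the dictionary rows, and the "published" and "draft" status strings are all assumptions made for the illustration.

```python
class Manager:
    """Toy manager over an in-memory list; stands in for models.Manager."""

    def __init__(self, rows):
        self._rows = rows

    def get_queryset(self):
        # Every other method starts from whatever this returns.
        return list(self._rows)

    def all(self):
        return self.get_queryset()

    def filter(self, predicate):
        return [row for row in self.get_queryset() if predicate(row)]


class PublishedManager(Manager):
    def get_queryset(self):
        # Narrow the initial result set once; all() and filter() inherit it.
        return [r for r in super().get_queryset() if r["status"] == "published"]


POSTS = [
    {"title": "Who was Django Reinhardt?", "status": "published"},
    {"title": "Who wrote this draft?", "status": "draft"},
    {"title": "New title", "status": "published"},
]

objects = Manager(POSTS)
published = PublishedManager(POSTS)

all_titles = [p["title"] for p in objects.all()]
published_titles = [p["title"] for p in published.all()]
published_who = [
    p["title"]
    for p in published.filter(lambda post: post["title"].startswith("Who"))
]
```

Because only get_queryset() is overridden, the draft post never reaches filter() on the published manager, even though its title also starts with Who.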
We have now defined a custom manager for the Post model. Let’s test it!
Open the Python shell again with the following command in the shell prompt:
python manage.py shell
Now, you can import the Post model and retrieve all published posts whose title starts with Who, executing the following QuerySet:
>>> from blog.models import Post
>>> Post.published.filter(title__startswith='Who')
To obtain results for this QuerySet, make sure to set the status field to PUBLISHED in the Post object whose title starts with the string Who.