Table of Contents
- Optimize Django Queries with this Simple Trick
- The Problem with Django QuerySet with Related Fields
- How to Optimize Django QuerySet with Related Fields
- Wrap Off
This simple trick can help you optimize your database queries using the Django ORM.
This trick relies on the fact that Django QuerySets are lazy: creating or filtering a QuerySet does not touch the database until the results are actually needed. Used properly, this laziness is one of the most useful features of the ORM.
The Problem with Django QuerySet with Related Fields
For example, consider a model named Post that holds blog posts, and suppose the following lines of code are executed:
posts = Post.objects.all()
published_posts = posts.filter(status='PUBLISHED')
At this point, the Django ORM has not touched the database yet, meaning no query has been executed.
The database is hit only when the QuerySet is evaluated, most likely when we start to iterate through it in a view or in a template, like in the following example:
<table>
  <tbody>
    {% for post in posts %}
      <tr>
        <td>{{ post.id }}</td>
        <td>{{ post.title }}</td>
        <td>{{ post.status }}</td>
      </tr>
    {% endfor %}
  </tbody>
</table>
In the example above, only one database query will be executed.
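The laziness described above can be imitated outside Django. The following is only a toy sketch, not Django's implementation: LazyQuery, its executions counter, and the sample rows are hypothetical names invented for illustration, and unlike a real QuerySet it does not cache its results.

```python
class LazyQuery:
    """Toy stand-in for a lazy query (hypothetical, not Django's QuerySet)."""

    def __init__(self, rows, conditions=None):
        self._rows = rows                    # pretend this list is a database table
        self._conditions = dict(conditions or {})
        self.executions = 0                  # times the "database" was actually read

    def filter(self, **conditions):
        # Only record the conditions and return a new lazy object.
        return LazyQuery(self._rows, {**self._conditions, **conditions})

    def __iter__(self):
        # The work happens here, when something iterates over the query.
        self.executions += 1
        matching = [
            row for row in self._rows
            if all(row.get(key) == value for key, value in self._conditions.items())
        ]
        return iter(matching)


rows = [
    {"id": 1, "title": "First post", "status": "PUBLISHED"},
    {"id": 2, "title": "Second post", "status": "DRAFT"},
    {"id": 3, "title": "Third post", "status": "PUBLISHED"},
]

posts = LazyQuery(rows)
published_posts = posts.filter(status="PUBLISHED")
print(published_posts.executions)  # 0, nothing has been read yet

titles = [post["title"] for post in published_posts]
print(published_posts.executions)  # 1, iteration triggered the read
print(titles)                      # ['First post', 'Third post']
```

Chaining filter only records conditions; the data is read once, at iteration time, which mirrors why the template loop above costs a single query.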
However, the problem starts to appear when the model relates to other models through ForeignKey, OneToOneField, or ManyToManyField.
Let's say our Post model has a ForeignKey to a Category model:
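from django.db import models

class Category(models.Model):
    title = models.CharField(max_length=60)

class Post(models.Model):
    title = models.CharField(max_length=60)
    status = models.CharField(max_length=10)
    # on_delete is required since Django 2.0; CASCADE is assumed here
    category = models.ForeignKey(Category, on_delete=models.CASCADE)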
Now suppose you iterate through the published_posts QuerySet in the template as we did in the previous example, but this time also display the category name. The Django ORM will execute an extra query for each row in the published_posts QuerySet:
<table>
  <tbody>
    {% for post in published_posts %}
      <tr>
        <td>{{ post.id }}</td>
        <td>{{ post.title }}</td>
        <td>{{ post.status }}</td>
        <td>{{ post.category.title }}</td>
      </tr>
    {% endfor %}
  </tbody>
</table>
This pattern is inefficient and is commonly known as the N+1 query problem.
If the published_posts QuerySet has 100 rows, this simple for loop will execute 101 queries: one query to retrieve the post objects, and an additional query for each post object to retrieve its category.
A good way to keep track of the number of executed queries is to use the Django Debug Toolbar.
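The 1 + 100 arithmetic can also be reproduced without Django. The sketch below uses Python's built-in sqlite3 module with hand-written SQL as a stand-in for what the ORM does. The table names, the run helper, and the query_count counter are assumptions made for illustration, not Django internals.

```python
import sqlite3

# In-memory database with two tables that loosely mirror Category and Post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT,
        status TEXT,
        category_id INTEGER REFERENCES category(id)
    );
""")
conn.execute("INSERT INTO category (id, title) VALUES (1, 'Django')")
conn.executemany(
    "INSERT INTO post (title, status, category_id) VALUES (?, 'PUBLISHED', 1)",
    [(f"Post {i}",) for i in range(100)],
)

query_count = 0


def run(sql, params=()):
    """Execute one SELECT and count it, standing in for one ORM query."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# One query to load the posts...
posts = run(
    "SELECT id, title, status, category_id FROM post WHERE status = ?",
    ("PUBLISHED",),
)

# ...then one more query per post to look up its category.
for post_id, title, status, category_id in posts:
    category_title = run(
        "SELECT title FROM category WHERE id = ?", (category_id,)
    )[0][0]

print(query_count)  # 101
```

The loop body looks harmless, yet it is where 100 of the 101 queries come from, just like the post.category access inside the template loop.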
How to Optimize Django QuerySet with Related Fields
This undesired effect can be mitigated by using the select_related method, which retrieves all the required information in a single database query.
So, instead of filtering the published posts as in the first example, you may want to do it like this:
posts = Post.objects.all()
published_posts = posts.select_related('category').filter(status='PUBLISHED')
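By doing this, the Django ORM fetches the category data for each post in the same query through a SQL join, so no extra queries are needed when the template accesses the category of each post. Note that select_related follows ForeignKey and OneToOneField relations; for ManyToManyField relations, the QuerySet API provides prefetch_related instead.
For pages that list many rows, this small change reduces the query count from one per row to a single query.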
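To give a feel for what a single joined query looks like, here is a minimal sketch using Python's built-in sqlite3 module. The SQL is hand-written and only approximates what the ORM would generate for a select_related call. The table names and sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory database with two tables that loosely mirror Category and Post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT,
        status TEXT,
        category_id INTEGER REFERENCES category(id)
    );
""")
conn.executemany(
    "INSERT INTO category (id, title) VALUES (?, ?)",
    [(1, "Django"), (2, "Python")],
)
conn.executemany(
    "INSERT INTO post (title, status, category_id) VALUES (?, ?, ?)",
    [
        ("Lazy QuerySets", "PUBLISHED", 1),
        ("Draft notes", "DRAFT", 1),
        ("Generators", "PUBLISHED", 2),
    ],
)

# A single query returns each published post together with its category title.
joined_rows = conn.execute("""
    SELECT post.id, post.title, post.status, category.title
    FROM post
    INNER JOIN category ON post.category_id = category.id
    WHERE post.status = 'PUBLISHED'
    ORDER BY post.id
""").fetchall()

for row in joined_rows:
    print(row)
# (1, 'Lazy QuerySets', 'PUBLISHED', 'Django')
# (3, 'Generators', 'PUBLISHED', 'Python')
```

Because the category columns arrive with each post row, nothing further needs to be fetched while looping, regardless of how many posts are listed.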
You can also learn more about the Django QuerySet API by reading the official Django Documentation.
Wrap Off
Django QuerySets can hurt the performance of your application if they are not properly optimized, especially when the model relates to other models through ForeignKey, OneToOneField, or ManyToManyField.
Using select_related fetches the category data for each post in the same query, which means there is no need to run extra queries in this case.