So I finally took a look at the queries I’m generating with Django. Wow, my queries suck. Here’s an example of the suckiness:
SELECT `django_site`.`id`,`django_site`.`domain`,`django_site`.`name` FROM `django_site` WHERE (`django_site`.`id` = 1)
SELECT `django_site`.`id`,`django_site`.`domain`,`django_site`.`name` FROM `django_site` WHERE (`django_site`.`id` = 1)
SELECT `django_site`.`id`,`django_site`.`domain`,`django_site`.`name` FROM `django_site` WHERE (`django_site`.`id` = 1)
Three times??
The Django Site module is a funny beast. It consists of a database table with columns id, name, and domain. Then in the settings file, there is a SITE_ID which is used to get the current site from the database.
Umm… Why not just put the site name and domain in the settings file? Why do the database query? Especially since each site already has a unique SITE_ID in the settings file? Is there something I’m missing here?
Brad rewrote this post as:
So there be this bitch of a module for the DJANGO that does site shit but it’s always be like querying the database for the same shits all overs the time over and over again and it be like doing nthing useful so I removed it and then it all did the same thing but without the site module so I recommend to you, by lloyalest readers, that you not to be using it plz k thx.
ThE DNDS.
16 Comments
Brad Fitzpatrick
Can’t believe you actually posted my über-intelligent 15 second version.
Jeff Croft
I think I can answer this one, but if someone smarter wants to come along and confirm or correct me, that’s cool too.
Django was originally built as part of Ellington (a commercial CMS that I worked on for a year and a half). Ellington (and therefore Django) supports multiple sites being served from one database. The use case that inspired this is as follows:
A newspaper company (The Lawrence Journal-World, where I worked) has several media properties. For a partial example, they have a news site (LJWorld.com), an entertainment site (Lawrence.com), and a sports site (KUSports.com). Often, these sites will share the same content. A story may be on both LJWorld.com and Lawrence.com, or on KUSports.com and LJWorld.com, for example.
The sites module facilitates this. By a model (like, say, Story) having a ManyToMany relationship with sites, one story can belong to multiple sites. And, it’s easy for editors and reporters (i.e. non-technical folks) to assign them to the appropriate sites from the admin area.
That’s the long story. The short story is that if you only have one site (like, say, pownce.com), you don’t need (or probably want) to use the sites app.
Jacob Kaplan-Moss
Yup, Jeff pretty much nailed it.
As for why it’s a table: if it wasn’t a table then objects couldn’t (properly) refer back to it. Referential integrity’s extremely important to data cleanliness; it would violate any number of best practices to have a site ID that didn’t refer back to a table.
Still, I’m curious about what situations you’re running into that generate so many queries against the sites table; that doesn’t happen anywhere in my code. I have to wonder if there might be a bug somewhere that causes all those extra queries; think you might be able to share the offending code so we can take a look?
Leah Culver
Jeff - thanks for the explanation. I can see how it was useful in that case, but now it seems a bit particular to content publishing applications. I hope that as Pownce grows, we might contribute our own stuff to Django (perhaps with a different slant).
Leah Culver
Jacob - I was using it in a template context processor so my designer wouldn’t hard-code the site name and domain. In fact, Pownce went by a code name during most of the development. Anyways, as soon as I pointed the context processor code to a static string in the settings file, the queries went away. When are the template context processors evaluated? Is it per-page and per-usage?
Mark Jaquith
I also like how it’s SELECTing django_site.id when it already has that information.
robert
you know those posts you wrote about computer science classes? i wish my school would have merged programming and sql courses, now i know how to do sql, and i know how to program but i don’t know how to make them work together well.
Arne
I tried to remove the sites module from my Django project too, but then the flatpages fallback middleware complained, because there is a dependency on the sites module …
Uros Trebec
Leah - I had the same problem and found a fairly simple solution…
I’ve added the two variables to the settings.py file
i.e.:
SITE_NAME = 'Something'
SITE_URL = 'http://something.com'
and then import the settings object into context_processors.py with:
from django.conf import settings
and read the values there as settings.SITE_NAME and settings.SITE_URL.
This works really well for one-site projects (the majority of my sites). While I do understand the usefulness of contrib.sites for larger multi-sites, I think there’s too much overhead for one-site projects.
BTW, great job with Pownce! I just got my invite yesterday and have yet to get my friends to join, but I like it so far.
Jared
You know, you wouldn’t be having this problem if you’d used Ruby on Rails. Sorry, I had to. Blame Canada? P.S. Hello from a fellow Minnesotan.
Jacob Kaplan-Moss
@Leah: ah, that makes sense — context processors are executed each time a context (er, a RequestContext, that is) is initialized; depending on how things are written multiple contexts could be created on each request.
Sounds like your fix was easy enough; it should also be trivial to modify the context processor to cache its data (it’s a good practice to cache things like context processors if possible since it can be hard to predict how often they’ll get called).
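A rough sketch of that caching idea, in plain Python so it runs without a Django project. Everything named here (fetch_site_from_db, current_site, site_context, the QUERIES log and the example.com values) is a hypothetical stand-in; in a real project the lookup would be the django_site query from the post, and site_context would be registered as a template context processor.

```python
from functools import lru_cache

QUERIES = []  # log of simulated database hits, for illustration only


def fetch_site_from_db(site_id):
    """Hypothetical stand-in for the SELECT against django_site."""
    QUERIES.append(site_id)
    return {"id": site_id, "domain": "example.com", "name": "Example"}


@lru_cache(maxsize=None)
def current_site(site_id):
    # The first call per site_id hits the "database";
    # later calls return the cached result.
    return fetch_site_from_db(site_id)


def site_context(request, site_id=1):
    """Shaped like a context processor: takes a request, returns a dict."""
    site = current_site(site_id)
    return {"site_name": site["name"], "site_domain": site["domain"]}
```

Calling site_context several times records a single lookup. The trade-off is that the cached value lives as long as the process does, so an edited site row would not show up until a restart or a call to current_site.cache_clear().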
@Mark: you’re probably just trolling, but I’ll pretend you asked a question: when you do an ORM lookup in Django it selects all columns by default (including whatever columns you used in the lookup). That’s because the time it takes for a database to return all columns is almost identical to the time it takes to return a subset of columns. In fact, in some edge cases fewer columns can actually be slower.
Oh, and Django enumerates the columns explicitly because SELECT * is evil.
Bob Ippolito
I’m not sure what the request model looks like for Django, but we cache read-only queries like this in the request object so that it only gets fetched once per request. There’s also a tier of memcached between our webapps and the database, but making sure cached stuff doesn’t go stale is tricky.
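A minimal sketch of a per-request cache along those lines, in plain Python. FakeRequest, load_site, get_site, the _site_cache attribute and the LOOKUPS log are all made up for illustration and are not Django or any other framework's API; the point is only that the value is stored on the request object and so is discarded when the request ends.

```python
LOOKUPS = []  # log of simulated database reads, for illustration only


class FakeRequest:
    """Hypothetical stand-in for a web framework's request object."""


def load_site(site_id):
    """Hypothetical stand-in for a read-only database query."""
    LOOKUPS.append(site_id)
    return {"id": site_id, "domain": "example.com", "name": "Example"}


def get_site(request, site_id=1):
    # Keep a small dict on the request itself, created on first use.
    cache = getattr(request, "_site_cache", None)
    if cache is None:
        cache = request._site_cache = {}
    if site_id not in cache:
        cache[site_id] = load_site(site_id)
    return cache[site_id]
```

Repeated calls with the same request reuse the stored value, while a new request triggers a fresh read, which avoids the staleness problem of a longer-lived cache at the cost of one query per request.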
Leah Culver
Bob and Jacob - thanks for the tip about caching the context processors! I’ll do that.
Matt
Django ate my baby!
(I had to say it)
Johnson Rice
Doubtfully a good time…
I’m just thinking the “Public” “Private” and “# Recipients” labels on each message should be subtly color coded so they stand out slightly more.
At least private messages should be… I’d like to know a little more clearly when a message is directed AT me specifically, or from me to someone specifically.
Although, that suggestion is probably ill timed and out of place here… Maybe.. I dunno. *shrug*
Duc Nguyen
Do you use any special IDE to write your Python code? It seems outside of pydev or wingide, there aren’t really any good Python IDEs. Pydev and wingide aren’t that good anyways, so I’ve gone back to using good old emacs to do the job.