Some Notes on Moving the Zone to Django
Fredrik Lundh | August 2007
Updated 2007-08-20: Added sections on caching and resource “minification”.
The effbot.org “zone” is a constantly growing collection of documents, most of them concerning various aspects of Python and related technologies. The zone software is also used for my PIL and Tkinter documentation, the online edition of my Python Standard Library book, and a couple of other document collections.
All in all, the zones at effbot.org currently contain around 2,000 documents, plus some 500 user comments.
Until now, the zone has been served as static HTML, generated and maintained using an increasingly disorganized collection of CGI scripts and off-site tools. Given that we moved pythonware.com to Django late last year, it’s about time I did the same to effbot.org.
Design
My original plan was to use the old effbot.org templates pretty much as they were, but all recent talk about CSS frameworks, and how they’d ruin the field of web design by leaving it open for people who just cannot be bothered to learn everything there is to know about cross-platform CSS inanities, made me curious. I grabbed a copy of the Yahoo! User Interface Library, and went to work. I set up the following requirements:
- No radical redesign; the new site should look similar to the old one.
- Identical look for a majority of my visitors — in other words, for IE and Firefox.
- Scalable layout in both IE and Firefox. Most importantly, increasing the font size should work well (for people with reduced eyesight or high-DPI displays).
- Support for 80-column wide code samples.
- The site should be fully usable with CSS disabled.
Implementing this on YUI was surprisingly easy. The current design uses a 950px wide base layout, with a 180px navigation column to the left. The remaining space is split into two parts, with around 550px effective space (3/4) for the main column, and 180px (1/4) for a sidebar column to the right. Document blocks can be placed in the sidebar or the main column, or they can cover both columns. A combination of class names and render heuristics is used to handle block placement. Some source-code examples:
    <p class="sidebar">This text goes into the sidebar.</p>
    <p>This goes into the main column.</p>
    <pre class="wide">
    This wide code example extends into the sidebar.
    </pre>
The “wide” class is automatically added to wide PRE sections (currently, the limit is 55 characters). A wide column is wide enough to hold just over 80 columns of preformatted text.
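A minimal sketch of how such a heuristic could look when applied to an element tree. The function name is made up for illustration, and treating "longer than 55 characters" as the trigger is an assumption about how the limit is applied; the actual renderer may work differently.

```python
import xml.etree.ElementTree as ET

WIDE_LIMIT = 55  # characters; assumed to mean "longer than 55 is wide"


def mark_wide_pre(root, limit=WIDE_LIMIT):
    """Add class="wide" to every PRE element containing a line longer than limit."""
    for pre in root.iter("pre"):
        text = "".join(pre.itertext())
        # Trailing whitespace does not make a line visually wider, so ignore it.
        longest = max((len(line.rstrip()) for line in text.splitlines()), default=0)
        if longest > limit:
            classes = pre.get("class", "").split()
            if "wide" not in classes:
                classes.append("wide")
            pre.set("class", " ".join(classes))
    return root
```

Existing class names are preserved, so a block that has been marked by hand is left alone.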
Aiming for Typographic Perfection
I experimented a little with Chad Miller’s SmartyPants implementation for Python in the final rendering step, but it was horribly slow — at least compared to the rest of the rendering chain. I ended up applying this during conversion to the intermediate format instead.
(I haven’t looked at the code, but it seems to me as if it should be possible to apply the SmartyPants algorithms to the existing Element tree, rather than having to serialize the document and then run it through SmartyPants’ home-grown HTML parser).
I’m also doing some microtuning in the rendering stage, including inserting hard spaces in titles to keep the last few words together, trimming off trailing whitespace in PRE CODE sections, etc.
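The two tweaks mentioned here are simple string operations. The following is only a sketch under assumptions: the function names are hypothetical, and binding the last two words is a guess at what "the last few words" means in practice.

```python
NBSP = "\u00a0"  # no-break space


def keep_last_words_together(title, count=2):
    """Join the last `count` words of a title with no-break spaces."""
    words = title.split()
    head, tail = words[:-count], words[-count:]
    if not head:
        # Short title: every word is part of the protected tail.
        return NBSP.join(tail)
    return " ".join(head) + " " + NBSP.join(tail)


def strip_trailing_whitespace(code):
    """Remove trailing whitespace from each line, and trailing blank space at the end."""
    return "\n".join(line.rstrip() for line in code.splitlines()).rstrip()
```

With these definitions, a browser can still wrap a long title, but never between the final two words.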
Performance
The effbot.org site isn’t exactly a high-traffic site, but since I’m interested in performance and scalability issues, I set up some basic requirements:
- Efficient data transfer to client (support for compression, validation/conditional requests, etc).
- Efficient caching
- Efficient dynamic rendering on cache misses
Django provides middleware for compression (GZipMiddleware), validation (ConditionalGetMiddleware), and caching (CacheMiddleware). The only problem was to apply them in the right order (see below for more on this).
Rendering required a bit more code. Zone sources are provided in a number of formats, but they’re primarily written in an HTML subset (just the <body> contents) and in Markdown with some Infogami-style extensions. Metadata is extracted from the source documents, which makes it easy to author zone documents in pretty much any text editor.
To speed up rendering, the new system converts from the source to an intermediate XML format when documents are added to the database (via a save override on the relevant model). The intermediate format is then turned into proper HTML and pushed through Django templates on the fly.
The intermediate format is basically an XHTML document, with the following extensions:
- Support for dynamic titles (fetched from the target on demand)
- Support for inline images.
- Support for local menus (based on headings on page).
- Support for intelligent links.
- Support for program code colorization.
Inline images are stored as data URI:s in the intermediate format (and usually also in the source documents), but since not all browsers support this format, the renderer replaces the data URI:s with HTTP pointers to an image cache directory. To distinguish between images, their MD5 hash is used.
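A sketch of the data URI replacement step, assuming base64-encoded payloads. The function name, the URL prefix, and the extension table are hypothetical; only the use of an MD5 hash to name the cached file comes from the description above.

```python
import base64
import hashlib
import os

# Illustrative subset; unknown types are stored without an extension.
EXTENSIONS = {"image/png": ".png", "image/gif": ".gif", "image/jpeg": ".jpg"}


def cache_data_uri(uri, cache_dir, url_prefix="/images/cache/"):
    """Store the image embedded in a data URI and return an HTTP path to it."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URI")
    mime_type = header[len("data:"):-len(";base64")]
    data = base64.b64decode(payload)
    # The MD5 hash of the image bytes identifies the image.
    name = hashlib.md5(data).hexdigest() + EXTENSIONS.get(mime_type, "")
    path = os.path.join(cache_dir, name)
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(data)
    return url_prefix + name
```

Because the file name depends only on the image bytes, the same image used in several documents is written to the cache once.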
Intelligent links are links that point to “target subjects”, rather than specific pages. For example, a link can point to the “subprocess” module, or even to the “insert” method of the “Text” class in the “tkinter” module. The renderer maps these to suitable on-site locations on the fly.
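One way such a mapping could work is a lookup that falls back from the most specific subject to its parents. This is an illustrative sketch only: the dotted subject notation, the function name, and the fallback rule are all assumptions, and the paths in the usage note are invented.

```python
def resolve_subject(subject, targets):
    """Return the location for the most specific known prefix of a dotted subject.

    `targets` maps subjects such as "tkinter.Text" to on-site locations.
    Returns None if no prefix of the subject is known.
    """
    parts = subject.split(".")
    while parts:
        key = ".".join(parts)
        if key in targets:
            return targets[key]
        parts.pop()  # drop the most specific component and retry
    return None
```

Given a table that knows about "tkinter.Text" but not its individual methods, resolve_subject("tkinter.Text.insert", table) returns the page for the Text class instead of failing.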
The current renderer can produce the equivalent of 2-5 A4 pages per millisecond, which means that most documents on the site render in a few milliseconds at most on a lightly loaded server. (FIXME: This is no longer true; for example, the YUI tree restructuring and code colorization are currently done during the final rendering step, and that processing is somewhat costly. I’ll move that to the preprocessing stage when I find the time.)
Some Performance Observations
Middleware Order
As noted above, getting the middleware order right can be tricky. Especially the ConditionalGetMiddleware component kept messing things up for me. With that middleware in the wrong place, the server kept locking up at times. After a few failed attempts, I finally came up with the following order:
    MIDDLEWARE_CLASSES = (
        'django.middleware.common.CommonMiddleware',
        'django.contrib.sessions.middleware.SessionMiddleware',
        'django.contrib.auth.middleware.AuthenticationMiddleware',
        'django.middleware.cache.CacheMiddleware',
        'django.middleware.http.ConditionalGetMiddleware',
        'django.middleware.gzip.GZipMiddleware',
        # custom middleware follows...
    )
(This may be documented somewhere in the Django documentation, but I sure couldn’t find that place.)
I’m also using some non-standard middleware components, including my own multihost middleware, to co-host the development sandbox, online.effbot.org, and a few other sites on a single Django instance.
Accessing Large Database Records
Both the source document and the intermediate XML version are stored as TextField:s in a big un-normalized table, which also holds associated data such as titles, publication date, etc. This design might be simple and practical, and un-normalized tables have plenty of performance advantages when the data is relatively static, but a drawback is that it’s rather expensive to pull out an object from this table via Django’s ORM; even if you’re only looking for a couple of metadata attributes, Django will fetch the entire row.
I was about to change the data model when I remembered the values method. This method tells the ORM what fields you’re interested in, and returns only those, in an ordinary dictionary.
    query = SomeModel.objects.filter(key=value)
    for obj in query:
        do_something(obj.title) # slow
Becomes:
    query = SomeModel.objects.filter(key=value)
    for obj in query.values("title"):
        do_something(obj["title"]) # fast
You obviously don’t get all the functionality of your model implementation when you use values, but you get the data you’re after without any extra overhead. With query sets containing hundreds of documents, some operations took well over a second without values, and less than ten milliseconds with it.
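The effect is not specific to Django: values("title") restricts the SELECT to the named column, so the large text fields never leave the database. The same contrast can be reproduced with plain SQL. The sketch below uses the standard sqlite3 module with a made-up table that loosely mimics the layout described above; it illustrates the principle and is not the site's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT, source TEXT, xml TEXT)"
)
big = "x" * 10000  # stand-in for a large source or intermediate document
conn.executemany(
    "INSERT INTO document (title, source, xml) VALUES (?, ?, ?)",
    [("Document %d" % i, big, big) for i in range(100)],
)


def titles_full_rows(conn):
    # Fetches every column, including the two large text fields.
    return [row["title"] for row in conn.execute("SELECT * FROM document ORDER BY id")]


def titles_only(conn):
    # Fetches just the column that is actually needed.
    return [row["title"] for row in conn.execute("SELECT title FROM document ORDER BY id")]
```

Both functions return the same list of titles; the second simply moves far less data per row.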
Django Caching
I’ve played with both the file cache and memcached, and seem to get roughly the same performance with both backends (which is a bit surprising).
I’m currently using memcached; after all, it “should” be faster, and is definitely not slower. I’m a little concerned about memory usage, though. Given that I’m running this on a shared account, memory is somewhat tight; there’s a balance between how much memory you can spend on caching, and how much you can spend on Apache processes (see below).
Some useful resources:
- Gomez Instant Site Test lets you check performance from a few remote sites (or from thousands of them, if you’re prepared to pay for the service).
- The Firebug extension to Firefox can be used to check performance from your own systems, and a lot more; that and the Web Developer toolbar have helped me keep my sanity during deployment.
- The YSlow add-on can be used for additional performance analysis, including a simple “performance grade” analysis tool (effbot.org currently gets an “A”, python.org an “F” ;-)
Media Caching
I had some difficulties getting proper caching for static media resources (which are served by a separate Apache instance). As it turned out, the copy of Firefox that I used for testing had caching turned off under about:config (d’oh!), but Apache didn’t produce proper headers either. Fixing the latter was almost as easy as fixing the former (and a lot easier to discover); adding the following .htaccess file to the media root directory did the trick:
    <IfModule mod_expires.c>
    ExpiresActive on
    ExpiresDefault "access plus 1 days"
    </IfModule>
With this and ConditionalGetMiddleware in place, the server now generates proper expiry- and cache-control headers, and validation requests work as expected.
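For reference, mod_expires sets both an Expires header and a Cache-Control max-age directive. A rough Python equivalent of what an "access plus 1 days" rule produces for a request made at a given time is sketched below; the function name is made up for illustration.

```python
from email.utils import formatdate

ONE_DAY = 24 * 60 * 60  # seconds


def expiry_headers(access_time, max_age=ONE_DAY):
    """Return caching headers for a response served at `access_time` (epoch seconds)."""
    return {
        "Cache-Control": "max-age=%d" % max_age,
        # HTTP dates are always expressed in GMT.
        "Expires": formatdate(access_time + max_age, usegmt=True),
    }
```

A client that honors these headers will not ask for the resource again until a day has passed, after which it can fall back to a conditional request.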
(By the way, Mark Nottingham’s Cacheability Engine is a very useful tool for analyzing cache behaviour).
Server Crowding
This kind of careful up-front design doesn’t help if you’re sharing your server with a bunch of Zope applications, though ;-)
The day before the new site went live, both the staging server and our other sites all slowed down to a crawl, and even started to cause browser timeouts. top told me that a bunch of Zope instances and the mysql server kept hogging the CPU:s (with sustained loads in the 80-110% range), and the overall server load was often 10 or above. And when this happened, it took seconds to fetch a new page from my server. Tens of seconds, at times.
My hosting provider set up a new account on a less crowded server, but also spent some time tuning the current server; most importantly, they:
- Made sure that the front-end Apache server had enough processes available to handle everyone on the machine.
- Increased the ServerLimit setting for my private Apache instance (the one that’s running Django).
When I’m writing this, the old server is about as quick as a server can be. I haven’t ruled out a move, but it’s not very likely. We’ll see.
Resource Minification
Reducing the number of external resources is a good way to eliminate some overhead. Since every server request has a cost, it’s a good idea to eliminate requests if you can.
The original design used four CSS files, and a few small JavaScript snippets for some unobtrusive enhancements (see if you can spot them ;-). I’ve now combined the screen CSS files, and “minified” all resources.
For JavaScript minification, I use Baruch Even’s Python port of JSMin (available from that page). For CSS, I use a simple 10-minute hack, which does the following:
- Gets rid of comments
- Gets rid of whitespace, except between identifiers (this is way too aggressive for arbitrary CSS, but happens to work fine on my files — after some tweaks, at least)
- Replaces color names with hex codes and replaces 6-digit hex codes with 3-digit codes where possible.
- Combines my files with Yahoo’s CSS files (see above).
(If anyone is aware of a more robust Python solution for CSS minification, let me know.)
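To make the first three steps concrete, here is a minimal regular-expression sketch. It is not the hack described above, just one possible reading of it: the function name is made up, the color table is a tiny illustrative subset, and the whitespace handling is aggressive enough to break selectors that rely on a space before a colon (such as "div :first-child").

```python
import re

# Tiny illustrative subset of the CSS color keywords.
COLOR_NAMES = {"white": "#fff", "black": "#000", "red": "#f00"}

_NAME_RE = re.compile(r"(?<=[: ])(?:%s)(?=[;} ])" % "|".join(COLOR_NAMES), re.I)
_HEX_RE = re.compile(r"#([0-9a-f])\1([0-9a-f])\2([0-9a-f])\3(?![0-9a-f])", re.I)


def minify_css(css):
    """Strip comments and whitespace, and shorten color values."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)   # remove comments
    css = re.sub(r"\s+", " ", css)                    # collapse whitespace runs
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)      # no space around punctuation
    css = css.replace(";}", "}")                      # last semicolon is optional
    css = _NAME_RE.sub(lambda m: COLOR_NAMES[m.group(0).lower()], css)
    css = _HEX_RE.sub(r"#\1\2\3", css)                # #336699 -> #369
    return css.strip()
```

Spaces between identifiers survive because only whitespace next to punctuation is removed, so a value like "1px solid" keeps its separators.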
The result is a (currently) 7800-byte CSS file (2200 bytes when gzipped), and a 960-byte JavaScript file (480 bytes when gzipped).