Making Django and Rails Play Nice, Part 3: Caching

Friday, March 16, 2012, at 05:04AM

By Eric Richardson

This is the third section of a multi-part look at some of the issues that we faced in developing KPCC's new beta website, which is written in Ruby on Rails but runs side-by-side with the existing Django site. Part one looked at mapping generic relationships with MySQL views, while part two talked about adapting sessions to allow them to wander from one site to the other.

There are plenty of ways to do caching. Page caches, partial caches, timestamp-based caches, versioned caches... they've all got a valid place in the caching arsenal. In Rails, the convention is to build sweepers that get fired when model objects are saved, giving a spot to handle expiration or rebuilding of caches. But what do you do when Django needs to be the one to expire your Rails caches?

Django's default cache is built around memcache and expiration times, with a version number prefix that can be incremented to change keys and start over when content changes.
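To make the version prefix concrete, here is a minimal sketch in Ruby (used so it matches the other snippets in this post). The key layout follows Django's default "prefix:version:key" format; the prefix, the key name, and the Hash standing in for memcached are all made up for illustration.

```ruby
# Sketch of version-prefixed cache keys, modelled on Django's default
# "prefix:version:key" layout. The prefix and key names are hypothetical.
def versioned_key(prefix, version, key)
  "#{prefix}:#{version}:#{key}"
end

store = {} # in-memory stand-in for memcached

version = 1
store[versioned_key("kpcc", version, "homepage")] = "<old homepage>"

# Content changed: bump the version instead of deleting anything.
version += 1

# Readers now ask for a key that has never been written, so they miss
# and rebuild. The old entry is still there but is never read again.
fresh = store[versioned_key("kpcc", version, "homepage")]
stale = store[versioned_key("kpcc", 1, "homepage")]
```

Nothing is actually removed here. The old entry just sits in the cache until it times out or gets evicted.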

That was the setup when I first got to the station, but expiration-based caches have a number of consequences that are less than ideal. First, unless you are keeping track of versioning and updating that key prefix, you're always going to have a delay before the cache times out and updates show up on the site. Keep that timeout too low, though, and you'll expire caches unnecessarily. That creates a situation where most page views are fast, but every time the cache expires some unlucky visitor is going to get burned as we rebuild the entire page.

Last summer we switched that setup out, putting in a new system that attempts to do a cleaner job of expiring caches only when the content in them changes. Taking advantage of Redis and its sets data type, the system stashes a copy of the cache key in a set for each piece of content inside. When that piece of content is updated, the cache system expires every cache in that object's set.
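The idea can be sketched without any servers at all. This is not the production code: a Hash stands in for memcached, a Hash of Sets stands in for the Redis sets, and the class and method names are invented for the example. With real Redis, the write step would be an SADD and the expire step an SMEMBERS followed by cache deletes.

```ruby
require "set"

# In-memory sketch of content-aware expiration. A Hash stands in for
# memcached and a Hash of Sets stands in for Redis sets. All names here
# are hypothetical.
class ContentCacheSketch
  def initialize
    @store = {}                                    # cache key => cached fragment
    @keys_for = Hash.new { |h, k| h[k] = Set.new } # content id => cache keys
  end

  # Cache a fragment and record its key against every content object it shows.
  def write(cache_key, fragment, content_ids)
    @store[cache_key] = fragment
    content_ids.each { |id| @keys_for[id] << cache_key }
  end

  def read(cache_key)
    @store[cache_key]
  end

  # Called when a content object is saved: drop every cache that used it.
  def expire_content(content_id)
    keys = @keys_for.delete(content_id) || Set.new
    keys.each { |key| @store.delete(key) }
    keys.to_a
  end
end

cache = ContentCacheSketch.new
cache.write("homepage", "<html>...</html>", ["story:1", "story:2"])
cache.write("sidebar", "<ul>...</ul>", ["story:2"])
cache.write("about", "<p>...</p>", ["page:9"])

# Saving story 2 expires the homepage and sidebar but leaves "about" alone.
expired = cache.expire_content("story:2")
```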

While that move was made well before the plan to implement Rails, it turns out the system didn't need much change to power both systems at once. Make sure the cache keys are stored in full form with any framework-specific prefixing, and all of a sudden a content update in Django can expire caches put in place by Rails.
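Here is a rough sketch of why the full key matters, assuming both sites write to the same memcached. The prefixes are hypothetical: Django's default layout is "prefix:version:key", and a Rails cache store given a :namespace writes "namespace:key". Hashes stand in for memcached and Redis.

```ruby
require "set"

# Hypothetical prefixes for each framework's key layout.
def django_full_key(key, prefix: "django", version: 1)
  "#{prefix}:#{version}:#{key}"
end

def rails_full_key(key, namespace: "rails")
  "#{namespace}:#{key}"
end

memcached = {}                                    # stand-in for the shared memcached
content_sets = Hash.new { |h, k| h[k] = Set.new } # stand-in for Redis sets

# Both frameworks cache a view of the same story and register the key
# exactly as it is stored in memcached.
[django_full_key("story_detail"), rails_full_key("story_detail")].each do |full_key|
  memcached[full_key] = "rendered fragment"
  content_sets["story:42"] << full_key
end

# Either side can now expire both entries without knowing who wrote them.
content_sets["story:42"].each { |full_key| memcached.delete(full_key) }
```

If only the bare "story_detail" were stored in the set, the expiring side would have to guess the other framework's prefix before it could delete anything.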

Listening in to the broadcast

Going a step further, what if you want a content update to trigger a cache rebuild, rather than leaving that action to trigger when a poor user hits an uncached view?

Again, Redis comes to the rescue. The database includes a Pub/Sub messaging system that allows an easy abstraction between publishers and subscribers.

In Django, I set up a post_save signal handler on our content that issues a Redis PUBLISH with a JSON object containing the content's unique key and some commonly-needed bits of metadata (what is the content's publish status? what action is this: a save, a publish, an unpublish?).

Using the worker listener in Resque as a model, I then set up a Rails process to sit and listen on that same channel. When it sees a change that should update the contents of the homepage, it fires a process that rebuilds our Sphinx caches and then caches new views.
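The decision the listener makes for each message looks roughly like the sketch below. The JSON field names ("key", "status", "action"), the rule for which events count, and the channel name in the trailing comment are assumptions for illustration, not the actual message format.

```ruby
require "json"

# Sketch of the decision a listener makes for each Pub/Sub message.
# The payload fields and the set of actions that trigger a rebuild are
# assumptions for illustration.
REBUILD_ACTIONS = %w[publish unpublish].freeze

def handle_message(raw, rebuild:)
  event = JSON.parse(raw)
  # A plain save only matters if the content is already published.
  relevant = REBUILD_ACTIONS.include?(event["action"]) ||
             (event["action"] == "save" && event["status"] == "published")
  rebuild.call(event["key"]) if relevant
  relevant
end

rebuilt = []
rebuild = ->(key) { rebuilt << key }

handle_message('{"key":"news/story:42","status":"published","action":"save"}', rebuild: rebuild)
handle_message('{"key":"news/story:43","status":"draft","action":"save"}', rebuild: rebuild)
handle_message('{"key":"news/story:44","status":"draft","action":"unpublish"}', rebuild: rebuild)

# In a long-running worker this would be called from the subscribe loop
# of the redis gem, roughly:
#   Redis.new.subscribe("content_updates") do |on|
#     on.message { |_channel, raw| handle_message(raw, rebuild: rebuild) }
#   end
```

Keeping the handler separate from the subscribe loop means it can be exercised without a running Redis.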

Getting Fancy... What if you want Rails to cache for Django?

Want to get super-fancy? Use that Rails worker to cache back to Django.

It's actually not that hard, thanks to RubyPython:

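# Only start Python in production; elsewhere @pickle stays nil
if Rails.env == "production"
  RubyPython.start(:python_exe => "/usr/local/python2.7.2/bin/python")

  # cPickle is the C implementation of Python 2's pickle module
  @pickle = RubyPython.import("cPickle")
end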

Now you have a Ruby interface to Python's Pickle, the important piece you need to write caches that Django will be happy to open:

# if we're passed a pickle object, also perform django headlines caching
if pickle
  # @data is the memcached client underneath Rails.cache. The cache key
  # argument to set is not shown in this excerpt.
  (Rails.cache.instance_variable_get :@data).set(
    pickle.dumps(view.render(
      :partial => "home/cached/django/headlines",
      :object => scored_content[:headlines])),
    :raw => true
  )
end


Rails Code

The Rails side of our caching code is available in the redis-content-store gem on SCPR's GitHub account. It includes the underlying caching code and simple view helpers for cache_content and register_content.

It's not perfect—my next step is to put some work into how nested caches would keep track of their content objects correctly—but it's working well for our purposes.

Django Code

The Django code can be found in this Gist: