Squeezing Those Bytes

Thursday, December 03, 2009, at 05:25AM

By Eric Richardson

California Plaza Railroad
Eric Richardson

Overnight I managed to reduce the load size of blogdowntown's home page by 400kb without changing a single line of HTML. We went from a bloated 750kb to a much more svelte 340kb.

What could make that much of a difference? The size of our JPEGs. 660kb of that load was images, a number that's way, way too high. We do load 58 images, but I was still pretty floored to see the size of some of the smallish images we were loading up.

So what was the deal? Was I just being sloppy with my image compression? No, Flickr was.

blogdowntown's bdv4 codebase uses Flickr as a primary mechanism for photo handling. We upload shots there, and then drop the URL into the bdv4 Newsroom. From there it grabs the needed information directly from Flickr.

Turns out, Flickr images are anything but small. They're stuffed with EXIF data and compressed for best quality, not for quick loading. That makes sense for a photography site, but doesn't make sense when we're trying to serve up a dense home page.
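If you're curious where those bytes actually go, here's a rough way to measure it. This is a sketch of my own, not bdv4 code: pure Ruby, no gems, walking a JPEG's segment markers and totaling up the metadata segments — the APPn blocks (where EXIF, XMP and color profiles live) plus comment blocks.

```ruby
# Total the bytes a JPEG spends on metadata segments (APP0-APP15, COM).
# Everything before the Start-of-Scan marker is headers; image data follows.
def metadata_bytes(jpeg)
  data  = jpeg.dup.force_encoding(Encoding::BINARY)
  total = 0
  pos   = 2  # skip the SOI marker (FF D8)
  while pos + 4 <= data.length
    marker = data[pos, 2]
    break unless marker[0] == "\xFF".b   # malformed stream; bail out
    break if marker == "\xFF\xDA".b      # Start of Scan: image data from here on
    length = data[pos + 2, 2].unpack("n").first  # includes the length field itself
    type   = marker[1].ord
    # APP0-APP15 (0xE0-0xEF) and COM (0xFE) segments are metadata
    total += 2 + length if (0xE0..0xEF).cover?(type) || type == 0xFE
    pos += 2 + length
  end
  total
end
```

Point it at a Flickr original and the answer can easily run to tens of kilobytes before a single pixel has been decoded.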

For instance, consider the nice model railroad photo attached to this post. The 240x160 small version of the image tips the scales at 52kb. If I download it, open it in Photoshop and run Save for Web, the default settings drop that to just 16kb.

Obviously I'm not going to do that, though. November was a light month, but we still published 195 images on 73 stories.

That doesn't mean I'm out of luck, though. Earlier this year I got frustrated with occasional Flickr downtime and decided that we needed to start caching copies of all the photos we use. I implemented that via a piece of code that grabs the Flickr images and uploads them to Amazon S3.

In that code, I use MiniMagick to create a 24x24 version of the Flickr square thumbnail. Outside of that, though, I was caching the images from Flickr untouched.

It was only a half-hour or so of work to figure out how to streamline the process, stripping EXIF and compressing the images before caching them to S3.
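The EXIF-stripping half of that is conceptually simple: EXIF lives in a JPEG's APP1 segment, so stripping it just means not copying that segment through. bdv4 leans on MiniMagick for the real work, but here's a pure-Ruby sketch of just that one step (`strip_exif` is my name for illustration, not part of the codebase):

```ruby
# Copy a JPEG byte stream through, dropping the APP1 (EXIF) segment.
def strip_exif(jpeg)
  data = jpeg.dup.force_encoding(Encoding::BINARY)
  raise ArgumentError, "not a JPEG" unless data[0, 2] == "\xFF\xD8".b

  out = "\xFF\xD8".b  # keep the SOI marker
  pos = 2
  while pos + 4 <= data.length
    marker = data[pos, 2]
    break unless marker[0] == "\xFF".b  # malformed stream; bail out
    if marker == "\xFF\xDA".b           # Start of Scan: copy image data verbatim
      out << data[pos..-1]
      break
    end
    length = data[pos + 2, 2].unpack("n").first  # includes the length field itself
    out << data[pos, 2 + length] unless marker == "\xFF\xE1".b  # drop APP1 (EXIF)
    pos += 2 + length
  end
  out
end
```

In practice you'd let ImageMagick's `-strip` do this (which is what MiniMagick shells out to), since it also handles XMP, IPTC and embedded profiles.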

The result isn't as good as Photoshop, but it's meaningful. That 240x160 train? Just 20kb. For the 75px square thumbnail (a size we use all over the place), size drops from 36kb to 8kb.

For reference, here's the code involved, part of bdv4's ImgCache::Image class:


# Cache our Flickr sizes to S3, stripping metadata and recompressing along
# the way. Uploads go through the aws-s3 gem's S3Object.store; BUCKET and
# the key naming here are placeholders.
def cache
  return nil if self.photo.cached?

  uri = URI.parse(self.photo.photoBase)
  key = File.basename(uri.path)

  Net::HTTP.start(uri.host) {|http|
    # -- get square thumbnail
    sraw = http.get(uri.path + "_s.jpg")
    simg = MiniMagick::Image.from_blob(sraw.body)
    simg.strip
    simg.quality QUALITY
    AWS::S3::S3Object.store("#{key}_s.jpg", simg.to_blob, BUCKET,
      :access => :public_read)

    # -- generate small square thumb
    simg.resize "24x24"
    AWS::S3::S3Object.store("#{key}_sq.jpg", simg.to_blob, BUCKET,
      :access => :public_read)

    # -- get medium size
    raw = http.get(uri.path + ".jpg")
    img = MiniMagick::Image.from_blob(raw.body)

    img.strip
    img.quality QUALITY

    # save this at its current size
    AWS::S3::S3Object.store("#{key}.jpg", img.to_blob, BUCKET,
      :access => :public_read)

    # resize for _m (max side 240px) and save
    img.resize "240x240"
    AWS::S3::S3Object.store("#{key}_m.jpg", img.to_blob, BUCKET,
      :access => :public_read)

    # resize for _t (max side 100px) and save
    img.resize "100x100"
    AWS::S3::S3Object.store("#{key}_t.jpg", img.to_blob, BUCKET,
      :access => :public_read)
  }

  self.photo.imgcache = true
  return true
end