Philip Jacob

Caching, Illustrated

Let’s just say that you have some friends (who might be called Chris and Mary) who are starting a new radio show (let’s call it Open Source), and they put a link to a 54 MB mp3 on their website.

And let’s just suppose that another guy who runs a really popular website called Scripting News makes a link from his website to this big mp3 file. What do you think will happen?

Well, in a word, this will happen:

[Graph: bandwidth usage before and after mod_expires was enabled]

Since the Open Source website runs on my infrastructure, it’s up to me to make sure that things keep working, even in circumstances like this. Now, I’ve done lots of work with caching (in the application layer and via HTTP) in the past, making slow websites fast and popular websites usable again. It’s not magic, but there are a lot of tricks that you can play with a standard Apache configuration to get the most mileage out of your hardware.

In this case, mod_expires wasn’t enabled for my client, so I switched it on. With expiry headers on the responses, lots of people started getting the content from their ISPs’ proxy caches instead of from my servers, which reduced the bandwidth consumption. This gets content to the users faster and reduces load on my servers.
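For anyone who hasn’t set this up before, here is a minimal sketch of what switching on mod_expires can look like. This is not the exact configuration from my server: the one-week lifetime and the choice to target only mp3 files are assumptions for illustration, and the module itself has to be loaded first (how you do that depends on your distribution).

```apache
# Illustrative sketch only; lifetimes and MIME types here are assumed values.
<IfModule mod_expires.c>
    # Turn on generation of Expires and Cache-Control: max-age headers.
    ExpiresActive On

    # Assumed policy: an mp3 may be cached for one week,
    # counted from the moment the client (or proxy) fetched it.
    ExpiresByType audio/mpeg "access plus 1 week"
</IfModule>
```

Once the headers are going out, a shared proxy cache that already holds a fresh copy of the file can answer the next listener itself without coming back to the origin server. A long lifetime suits a file like a published mp3 that never changes; for content that gets edited, you’d want something much shorter.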

I thought it was a nice graph that illustrates the benefit of knowing how to configure your webserver properly. The maintenance activity at ~2am required a quick bounce of Apache that lasted less than a second. And you’ll notice that the bandwidth usage dropped considerably after mod_expires was configured and turned on.