Configuring Varnish

At $WORK I’m currently working on deploying a pool of Varnish servers to sit in front of some Apache servers running Pressflow. On our current infrastructure we’ve been running Squid for the past few years with very good success, minus a hiccup or two along the way, one involving memory fragmentation (thank you tcmalloc). Varnish has a few nice features that Squid lacks.

  • The ability to PURGE objects using wildcards
  • Better support for multiple processors (Squid can benefit from multi procs when using AUFS)
  • Grace period that can be configured to serve objects from the cache after they’ve expired while fetching the new content from the backend. You can also use this to serve up stale content if your backend is down
  • Ships with a nice set of command line tools (varnishtop, varnishlog, varnishstat, varnishhist, etc.)
  • A very flexible scripting/configuration language (you can even do inline C if you’re feeling saucy) that allows you to manipulate objects at any point in the request or response (see flow chart)

There are many others, but these are just a few off the top of my head, and I’m still discovering what other capabilities Varnish has. The site has not gone live yet, so I’m still testing on a dev version of the site and have not had an opportunity to perform any load testing. So far, with my current working configuration, I’ve made the following tweaks:

  • Stripped cookies off static objects

  • Stripped Google analytics cookies

  • Removed empty cookies

  • Configured a graceful period to serve up stale objects from cache

  • Added a debugging header to show whether the object was a cache HIT or MISS

The use of mod_expires on the Apache backend controls cache times for static assets (CSS, JS, images, etc.). In my googling around when reading about Varnish, I see a lot of people setting cache times in their VCLs. IMO you should let the backend or the application itself control the TTLs on objects. Within your application you can set more specific TTLs for certain sections of your site, or even certain types of dynamic content, without having to rely on complex VCL rules or deal with deploying those rules into Varnish. While Varnish does support a “graceful” style restart, it’s not quite as elegant as doing service apache graceful. Kristian Lyngstol (one of the Varnish devs) has a good post on his blog on dealing with this. Also, with mod_expires you can set TTLs based on MIME type within Apache.
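For reference, the tweaks in the list above might look roughly like the following in VCL. Treat this as a minimal sketch rather than the actual working config: it assumes Varnish 2.0-style syntax (later releases rename several of these variables, e.g. obj becomes beresp in vcl_fetch), and the file-extension list, the grace times, and the X-Cache header name are placeholder choices.

```vcl
sub vcl_recv {
    # Accept objects up to 30s past their TTL while a fresh copy is fetched
    # (30s is an arbitrary example value)
    set req.grace = 30s;

    # Static assets never need cookies; the extension list is an assumption
    if (req.url ~ "\.(css|js|png|gif|jpg|jpeg|ico)([?].*)?$") {
        remove req.http.Cookie;
    }

    if (req.http.Cookie) {
        # Strip Google Analytics cookies (__utma, __utmb, __utmz, ...)
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|; ?)__utm[a-z]+=[^;]*", "");
        # Clean up a leading separator left behind by the removal above
        set req.http.Cookie = regsub(req.http.Cookie, "^; ?", "");

        # Remove the header entirely if nothing is left in it
        if (req.http.Cookie ~ "^ *$") {
            remove req.http.Cookie;
        }
    }
}

sub vcl_fetch {
    # Keep expired objects around for up to an hour so they can be served
    # stale, for example when the backend is down (1h is an example value)
    set obj.grace = 1h;
}

sub vcl_deliver {
    # Debugging header: obj.hits is 0 the first time an object is delivered
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

None of these subroutines end with an explicit action, so Varnish falls through to its built-in default logic after each one. Note that no TTLs are set here; those are left to the Expires and Cache-Control headers coming from the backend.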

One other thing I see a lot of people blindly recommending, to deal with Varnish’s default behavior of not caching requests that carry cookies, is to take the cookie value and add it into Varnish’s hash of the object, e.g.:

sub vcl_hash {
    # Anti-pattern: every distinct Cookie header gets its own cache entry
    set req.hash += req.http.cookie;
}

If a light bulb just went off in your head as to why this is a bad idea, kudos to you. What you’re basically doing is creating a per-user cache on your Varnish server, and your hit ratio will plummet with this config. There are scenarios where this can be used in a good way. In talking with some folks in #varnish, one scenario that came up where you’d want this is a cookie that acts as a display filter on your site, or some other site customization that doesn’t have a large number of combinations.
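A rough sketch of that narrower approach is to hash only the value of the one cookie that changes the page, instead of the whole Cookie header. The cookie name site_filter is hypothetical, and the syntax again assumes Varnish 2.0-style VCL.

```vcl
sub vcl_hash {
    # "site_filter" is a made-up cookie name used for illustration.
    # Only its value goes into the hash, so the number of cached variants
    # is bounded by the number of filter values, not the number of users.
    if (req.http.Cookie ~ "site_filter=") {
        set req.hash += regsub(req.http.Cookie, "^.*site_filter=([^;]*).*$", "\1");
    }
}
```

On its own this is not enough: the default vcl_recv still passes any request that carries a Cookie header, so it would have to be paired with vcl_recv logic that sends these requests to a cache lookup instead.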

One thing that bothers me about Varnish currently is that its admin interface is completely unsecured. By default it listens on localhost but without any authentication, so anyone with a shell on your Varnish box can bring down your Varnish instance or modify the config in any way they see fit. For those that allow devs on production servers to debug logs, this is a bit of a security concern. I’m not really sure of a workaround for this, so if anyone has any ideas, leave them in the comments below.

If you use Cacti for trending, there are some great templates available over at the Cacti forums. They utilize a Python script that needs access to the admin interface.

I’ll probably post some more in the future on Varnish as I do further reading and testing with it.