Microcaching: Speed your app up 250x with no new code
I recently had the opportunity to help some friends prepare a content site (WordPress) for a fairly hefty traffic hit. It was potentially going to be a big spike (national radio campaign, time-sensitive content, etc.) and they particularly didn't want it to go down at the critical time.
I put together a fairly typical "fast" PHP architecture: nginx, PHP-FPM, APC, front-end app cluster, load balancer, replicated DB, along with all the mess that comes with it - machine images, replicated filesystem, etc. I also installed and tested the various appropriate WordPress Super-Hyper Cache Pro Blitzen 2000+ plugins.
After much mucking around, I had an "awesome" (read: complicated), "linearly scalable" (read: difficult to manage) app cluster that could "scale to the stars" (read: very easily develop non-obvious bottlenecks).
It turns out that you can throw all of this out and replace it with a 23-line nginx config. Oh yeah, and you get a 250x per-node performance increase too.
How? Caching. But only a little bit. Here's a technique (OK, a cheap trick) that lets you get blinding performance and serve up-to-the-second fresh content, without having to write a whole bunch of app code.
Concept
Microcaching is like an insulation layer for your app. Let's say your WordPress install (or Rails app) can handle 20 requests/sec fairly happily. This is fine, up until the point where you get on HN and Reddit at the same time (greatest day of your life) and right at the critical time, your site collapses spectacularly amidst the deafening snarky jeers of your peers.
The idea behind microcaching is to cap the number of requests that can make it through to your app by letting nginx bear the brunt of your pageviews, caching content for a very small amount of time (ie: 1 second or less).
From your app's point of view, it can only ever be hit by a maximum of 1 req/sec per page of content it needs to serve, so in WordPress terms, if everyone is hitting your front page or a specific post, the vast majority of requests can be served out of cache.
At the same time, the classic problem of stale content/cache invalidation is basically nil - nobody's going to realise if the content they're seeing is 800ms old. Probably...
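As a rough illustration of the idea, the core of a microcache fits in a handful of nginx proxy-cache directives. This is a minimal sketch rather than the config attached to this post: the cache path, the zone name `microcache`, its sizes and the backend address `127.0.0.1:8080` are all assumptions.

```nginx
# Goes in the http {} context. Path, zone name and sizes are assumptions.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=microcache:5m max_size=1000m;

server {
    listen 80;

    location / {
        # Assumed address of the real app server (e.g. Apache + PHP).
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;

        # Use the zone defined above, with one cache entry per page.
        proxy_cache microcache;
        proxy_cache_key $scheme$host$request_method$request_uri;

        # Keep successful responses for one second only.
        proxy_cache_valid 200 1s;
    }
}
```

With something like this in place, the backend should see roughly one request per second for each cached URL, however many visitors are hitting it.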
Changing Data
Where you often come unstuck on something like a blog is with comments - you don't want a user who has just submitted a comment to then have it disappear when the page reloads. Thankfully, nginx's request handling is smart enough to deal with this, as the attached nginx microcaching config shows.
The Config
There's nothing particularly clever going on here but it may be worth breaking down a couple of the entries -
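# Request was flagged as uncacheable: drop a short-lived marker cookie
if ($no_cache = "1") {
    add_header Set-Cookie "_mcnc=1; Max-Age=2; Path=/";
}
# Marker cookie present: flag this request as uncacheable too
if ($http_cookie ~* "_mcnc") {
    set $no_cache "1";
}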
What we're saying is that if a request has been made that could modify the content (ie: a POST or PUT), add a special "no cache" cookie to the response, so that we know this user can't be served cached results for 2 seconds.
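The fragment above doesn't show where `$no_cache` is first set or how nginx acts on it. One plausible arrangement, sketched here on the assumption that everything lives in a single proxied `location` block and with a placeholder backend address, is to flag non-GET/HEAD requests and hand the flag to `proxy_cache_bypass` and `proxy_no_cache`:

```nginx
location / {
    # Default: the request is cacheable.
    set $no_cache "";

    # Anything other than GET/HEAD may modify content.
    if ($request_method !~ ^(GET|HEAD)$) {
        set $no_cache "1";
    }

    # Mark this user as uncacheable for the next 2 seconds.
    if ($no_cache = "1") {
        add_header Set-Cookie "_mcnc=1; Max-Age=2; Path=/";
    }

    # Honour the marker cookie on follow-up requests.
    if ($http_cookie ~* "_mcnc") {
        set $no_cache "1";
    }

    # A non-empty, non-"0" value skips both the cache lookup and the cache write.
    proxy_cache_bypass $no_cache;
    proxy_no_cache $no_cache;

    # Placeholder backend address (assumption).
    proxy_pass http://127.0.0.1:8080;
}
```

`if` inside a `location` block has some well-documented quirks in nginx, so it's worth checking the response headers with a quick POST before relying on this.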
The other problem that's likely to occur with high traffic + caching is the thundering herd phenomenon, wherein as a cache entry expires, every request that arrives before it has been refreshed requires a real response to be generated. If you're doing 1000 cached reqs/sec and your dynamic page generation time is 200ms, then you're going to receive 200 more requests (1000 reqs/sec × 0.2 sec) in the time it takes to refresh your cache, quite possibly taking down your server/app anyway.
Nginx has a way of dealing with this too:
proxy_cache_use_stale updating;
Ostensibly, this allows nginx to serve (slightly) stale responses whilst it's waiting for a refresh to complete. Neat.
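In context, the directive sits alongside the other cache settings. The sketch below also adds `proxy_cache_lock`, which is not part of the setup described in this post; it's an optional extra that may help in the cold-cache case, where there is no stale copy to serve yet. The zone name and backend address are placeholders.

```nginx
location / {
    proxy_pass http://127.0.0.1:8080;   # placeholder backend (assumption)
    proxy_cache microcache;             # assumed zone name, defined via proxy_cache_path
    proxy_cache_valid 200 1s;

    # While one request refreshes an expired entry, everyone else gets the stale copy.
    proxy_cache_use_stale updating;

    # Optional extra, not from the original setup:
    # let only one request at a time populate a missing entry.
    proxy_cache_lock on;
}
```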
Benchmarks
Enough with the talking - let's see some benchmarks. The setup below is a clean EC2 small instance with a fresh, unmodified WordPress install running on a pretty standard LAMP stack (Apache2 + PHP5 + PHP-FPM + APC + MySQL).
Vanilla WordPress, microcaching disabled, 200 requests, concurrency of 4 (conclusions below):
Woah, 9.94 reqs/sec. That's pretty woeful. Of course, I could install WordPress caching plugins, buy a bigger EC2 instance, tweak my PHP config, etc. Or I could enable microcaching:
Nginx installed, microcaching enabled, 10,000 requests, concurrency of 500:
2364 reqs/sec. That's more like it - and this is on a single-core, contended cloud server. From my quick tests, you can get about 7500 reqs/sec out of a 4-core box, which, to be honest, should be enough for anyone (7500 reqs/sec sustained over 86,400 seconds is 648 million pageviews per day; if you're getting that, put some ads on your site and buy another damn server).
Some of you will notice that the attached config is actually broken for the WP-admin (though easily fixed). Some of you will also notice that this is obviously not a silver bullet:
If you have personalized pages (ie: majority logged-in users) this approach isn't going to work. Similarly, if you have a very write-heavy workload or a long tail of content, it's going to have reduced (but still useful) utility.
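For the WP-admin and logged-in cases, one option is to reuse the `$no_cache` flag. This is a guess at a fix rather than the author's: it assumes `$no_cache` is what gets passed to `proxy_cache_bypass` and `proxy_no_cache`, and it relies on WordPress giving its login cookie a name that starts with `wordpress_logged_in`.

```nginx
# Inside the cached location block, wherever $no_cache is computed.

# Never cache the admin area or the login page.
if ($request_uri ~* "^/wp-(admin|login)") {
    set $no_cache "1";
}

# Logged-in users get personalised pages, so send them straight to the app.
if ($http_cookie ~* "wordpress_logged_in") {
    set $no_cache "1";
}
```

Anonymous visitors, who make up the bulk of a traffic spike, would still be served from the microcache.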
That said, for the amount of effort required for implementation, it's a nice insurance policy.