How I Survived My Last Few Fireballings

• Chris Liscio

It's a running joke in the community that you shouldn't be running WordPress if you plan on surviving a sudden spike in traffic to your site. Heck, I've even been the butt of that joke before.

(Pretend that I linked a tweet from @gruber here, where he disses my server after it's been fireballed. Stupid crappy twitter searches…)

See, the solution to this problem is to use WP Super Cache, right? Unfortunately, my solution wasn't that simple. I still went down a few times after being fireballed. Here's my (long-winded, and probably slightly inaccurate) story.

The First Time

When TapeDeck shipped, I was fireballed pretty hard. As it was happening, I panicked and upgraded my Slicehost slice from 256 MB to 512 MB, because I ran out of RAM. Things seemed to go OK after that.

The Second Time

When I shipped Capo, John linked the site with some really kind words. The site was fine, but pretty slow as it was getting worked over.

After Capo 1.1 shipped, the site completely toppled over, despite the product page being completely static. I didn't realize it at the time, but the DF link pointed directly at my product page as well as my blog.

I moved all the images off to S3 temporarily, which made the site much more bearable. All those image downloads were eating up available connections and RAM.

It was at that point (I think) that I finally decided to install WP Super Cache, likely after Daniel Jalkut's repeated insistence.

The Third, and Final Time

Capo 2 shipped, and got another kind nod from Mr. Gruber. This time, the link was directly to my product page, and not the blog.

My response to this fireballing was painful, as the server wouldn't stay up long enough for me to try and fix the problems as they happened. Some stragglers must have been hitting my blog, which took the whole server down.

Something was definitely up, and no amount of tweaking my Apache server configuration files was working. I was bound to the (old, crappy) mpm_prefork server, which doesn't have a whole lot of helpful knobs to tweak. And tweaking them while your server is under (important!) load is a bad scene.

The worst part is that a fireballing doesn't seem to happen in full force for a sustained period of time. So after you've banged on your config file for 10 minutes (with a few reboots in between), you think that everything's been fixed! Then you crash and burn again the next time Gruber has something nice to say about you…

Simulating A Fireballing

Now, I can't exactly email John and ask him to fireball me so I can test my site, so I needed a way to really beat the crap out of my server on my own time, when nobody was really looking.

That's when I finally got a copy of ab, the Apache HTTP server benchmarking tool, installed on my slice. If it isn't already installed on your server, use the appropriate package manager to get your hands on it.

Running it once with a pretty fat setting (-n 1000 -c 100, or something like that; I can't recall exactly) took the server down right away. That's 1000 requests in total, with 100 of them in flight at a time. I finally had something that I could simulate heavy traffic with, and I could run it at will.
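If ab isn't handy, the same idea (N requests total, C in flight at once) is easy enough to sketch in Python. This is only a rough stand-in, not a replacement for ab: the function names (fetch, hammer) and the summary fields are made up for illustration, and threads in Python won't push a server anywhere near as hard as ab can.

```python
import http.client
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Fetch one URL and return (succeeded, seconds_elapsed)."""
    start = time.perf_counter()
    try:
        with urlopen(url, timeout=timeout) as response:
            response.read()
            ok = 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses,
        # which urlopen raises as HTTPError (an OSError subclass).
        ok = False
    return ok, time.perf_counter() - start


def hammer(url, total=1000, concurrency=100, fetch_one=fetch):
    """Issue `total` requests with at most `concurrency` running at once."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(fetch_one, [url] * total))
    times = sorted(elapsed for _, elapsed in results)
    return {
        "requests": total,
        "failures": sum(1 for ok, _ in results if not ok),
        "median_seconds": times[len(times) // 2],
        "slowest_seconds": times[-1],
    }
```

Calling hammer("http://your-own-server.example/", total=1000, concurrency=100) roughly mirrors the ab flags above. The URL is a placeholder; only ever point this at a server you own.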

Tweaking Apache

I started up a bunch of terminals, each one monitoring important stats such as CPU load, memory usage, etc. Then, I tweaked Apache as best I could, trying to figure out the math to maximize performance while not running out of RAM.
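The math in question usually boils down to one division: take the RAM that's left after everything else on the box has had its share, and divide by the size of a single Apache child. Under prefork, that number is the ceiling for the MaxClients directive. Here's a sketch of that rule of thumb. The function name and every number in the example are assumptions for illustration, not measurements from any real server.

```python
def max_clients(total_ram_mb, reserved_mb, per_process_mb):
    """Estimate how many prefork children fit in RAM without swapping.

    total_ram_mb   -- RAM on the machine
    reserved_mb    -- RAM kept aside for the OS, MySQL, and so on
    per_process_mb -- typical resident size of one Apache child
    """
    usable_mb = total_ram_mb - reserved_mb
    if usable_mb <= 0 or per_process_mb <= 0:
        raise ValueError("need some usable RAM and a positive process size")
    # Round down: one child too many is what sends the box into swap.
    return max(1, int(usable_mb // per_process_mb))


# Illustrative only: a 512 MB slice, 200 MB held back for MySQL and the OS,
# and Apache children with PHP loaded weighing about 25 MB each.
print(max_clients(512, 200, 25))  # (512 - 200) // 25 = 12
```

Twelve simultaneous connections is not a lot, which goes some way toward explaining why a prefork setup on a small slice falls over so easily.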

One thing that I noticed after I had fully tweaked this setup is that the entire site would be completely unresponsive while I ran the benchmark. The server survived, but it really sucked to use.

Ditching Apache

I threw in the towel, and decided to finally kick Apache to the curb in favor of nginx. It's a bad-ass, fast-as-hell web server that I've been wanting to run since I first learned of it a few years ago. It's a bit of work to get your PHP stuff working properly with WordPress, but I followed these instructions with pretty good success. (Your mileage will definitely vary, as I wasn't able to follow the steps verbatim.)

I re-ran the benchmark, and things went much better. However, RAM was still going up quickly, and interacting with WordPress was still pretty slow. Because of how I had everything set up, WP Super Cache was still serving pages via PHP, because I didn't have the Apache mod_rewrite equivalent set up yet in nginx.

After following these instructions, combined with applying changes specific to my site, my site ran like greased lightning. Running the stress tests never got my RAM usage up to 50%, and the CPU was barely breaking a sweat.

During the stress test, all the pages were responsive, and I kept pumping up the benchmark parameters until I got bored.
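The whole trick is that the web server decides, before PHP ever wakes up, whether a static copy of the page already exists on disk. Written out as Python rather than as nginx config, the decision looks roughly like this. Treat it as a sketch: the function name is made up, and the cache path and cookie prefixes are what WP Super Cache typically uses, so check them against your own install.

```python
from pathlib import Path

# Visitors carrying these cookies get personalized pages, so they skip the cache.
# (Assumed prefixes; verify against your own setup.)
BYPASS_COOKIE_PREFIXES = ("comment_author_", "wordpress_logged_in", "wp-postpass_")


def cached_file_for(doc_root, host, method, uri, query="", cookies=()):
    """Return the static file to serve, or None to hand the request to PHP."""
    if method != "GET" or query:
        return None  # form posts and query strings always go to PHP
    if any(name.startswith(BYPASS_COOKIE_PREFIXES) for name in cookies):
        return None  # logged-in users and commenters go to PHP
    if ".." in uri.split("/"):
        return None  # never look outside the cache directory
    candidate = Path(doc_root, "wp-content/cache/supercache",
                     host, uri.strip("/"), "index.html")
    return candidate if candidate.is_file() else None
```

When this returns a path, the server just ships a plain file off disk, which is about the cheapest thing a web server can do. Only the misses pay for PHP and the database.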

Surviving

Since I installed nginx and pointed it at WP Super Cache, I've survived a few more fireballings (two of which pointed directly at my blog) without batting an eye. The pages seemed to be quite responsive while it was happening, and I didn't get any emails informing me that my server had died.

As I'm writing this, I'm working on implementing the new (and long-overdue!) design for this site, which I hope to roll out some time in the next few weeks. At least I seem to have a decent enough back-end to build it on, now…