Note, this blog entry and the link I share at the end are kind of long but highly worthwhile, especially if you want to become more familiar with the concept of an event-based processing model versus a prefork-based one and its benefits. This is applicable to both web servers and programming languages (Apache versus Nginx, PHP versus Node, etc.).
A recent personal story
Why high-concurrency is important
Within the span of two weeks, our little application was featured on both TechCrunch and Mashable. Our traffic ballooned to about 700%, and though our highly optimized main application server didn’t even break a sweat, our corporate blog server went down for the count. It was a basic WordPress setup on a micro EC2 Amazon instance (613 MB of RAM), running Apache with the standard optimizations of caching headers and gzip.
By the time I had noticed, our blog server had already been running at 100% CPU for a while–yep, should have set an alert–and was very sluggish to browse. Restarting httpd and mysqld only momentarily alleviated the problem, as it instantly shot back up to 100% CPU with the load of a few hundred visitors banging down the door. My first split-second thought was to quickly migrate to a more powerful EC2 instance, one with more memory and CPU. It would be relatively easy and only take ten minutes to execute. After all, that’s what AWS was meant for, right? I knew that wasn’t the best answer, as AWS can become pricey, and one shouldn’t just throw more money at a problem. Our blog server should be able to withstand a spike of a few hundred requests on its current stack; otherwise, we’d pay for an unnecessarily large EC2 instance at the end of every month.
The core problem stemmed from Apache being machine-gunned with HTTP requests, five to ten from each user for things like the PHP page, images, JS, and CSS. These files use correct expires headers, so at most, users would request each file only once. However, there was a consistent flow of new visitors to keep the load high. I had been planning to move those assets to the CDN, but being a small company, we were all busy working on the actual application.
So what did I do? In short, I alleviated the issue in less than five minutes by installing and setting up Nginx on port 80. It now intercepts and serves all static content itself and reverse-proxies only the actual PHP page requests to Apache, which listens on a different port. As an extra step, I even disabled gzip on Apache and have Nginx do all the compression work. Load dropped to almost its pre-onslaught level even though the onslaught was still happening! That’s how amazing Nginx is.
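To make the split concrete, here is a minimal Python sketch of the routing decision described above. It is not my actual Nginx configuration: the extension list and the backend port 8080 are assumptions for illustration, since all that matters is that Apache moved to "a different port".

```python
from pathlib import PurePosixPath

# Hypothetical values for illustration only.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico"}
APACHE_BACKEND = "http://127.0.0.1:8080"  # assumed backend address and port


def route(path: str) -> str:
    """Return where a request path would be handled under the split described above."""
    # Drop any query string before inspecting the file extension.
    clean_path = path.split("?", 1)[0]
    suffix = PurePosixPath(clean_path).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        # Static assets never reach Apache.
        return "nginx: serve file from disk"
    # Everything else (the PHP page itself) goes to the Apache backend.
    return f"nginx: proxy to {APACHE_BACKEND}"


if __name__ == "__main__":
    for example in ("/wp-content/themes/x/style.css?ver=3", "/2011/05/some-post/"):
        print(example, "->", route(example))
```

With five to ten requests per visitor and only one of them being the PHP page, most of the traffic takes the first branch and never touches a heavyweight Apache process.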
The event-based model and why it is better than the traditional thread-based model
Before we dive into what makes the event-based model preferable, we need to talk about the problem with the traditional thread-based model used in most web servers and programming languages. This amazing writeup on Nginx internals, which sparked the idea for this post, explains it well:
…imagine a simple Apache-based web server which produces a relatively short 100 KB response—a web page with text or an image. It can be merely a fraction of a second to generate or retrieve this page, but it takes 10 seconds to transmit it to a client with a bandwidth of 80 kbps (10 KB/s). Essentially, the web server would relatively quickly pull 100 KB of content, and then it would be busy for 10 seconds slowly sending this content to the client before freeing its connection. Now imagine that you have 1,000 simultaneously connected clients who have requested similar content. If only 1 MB of additional memory is allocated per client, it would result in 1000 MB (about 1 GB) of extra memory devoted to serving just 1000 clients 100 KB of content. In reality, a typical web server based on Apache commonly allocates more than 1 MB of additional memory per connection, and regrettably tens of kbps is still often the effective speed of mobile communications.
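The arithmetic in the quoted scenario is easy to check. The short Python sketch below uses only the figures from the quote; the variable names are mine.

```python
RESPONSE_KB = 100         # response size from the quoted example
CLIENT_KB_PER_S = 10      # 80 kbps is 10 KB/s
CLIENTS = 1000            # simultaneously connected clients
MEMORY_PER_CLIENT_MB = 1  # the quote's conservative per-connection figure

# How long each worker is tied up sending one response to a slow client.
seconds_per_response = RESPONSE_KB / CLIENT_KB_PER_S

# Extra memory needed if every connection gets its own allocation.
extra_memory_mb = CLIENTS * MEMORY_PER_CLIENT_MB

print(f"Each connection is held open for {seconds_per_response:.0f} s")
print(f"Extra memory for {CLIENTS} clients: {extra_memory_mb} MB")
```

The point is that the cost comes from how long each connection stays open, not from how hard the page is to generate.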
Apache forking 240mb processes under load.
This is just one common scenario, that of low-bandwidth devices, where Apache or traditional programming becomes the bottleneck. Other scenarios are when threads are waiting for a DB query or accepting a file upload from a user. Forget even trying to run things like web sockets with persistent connections in this scenario! In most of these situations, processes or threads are spun up and then spend most of their time just waiting for something else to finish. They are essentially blocked. (This is where the term “non-blocking” comes from when referencing Nginx or Node.)
Martin Fjordvald goes into more detail on how Apache works in his blog entry, a section of which I display here:
Apache Prefork Processes:
- Receive PHP request, send it to a process.
- Process receives the request and pass it to PHP.
- Receive an image request, see process is busy.
- Process finishes PHP request, returns output.
- Process gets image requests and returns the image.
While handling the request, the process is not capable of serving another request. This means the number of requests you can serve simultaneously is directly proportional to the number of processes you have running. Now, if a process took up just a small bit of memory, that would not be too big of an issue, as you could run a lot of processes. However, the typical Apache + PHP setup has the PHP binary embedded directly into the Apache processes. This means Apache can talk to PHP incredibly quickly and without much overhead, but it also means that the Apache process is going to be 25-50MB in size. Not just for requests for PHP requests but also all static file requests. This is because the processes keep PHP embedded at all times due to cost of spawning new processes. This effectively means you will be limited by the amount of memory you have as you can only run a small amount of processes, and a lot of image requests can quickly make you hit your processes quota.
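To put rough numbers on that limit for our blog server, here is a small Python sketch. It assumes the whole 613 MB of the micro instance is available to Apache, which is optimistic because MySQL and the operating system need memory too, so treat the results as upper bounds.

```python
def max_prefork_processes(available_mb: int, process_mb: int) -> int:
    """Upper bound on how many fixed-size processes fit in the given memory."""
    return available_mb // process_mb


INSTANCE_RAM_MB = 613  # the micro instance from the story above

for process_mb in (25, 50):  # the 25-50MB range from the quote
    limit = max_prefork_processes(INSTANCE_RAM_MB, process_mb)
    print(f"{process_mb} MB per process -> at most {limit} simultaneous requests")
```

Since each process serves one request at a time, that works out to somewhere between 12 and 24 simultaneous requests at best, images and CSS included.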
Note that the bolded words above describe exactly what caused our WordPress server to go down. It wasn’t an overabundance of MySQL requests but all those assets.
Nginx, on the other hand, was built from the ground up in C to be non-blocking with the Reactor pattern. Martin describes Nginx’s event-based processing as such:
Nginx Event Based Processing:
- Receive request, trigger events in a process.
- The process handles all the events and returns the output
On the surface it seems fairly similar, except there’s no blocking. This is because the process handles events in parallel. One connection is not allowed to affect another connection even if run simultaneously. This adds some limitations to how you can program the web server, but it makes for far faster processing as one process can now handle tons of simultaneous requests.
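The idea can be sketched with Python's asyncio, which also runs an event loop in a single process. This is only an illustration of the concept, not how Nginx is implemented: `asyncio.sleep` stands in for waiting on a slow client socket, and the function names and figures are made up for the demo.

```python
import asyncio
import time


async def serve_slow_client(client_id: int, transfer_seconds: float) -> int:
    # Waiting on a slow client is simulated with a non-blocking sleep;
    # while this coroutine waits, the event loop runs the others.
    await asyncio.sleep(transfer_seconds)
    return client_id


async def serve_all(clients: int, transfer_seconds: float) -> float:
    start = time.perf_counter()
    # One process, one thread: all the waits are interleaved by the event loop.
    await asyncio.gather(
        *(serve_slow_client(i, transfer_seconds) for i in range(clients))
    )
    return time.perf_counter() - start


def run_demo(clients: int = 1000, transfer_seconds: float = 0.1) -> float:
    """Serve all simulated clients and return the elapsed wall-clock time."""
    return asyncio.run(serve_all(clients, transfer_seconds))


if __name__ == "__main__":
    elapsed = run_demo()
    print(f"Served 1000 slow clients in {elapsed:.2f} s from a single process")
```

Handled one at a time by a single blocked worker, 1,000 clients at 0.1 seconds each would take about 100 seconds. Here the total stays close to the duration of a single wait, because the process is never stuck on any one connection.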
So there you have it. I’d been meaning to write this post for a while, and I’m glad to be done! Though Node is the hottest thing on the block right now, its concept of building on the event loop is not new. Nginx has been doing this since the early 2000s: its creator, Igor Sysoev, started work on it in 2002 to tackle the C10k problem, that is, how to support 10,000 concurrent connections on one server. He knew the threaded model wasn’t it.
I know that at least one person will comment with “why not just run PHP through Nginx” as the way to solve my WordPress issue. It’s possible but wouldn’t solve the problem, as PHP wasn’t written in a non-blocking manner. If I switched to something like Node for the blog, I wouldn’t even need Nginx, but WordPress is so damn easy to install and use for the less technical business and marketing folks who rely on it. Hence our situation is solved for now.