OK, that seems to have fixed it…

Over the last few days, unicorns and regular readers[1] may have noticed this site disappearing from time to time. This caused me to get splinters from all the head scratching. Was it related to a recent update to the host the virtual server runs on? Was it some oddness in the file system? Was it that I was running an old OS which needed some serious updating? Well, no.

Here’s what was happening. The Apache web server is configured to allow up to 150 connections at the same time – for a small server like mine, this is fine, as connections are normally dropped as soon as the page or image (or whatever) has been delivered to the visitor’s browser. But something odd was going on. This is a graph showing the numbers of “workers” – basically the processes talking to web browsers:

What About the Workers


See the peak towards the left? That’s a point at which there were 150 workers allegedly serving data. At that point, further connections couldn’t be made, visitors would be unable to load pages, and things would not be good.

The sudden drop occurred when I restarted the web server, which cleared all those dodgy connections, only for them to start rising again. And so on, with it taking a number of hours for the server to become unresponsive. Just the webby bit, that is – I was still able to log on to the console and type arcane commands.[2]
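For anyone who wants to watch for this kind of saturation without a graph, here is a minimal sketch. It assumes Apache's mod_status is enabled and that its machine-readable output (the status page requested with `?auto`) has already been fetched as text. The function names and the sample text are made up for illustration; only the 150 limit comes from the setup described above.

```python
def parse_worker_counts(status_text):
    """Pull the BusyWorkers and IdleWorkers figures out of mod_status '?auto' text."""
    counts = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in ("BusyWorkers", "IdleWorkers"):
            counts[key] = int(value.strip())
    return counts


def is_saturated(counts, max_workers=150):
    """True when every allowed worker is busy, so new visitors cannot connect."""
    return counts.get("BusyWorkers", 0) >= max_workers


# Hypothetical sample of the status text at the peak of the graph.
sample_status = """Total Accesses: 1234
BusyWorkers: 150
IdleWorkers: 0
Scoreboard: WWWWWWWW
"""

counts = parse_worker_counts(sample_status)
print(counts, "saturated" if is_saturated(counts) else "ok")
```

Something like this could be run from cron to raise the alarm before the site vanishes, though the threshold and how the status text is fetched would depend on the server's own configuration.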

While this was going on, all those processes were using up more and more memory, leading to fun graphs like this:

Thanks for the Memory


Now, a quiet site like this (and a few others that live on the server) really shouldn’t be eating all 4GB of RAM, and even causing the swap partition to be used, so something was clearly wrong.

Digging into the network connections with the netstat command, and pulling up the Apache server status page in a text browser on the server, showed what was going on: a couple of IP addresses were persistently hitting a file on a WordPress site (not this one, oddly enough), either in an attempt to make it do something or as a traditional denial of service attack. As the site in question doesn’t need the features that file provides, I’ve made it inaccessible to the outside world with an entry in the .htaccess file. I could have just blocked the IP addresses, but it struck me that the attack might simply move on and come from other addresses.
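The netstat part of that digging boils down to counting connections per remote address. Below is a rough sketch of the idea, assuming Linux-style `netstat -tn` output and a web server listening on port 80. The function name and the sample lines (which use documentation-only IP ranges) are invented for illustration and are not the actual addresses or commands involved.

```python
from collections import Counter


def count_remote_ips(netstat_output, local_port=80):
    """Count TCP connections to local_port, grouped by remote IP address."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP connection line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_addr, remote_addr = fields[3], fields[4]
        # The port is whatever follows the last colon (works for IPv6 too).
        if local_addr.rsplit(":", 1)[-1] != str(local_port):
            continue
        counts[remote_addr.rsplit(":", 1)[0]] += 1
    return counts


# Hypothetical sample output; real output would come from running netstat.
sample_netstat = """Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:80           198.51.100.7:51234      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.7:51235      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.7:51236      CLOSE_WAIT
tcp        0      0 192.0.2.10:80           203.0.113.5:40000       ESTABLISHED
tcp        0      0 192.0.2.10:22           203.0.113.9:50000       ESTABLISHED
"""

for ip, n in count_remote_ips(sample_netstat).most_common():
    print(ip, n)
```

An address that keeps turning up at the top of a list like this, out of all proportion to normal traffic, is the sort of thing worth cross-checking against the server status page.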

Having done that last night, here’s what the graphs looked like this morning:

Workers Holiday


The sudden drop is when .htaccess started telling the “visitors” to go away. Now you can see some processes running on the server ready to deliver pages and a much smaller couple of wavy lines showing actual activity. Similarly, the memory graph looks a lot more sane:

Now I remember


So there it is. All good fun, or something. I hate to think how much time I spent on this, but a lot of that did involve an overdue OS upgrade, so never mind…

[1] One being more mythical than the other. I forget which…
[2] The best kind