Falling Out Of Love With Siteground

In the past, I've rated Siteground Hosting very highly, and recommended them to anyone asking about US-based dedicated servers (Heart would be my first choice for UK-based dedicated servers or VPS). Unfortunately, experience has worn me down.

To be clear, I'm not, and never have been, a Siteground customer. However, some of the people I do some work for are, so I occasionally have to escalate things to Siteground, or step in when Siteground have asked their customer to take some action.

I've been quietly sitting on some of these frustrations for a little while, but in the last week a few more have been added, tipping the balance in my mind.


The First Thing They Did Right

The point of this post isn't to slate Siteground, and to be fair to them, the very first thing they did that impressed me is something they still appear to do: when you order a new dedicated server, it's well shielded from the Internet.

If you want SSH or WHM access, you'll have to ask them to add your IP to the firewall. You'll then need to upload your RSA public key as password authentication is disabled for SSH. Some would find this an inconvenient and frustrating process to go through, but frankly it brightened my day to see a host taking security seriously.
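
For anyone who hasn't been through it, the dance goes roughly like this (standard OpenSSH commands; the key type and paths are the usual defaults rather than anything Siteground-specific):

# Generate a keypair locally; the public half is what you send them to install
ssh-keygen -t rsa -b 4096 -f ~/.ssh/siteground_rsa
# Once they've installed it (and whitelisted your IP), authenticate with the private key
ssh -i ~/.ssh/siteground_rsa username@server.example.com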

They do actively monitor the systems too. Quite some time back I recompiled Apache on a server (but forgot to tell them), and they phoned their customer up at 4am (sorry, my bad) to ask whether he knew that someone in the UK was recompiling Apache. I'd had to do it because Apache had been compiled without Large File Support, which was causing Akeeba to fail.
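
For the curious, Large File Support mostly comes down to compile-time flags on a 32-bit build. A rough sketch of the sort of rebuild involved (not the exact commands I ran, and on a cPanel box you'd normally go through EasyApache instead):

# _FILE_OFFSET_BITS=64 lets a 32-bit Apache (and anything embedded in it,
# such as mod_php) handle files larger than 2GB
CFLAGS="-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64" ./configure --prefix=/usr/local/apache2
make && make install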

There are also other things they do well (their support is prompt and polite), but unfortunately the negatives cause me some real concern.


The Negatives

These issues have arisen on different servers, with different support staff. Some of them concern the sysadmin within me; others are more a 'customer service' type of thing.


Geeky and Performance Bundles

It's been my experience that taking these two bundles together is a big mistake. There's nothing wrong with the technology they're deploying (generally speaking, a Varnish cache on port 80 with Apache listening on another port - the performance bundle), but if you hit any kind of unusual issue, you're likely to be dead in the water.
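
For context, the split is easy enough to see from the server itself; the back-end port below is an assumption on my part, as it varies:

# Varnish answering on :80, with Apache pushed back to (say) :8080
netstat -tlnp | egrep ':80 |:8080 '
# Varnish also tends to betray itself in the response headers
curl -sI http://www.example.com/ | egrep -i 'via|x-varnish'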

The problem is two-fold: if you're on the Geeky bundle, it's a fully managed server and they won't give you the root password. That in itself isn't a major issue (though I wouldn't opt for it myself), until you hit a support issue.

The level of investigation into your issue may not be nearly thorough enough, and you'll be entirely reliant on it to get the issue resolved. Because you don't have the root password, investigation is difficult if you bring a third party (like me) in to try and get the problem resolved.

The issue I experienced had the following symptoms:

  • Users were receiving intermittent 503s on some URLs
  • The Apache logs showed the content had been served to Varnish with a 200
  • The requests were always being passed from Varnish on to the Apache origin

The customer was told it was likely an issue with their site, and pretty much left to try and work it out for themselves.

After testing, and monitoring the system as best I could as an unprivileged user, I found that the cause was the Content-Length header.

If an extension set this header explicitly (e.g. 'Content-Length: 500') but the response was subsequently gzipped, Varnish would return an HTTP 503, because the declared length no longer matched the compressed body (it was expecting 500 bytes of content and got, say, 300).
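
You can see the mismatch for yourself by bypassing Varnish and comparing what Apache declares against what it actually sends. A sketch, assuming Apache is on port 8080 and using a hypothetical URL:

# Request the affected page with gzip, saving headers and body separately
curl -s -o body.gz -D headers.txt -H 'Accept-Encoding: gzip' http://127.0.0.1:8080/broken-page
grep -i '^Content-Length' headers.txt
wc -c < body.gz
# If the byte count is smaller than the header claims, Varnish will serve a 503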

It was, technically, an issue with extensions on the site, but from a customer service point of view: how many customers could have worked that out? And Siteground should have picked up on the fact that the issue was occurring somewhere in the communication between Varnish and Apache.

The best fix we could come up with between us was to unset the Content-Length header in .htaccess.
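
For anyone hitting the same thing, it amounts to a couple of mod_headers lines; something like this (assuming mod_headers is loaded, and a cPanel-style path):

# Appended to the site's .htaccess: drop the stale Content-Length so it's
# recalculated after the body has been compressed
cat >> /home/username/public_html/.htaccess <<'EOF'
<IfModule mod_headers.c>
Header unset Content-Length
</IfModule>
EOF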

I'm more of an Nginx guy than a Varnish guy, so I was relying on Siteground for the more advanced stuff, but they couldn't think of another way around the issue either (aside from working through an entire Joomla site looking for extensions that set Content-Length).

There have been any number of issues with Varnish on that system, and frankly, at times it's felt like I've had to hold Siteground's hand to get to the root cause. Even simple things have slipped through, like the cache hit rate sitting at 0% (i.e. Varnish giving no benefit whatsoever) after 172 days of uptime.
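
Checking isn't exactly arduous, either. Counter names vary between Varnish versions, but it's a one-liner along these lines:

# Hit rate = cache_hit / (cache_hit + cache_miss); zero hits after 172 days
# of uptime means Varnish is passing everything straight through to Apache
varnishstat -1 | egrep 'cache_hit|cache_miss'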

Is it unreasonable, if the customer is paying for a performance booster, to ask that the host occasionally checks the deployed software is working as intended? Perhaps I'm asking too much, but my view is that if you are specifically selling a piece of software as an 'improvement' to the customer's service, you should be pretty hot on supporting it.


Investigating Signs of a Compromise

This one made the sysadmin in me want to go and scream in a cupboard.

Siteground evidently received a report of this server sending spam, including details of an example mail. They jumped onto the server, checked the sent mail, disabled all sites via a .htaccess file containing 'deny all' and then notified the customer. That's it (I checked the history...).

Disabling the sites is something I'd have done too, but my investigation would never have stopped there. The example mail they included in their notification had the following within its headers:

SCRIPT_FILENAME=/home/changedusername/public_html/modules/mod_articles_latest/getinfoljy.php 
REQUEST_URI=/modules/mod_articles_latest/getinfoljy.php
PWD=/home/changedusername/public_html/mod_articles_latest

So it's patently obvious that the site has been compromised. Even leaving aside the fact that anyone as involved with Joomla! as Siteground should know that file isn't part of the core mod_articles_latest module, it's clear from the headers that the sending of the mail was triggered via a web request.

Yes, the customer's site has obviously been compromised, and they need to sort that out. But you know what? An attacker has been able to execute arbitrary code on a server that you are supposed to be managing.

Is it unreasonable to think that a Sysadmin might like to check whether the attacker has made any attempts (or worse, succeeded) at privilege escalation? Or to identify whether /etc/shadow and /etc/passwd might have been downloaded?
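
Even a first pass needn't take long. The sort of quick triage I'd expect looks something like this (a sketch, far from exhaustive, and the exact log paths depend on the distro):

# Any logins that shouldn't be there?
last -a | head -20
# Recently modified setuid binaries are a classic escalation footprint
find / -xdev -type f -perm -4000 -mtime -7 -ls
# Were the password/shadow files changed around the time of the spam run?
stat /etc/passwd /etc/shadow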

From what I could see, there was absolutely no attempt made to assess the scale of the compromise.

Worse, the spam mail didn't seem to have been properly looked at:

When I asked Siteground on a ticket whether rotated logs were archived (it's sometimes quicker to ask than to check), they responded to say the Exim logs weren't retained once rotated out.

I could have been clearer in my question, but the response gave the distinct impression that they hadn't even realised the spam run was triggered via a web request. That would mean that, through failing to analyse the data at hand, they were trying to handle a potential compromise without a clear understanding of its method, mechanism or nature.
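
The web logs would have told them as much within seconds; something like the following, using the script name from the mail's own headers (the log path is a cPanel-style guess on my part):

# Every hit on the dropped script: attacker IPs, timing, and quite often
# the original upload vector a few lines earlier in the same log
grep 'getinfoljy.php' /usr/local/apache/domlogs/changedusername/*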

If there's one time you need support to be skilled and efficient, it's when investigating a known or suspected compromise. The attitude in this case seems to have been "server has been sending spam, shut the sites down and let the customer sort it out".

We can argue that the web stack is the customer's responsibility, but what sysadmin doesn't check that a compromise of the web stack hasn't led to a compromise of the OS? (Even script kiddies have access to known root exploits, after all.)


Re-Using Servers

A couple of times, I've been asked to look into a few things on Siteground servers and found signs of domains that the customer doesn't, and never has, owned. The first time, my immediate thought was that the server had been compromised and was hosting an additional domain.

A little investigation showed this wasn't the case. The files I'd found (old .htaccess backups, mail directories, etc.) predated the customer taking over that server, sometimes by as little as a few hours.
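
It's a trivially easy thing to spot once you know to look. Roughly the sort of check involved (hypothetical paths):

# With -t the oldest entries sort last; their timestamps predate the
# account that now owns them
ls -lact /home/exampleuser/mail/ | tail -5
stat /home/exampleuser/public_html/.htaccess.bak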

The only explanation I've been able to come up with is that Siteground aren't re-imaging between customers. It looks for all the world like they've logged onto a decommissioned server, attempted to clear out the home directory, and renamed the user account.

If so, that's absolutely unforgivable. Whilst I'd always go through and remove files before I stop using a server, I'd expect the host to perform a re-image to ensure that none of my (or worse, my customer's) data remains on the server. As their customer, it's not an unreasonable expectation: I have no control over who the next user will be, and any traces of my data should be purged.

Siteground were (of course) asked about this at the time, and the explanation they gave was that the mail must have been migrated when the (real and valid) domains were migrated to the server. I keep detailed logs of migrations, and this definitely wasn't the case: no mail was migrated (in point of fact, one of the old servers didn't even have an MTA installed, and didn't use cPanel, so everything was moved manually). Not to mention that those domains were never hosted on the old server, nor were the domain names owned by the customer.

The old .htaccess files aren't the biggest deal, but the mail directories (sometimes with gigabytes of mail still sitting in the Maildir cur and new folders) could be very revealing should the next user decide to trawl through them. And what happens if a database is missed?

The best way to be safe and sure is to re-image between customers. With the correct processes in place, it doesn't take long and can be almost completely automated; in fact, once set up, it probably takes longer to go in and remove the data manually.

These may have been isolated incidents, however; when I quickly moved on to check other Siteground servers for the same thing, I found no traces.


Conclusion

Some of the shine has definitely gone from my earlier impressions of Siteground's hosting. Their prices aren't bad and they use solid kit (from what I've seen), but I can honestly say I really wouldn't want to rely on their support when things go wrong.

I'd certainly never be willing to take out a dedicated server with them under the Geeky bundle; not having root access can really prolong downtime if you need to investigate an issue that, frankly, they should be looking further into.

The failure to re-image servers is, in my opinion, negligent and unforgivable. Customer data is the most precious thing a business has to protect, and as hosts we should be ensuring that our processes help protect both our customers and their customers wherever possible. These incidents were avoidable.

My experiences may well be isolated incidents; there have been a few of them, but that's still likely to be a tiny volume against the number of support requests they get, and we all make mistakes from time to time. Their support team are prompt and polite; they just don't seem to be quite as hands-on as I'd expect them to be (or perhaps that's down to a separate team; I've seen references to their server team in tickets).

All that said, their 'locked down by default' policy is to be commended, and I wish more web hosts would take that approach.