Tales of woe: Exchange, SAN and Smoothwall failures

Published by Matt Setchell on

What a week.

It all started last week, when our Exchange server decided to stop working. One of my techies was updating the SSL certificate, as Chrome had started warning that it was out of date. A fairly simple process – but it turned out that changing the SSL certificate surfaced problems left over from when the old Exchange server was removed incorrectly after being decommissioned during the 2010>2013 upgrade. That broke IIS, so no ActiveSync, no Autodiscover, no OWA.

Luckily, once we worked out where the issue lay, a bit of digging in ADSI Edit and IIS Manager resolved it. Phew.
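For reference, the certificate swap that started all this is normally just a couple of Exchange Management Shell commands – a rough sketch below, with the thumbprint as a placeholder rather than anything we actually used:

```powershell
# List installed certificates and note the new one's thumbprint and expiry
Get-ExchangeCertificate | Format-Table Thumbprint, Services, NotAfter, Subject -AutoSize

# Bind the new certificate to IIS (OWA, ActiveSync, Autodiscover) and SMTP
Enable-ExchangeCertificate -Thumbprint "<new-cert-thumbprint>" -Services IIS,SMTP
```

It was only once the new certificate was bound that the leftover references to the old server showed themselves.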

On to the next challenge: one of the volumes on our cluster is down to 9% free, some 200GB, and it decreases a fair bit each week. On investigation I think the culprit is our Smoothwall server, a new VM, which is now logging.
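If you want to keep an eye on the same thing, free space on Cluster Shared Volumes is easy to watch from PowerShell on a cluster node – a minimal sketch, assuming the FailoverClusters module is available:

```powershell
Import-Module FailoverClusters

# Report free space on each Cluster Shared Volume so the weekly drop is visible
Get-ClusterSharedVolume | ForEach-Object {
    $part = $_.SharedVolumeInfo.Partition
    [pscustomobject]@{
        Volume      = $_.Name
        FreeGB      = [math]::Round($part.FreeSpace / 1GB, 1)
        PercentFree = [math]::Round($part.PercentFree, 1)
    }
} | Format-Table -AutoSize
```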

So the next job was to move Smoothwall to a dedicated machine, for two reasons: I was fairly sure the logging and requests were hammering the SAN and iSCSI links, and locating the physical server near the Virgin box would reduce the traffic across the network, as the VM host is in a different building.

As Smoothwall was a VM on our 2008 R2 cluster, I exported the server and then tried to import and boot it on the new 2012 R2 Dell mini server I had got, only to find you can't do that. For future reference, a 2008 R2 export either needs to go 2008 R2 > 2012 > 2012 R2, or you forget exporting altogether and copy the VM's files across directly.
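If you go the copy-the-files route, the gist on the 2012 R2 host is to copy the VHD across and attach it to a freshly created VM rather than importing the old configuration. A rough sketch – all names, sizes and paths below are placeholders:

```powershell
# On the new 2012 R2 Hyper-V host, after copying the VHD over from the old cluster
New-VM -Name "Smoothwall" `
       -MemoryStartupBytes 4GB `
       -VHDPath "D:\Hyper-V\Smoothwall\smoothwall.vhd" `
       -SwitchName "External"

Start-VM -Name "Smoothwall"
```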

A weird gotcha, but a lesson learnt. The box was up and working today; I upgraded the RAM a bit tonight and moved it into the front server room with minimal downtime, though the move did mean a late night – a 12:30 finish.

On the Friday the week before, triumphant after the hassle with Exchange, I was doing easy jobs – one was to get the serial number of the SAN so a supplier could quote on a JBOD (as I say, cluster storage is filling up!). When logging in to the SAN, which admittedly I don't do very often, I noticed it was 'degraded' – DotHill's very speedy support identified a faulty controller, which they replaced today.

The only problem being that the replacement controller (a refurbished unit) has also failed. Luckily we have dual controllers, so there's no immediate danger, but it is a very annoying issue nonetheless – I'm awaiting a reply and a replacement now.

So, an eventful week – but I am focusing on the positives: Exchange was sorted quite quickly considering the weirdness, and the old server is now completely gone; the SAN's failover has worked, twice; the failover cluster failed over; and Smoothwall is on its own box, still running as a VM, so it can easily be moved again.

At least it’s not been boring!