[30/01/09] Neptune downtime

 
NetHosted - Andrew
 NetHosted Staff

 

 Joined: 22 Mar 2004
 Posts: 7017
 

Posted: Fri Jan 30, 2009 3:33 pm    Post subject: [30/01/09] Neptune downtime
 
Hi,

As many of you are aware, there was downtime on Neptune this morning. It took a while to be picked up, for two main reasons:

1) The tech who was supposed to be online at the time wasn't. This has been dealt with internally.

2) Neither of our two monitoring systems (our own, and a third party) picked up on the downtime, as explained below.

This downtime appears to have been caused by a firmware bug in the RAID card on this server. Under heavy I/O load, the RAID card started to queue commands instead of actioning them, until it basically gave up due to the number of queued commands. This caused the filesystems to be remounted read-only, which is why the two monitoring systems didn't pick up on the downtime. They both check that services are available, not what the services actually return. So although httpd was down as far as you were all concerned (it was showing "Forbidden"), as far as the monitoring software was concerned it was up, because it was still responding.
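
To illustrate the read-only condition described above, here is a minimal sketch of how a host-side check could spot it on Linux by reading the mount table. This is not NetHosted's actual monitoring code: the function names, the watched mount points, and the use of /proc/mounts are all assumptions made for the example.

```python
def read_only_mounts(mounts_text, watched=("/",)):
    """Return the watched mount points whose mount options include 'ro'.

    mounts_text is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    is_read_only = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if mount_point in watched:
            # A later entry for the same mount point overrides an earlier one.
            is_read_only[mount_point] = "ro" in options
    return sorted(point for point, ro in is_read_only.items() if ro)


def check_local_mounts(watched=("/",), path="/proc/mounts"):
    """Read the live mount table (Linux only) and report read-only mounts."""
    with open(path) as handle:
        return read_only_mounts(handle.read(), watched)
```

A monitoring agent could call check_local_mounts(("/", "/home")) periodically and raise an alert whenever the returned list is not empty.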

We control one of these monitoring solutions, and we will be changing it to pick up on this specific edge case.
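
As a rough idea of what a check for this edge case could look like, the sketch below looks at what the web server actually returns, not just whether it answers. It is an illustration under assumptions, not the change being made to our monitoring: the function names, the URL passed in, and the marker text expected in the page are all hypothetical.

```python
import urllib.error
import urllib.request


def evaluate_response(status, body, expected_marker):
    """Decide whether a response counts as healthy.

    Healthy means HTTP status 200 and a known piece of text in the body,
    so a server that answers with "Forbidden" is reported as down.
    """
    if status != 200:
        return False, f"unexpected status {status}"
    if expected_marker not in body:
        return False, "expected marker missing from body"
    return True, "ok"


def check_url(url, expected_marker, timeout=10):
    """Fetch url and evaluate the status code and body, not just reachability."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return evaluate_response(response.status, body, expected_marker)
    except urllib.error.HTTPError as error:
        # The server responded, but with an error status such as 403.
        return evaluate_response(error.code, "", expected_marker)
    except (urllib.error.URLError, OSError) as error:
        # No usable response at all: refused connection, DNS failure, timeout.
        return False, f"no response: {error}"
```

With a check like this, a server that is still responding but only serving errors is flagged, which is the case that slipped through this morning.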

Tonight we will take action to fix this bug with the RAID card. We may also re-flash the BIOS, as there is one potential bug in our current version that may also have contributed to this. There will be a small amount of downtime while this work takes place.

Please accept our apologies for this unusual downtime, and rest assured that we are putting multiple measures in place to stop this from happening again (including a certain member of staff investing in multiple alarm clocks).

Thanks,

Andrew

_________________
| Andrew Bassett
| Managing Director, NetHosted Ltd.
| Follow us on Twitter: http://twitter.com/nethosted 
| Members, tell us what you think  of NetHosted!