As I mentioned in the previous post, we've taken the first steps toward replacing the load balancing layer that stands between you and our services. There was no measurable downtime during this phase, and we will do our best to keep it that way in the coming weeks as we continue the transition.
I'll post more info on our plans as soon as we've worked out the details. For now, I would expect another small maintenance period within the next two weeks.
We have recovered from the previous outage, which left services intermittently unavailable for nearly half an hour. If this impacted your day-to-day operations, please contact us so we can make it right.
The problem itself was caused by a malfunctioning load balancer, which has been taken out of service and replaced by its warm spare. Due to a string of recent outages, we will begin replacing this layer this Sunday at 05:00 GMT. As we have already been testing new load balancers for select sites, we do not anticipate any customer impact: the switchover will take less than a second. After the transition, we will be watching things closely to help ensure that there is no further downtime.
Between 7:00 and 7:45 AM GMT on 3/6/2010, FogBugz and Kiln On Demand experienced a 45-minute period of intermittent access issues. The issues were sporadic at first, but had become a showstopper for many customers by 7:39 AM.
It took some time to track down, but we have linked this problem to a definite bug in our load balancing platform and have taken steps to remedy it.
We apologize for any inconvenience this may have caused you, and appreciate your patience.