The intermittent Kiln errors posted above have been resolved by rolling back the leaked generation. We'll resume deployment once the problem's root cause has been tracked down and fixed. We apologize for the inconvenience -- please contact Support if you have any further issues.
A new generation of Kiln, currently leaked to 10% of our customers, began throwing intermittent 502 and 503 errors this morning at about 10:55 EST. We're working on stabilizing the errant application pool and will post updates as they become available.
We will be preemptively moving tempdb to a dedicated drive on our master database server on Sunday, between 04:00 and 06:00 GMT. This requires a restart of SQL Server. Actual downtime should be brief but may occur at any time during the maintenance window.
Starting at approximately 1:15 PM today, several of our customers began experiencing intermittent errors with Kiln: 500 errors on push and/or "Kiln has overheated" errors when using the Kiln UI. The problem was caused by one of our web servers. That web server has been taken out of the pool, and the problem appears to have gone away. We will continue to look into what caused the server to fail and how we can prevent this problem from occurring again.
The delayed email and indexing issue above has been resolved. An account in a particular state was causing the update service to fail unexpectedly. We've removed that account from the service and are working to determine exactly what was broken and why it was breaking so dramatically.
In the meantime, updates are now being processed for all FogBugz On Demand accounts, but delays may linger for the next hour or so as everything gets caught up.
Due to a failure in one of our asynchronous update processes, about half of our FogBugz On Demand accounts are experiencing delayed emails (inbound and outbound) and indexing. We're actively working on the issue and will post updates as they become available.
Around 10:20 AM EDT, we started to get reports of 503 System Unavailable messages for some FogBugz On Demand accounts. The errors were due to an erroneous fix that was pushed to one of our servers. The fix was backed out in less than five minutes and will be re-evaluated before we retry it.
Most customers are reporting no loss of in-progress edits.
One of our On Demand database servers went offline shortly after 11 AM EST today. No data has been lost; the system is doing fine and is coming back online now. If you are affected by this outage, you will likely see an error message rather than the login page for your account. We expect all databases to be back online within the hour.
Over the coming week, we will investigate why the system shut down and will post a follow-up entry with details and a resolution.