The Copilot service began experiencing problems on Thursday, January 31st, at approximately 11:50 PM EST, and customers were unable to use it consistently until full service was restored at approximately 9:30 AM EST on February 1st. Our external monitoring system detected the problem during the outage, but the SMS alert it sent never reached my phone.
There are two primary areas that we can focus on to quickly improve service:
- The Copilot reflector service will be configured to automatically restart upon a hard failure.
- An escalation path will be configured in our monitoring systems so that alerts also go to a second person, as it is unlikely that SMS will fail for two individuals.
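
As a rough illustration of the two changes above, here is a minimal Python sketch. It is not our actual configuration: the function names `supervise` and `escalate`, the `send_alert` callback, and the contact names are all hypothetical, and a "hard failure" is assumed here to mean the process exiting with a non-zero status. In practice both jobs would more likely be handled by a process supervisor and by the monitoring system's own escalation settings.

```python
import subprocess
import sys
import time
from typing import Callable, List, Sequence


def supervise(command: Sequence[str], max_restarts: int,
              delay_seconds: float = 1.0) -> int:
    """Run `command`, restarting it each time it exits with a non-zero status.

    Stops when the command exits cleanly or `max_restarts` is reached.
    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(command)
        if exit_code == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay_seconds)  # brief pause so a crash loop does not spin


def escalate(contacts: Sequence[str],
             send_alert: Callable[[str], bool]) -> List[str]:
    """Alert each contact in order until one delivery is confirmed.

    `send_alert` is a hypothetical callback that returns True when the
    alert was confirmed as delivered. Returns the contacts that were tried.
    """
    tried: List[str] = []
    for contact in contacts:
        tried.append(contact)
        if send_alert(contact):
            break
    return tried


if __name__ == "__main__":
    # A stand-in for the reflector process that always fails hard.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print("restarts:", supervise(failing, max_restarts=2, delay_seconds=0))

    # Simulate the primary contact's SMS being lost.
    print("alerted:", escalate(["primary", "secondary"],
                               lambda contact: contact == "secondary"))
```

The restart loop is capped so that a service which cannot start at all still surfaces as a failure instead of restarting forever, and the escalation loop only moves on to the next person when delivery to the previous one is not confirmed.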