We sincerely apologize for the disruption this caused. We understand the importance of service availability and deeply regret any inconvenience to our users and teams.
The issue was resolved by migrating workloads to a healthy datacenter and restoring dependencies, including CosmosDB and Redis.
✅ Manual failover of services – Migrated critical workloads out of the affected AKS cluster.
✅ Telemetry rerouting – Reconnected and validated monitoring systems in the alternate region.
✅ Redis instance recovery – Identified and resolved a stuck disk allocation that prevented Redis failover.
✅ CosmosDB connectivity restored – Waited for Azure region-wide services to stabilize post-outage.
We deeply regret the impact this incident had on our users and teams. Ensuring seamless service availability is our priority, and we acknowledge that this issue could have been avoided with a better Redis failover strategy and stronger multi-region preparedness.
We are committed to preventing similar incidents by implementing Redis Sentinel, region-aware resilience practices, and faster automated recovery. Thank you for your patience and trust as we continue to harden our infrastructure.
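For illustration only, the region-aware selection idea behind these improvements can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual remediation: the function name `pick_endpoint`, the endpoint names, and the injected health check are all hypothetical and do not come from the incident itself.

```python
from typing import Callable, Optional, Sequence


def pick_endpoint(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint, in preference order, that passes the health check.

    `endpoints` is ordered from most to least preferred region.
    `is_healthy` is a caller-supplied probe (hypothetical here); a probe that
    raises OSError is treated the same as an unhealthy result.
    """
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except OSError:
            # A probe that cannot connect counts as unhealthy; try the next region.
            continue
    return None  # No healthy endpoint in any region.


# Hypothetical usage: the primary region is down, the secondary is healthy.
regions = ["redis.primary.example", "redis.secondary.example"]
healthy = {"redis.secondary.example"}
chosen = pick_endpoint(regions, lambda name: name in healthy)
```

Keeping the health check as an injected function means the selection logic can be exercised without a live Redis or Azure dependency.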
Owner: Matus Szepe
We have now fully resolved the incident that began earlier today due to Azure AKS node failures in the northeurope-3 region.
✅ All core services have been restored
✅ Telemetry and alerting pipelines are operational
✅ Azure CosmosDB is now fully functional, and dependent services are behaving as expected
We will proceed with a full post-incident review to analyze the root causes, validate recovery steps, and implement long-term improvements to prevent recurrence.
Thank you for your patience throughout this disruption.
🚧 Incident Update – Recovery in Progress
We have mitigated the initial impact from the Azure AKS node failures in the northeurope-3 datacenter, and most services have been restored.
However, we are still experiencing degraded performance and intermittent connectivity issues with Azure CosmosDB, which is affecting some components dependent on database access.
We are monitoring recovery progress closely and will provide updates as we learn more.
Current Status:
✅ Core services operational
⚠️ CosmosDB access partially degraded
🛠 Recovery in progress
Thank you for your patience as we continue to restore full service reliability.
🚨 We are currently investigating a service disruption affecting parts of our EU and Telemetry clusters. Initial signs point to infrastructure issues within the Azure North Europe region. Some services may be intermittently unavailable or degraded. Our team is actively working to identify the scope and restore normal operations as quickly as possible.
We will provide further updates as we learn more.