Microsoft’s Azure cloud platform is suffering a widespread outage that has affected several websites, services and apps, including Microsoft 365, Xbox and NatWest. It has reportedly even halted voting at the Scottish Parliament. Microsoft says a DNS configuration change is to blame and is attempting a rollback while rerouting traffic to healthy infrastructure.
The disruption first registered as spikes in outage reports on Downdetector earlier today, and Microsoft’s Azure status page later confirmed problems with Azure Portal access. The status page showed Azure’s network infrastructure as “critical” in every region in the world, underscoring the global scale of the problem.
Microsoft says issue identified
Microsoft said it believed the outage was the result of “an inadvertent configuration change”, and that it planned to remedy the situation by rolling the service back to a recent configuration known to be functioning correctly.
“We’ve identified a recent configuration change to a portion of Azure infrastructure which we believe is causing the impact. We’re pursuing multiple remediation strategies, including moving traffic away from the impacted infrastructure and blocking the offending change.”
Microsoft later said that it had halted the rollout and was deploying a previous configuration.
“We’ve halted the rollout of the impacting configuration change. We’re continuing to route service traffic away from affected infrastructure to recover service availability. In parallel, we’re working to revert the impacted infrastructure to a previous state,” said the update.
“We’re deploying a previous healthy configuration to the affected infrastructure to resolve this issue. This is being done in tandem with efforts to rebalance traffic across healthy infrastructure to mitigate impact quickly.”
“We’re rerouting affected traffic to alternate healthy infrastructure as a near-term resolution while our investigation into the source of the issue is ongoing,” the company added.
“We’ve identified portions of internal infrastructure that are experiencing connectivity issues. We’re unblocking these systems and redistributing traffic to support recovery, as we continue our work to reroute affected traffic to restore service health,” Microsoft said in another update.
Status update on Azure page
We have initiated the deployment of our last known good configuration, which is expected to complete within 30 minutes. As this deployment progresses, customers should begin to see initial signs of recovery. Once completed, we will begin recovering nodes and routing traffic through these healthy nodes.
Customer configuration changes will remain temporarily blocked while we continue mitigation efforts. We will notify customers once this block has been lifted.
Some customers may also have experienced issues accessing the Azure management portal. We have failed the portal away from AFD [Azure Front Door] to mitigate these access issues. Customers should now be able to access the Azure portal directly, and while most portal extensions are functioning as expected, a small number of endpoints (e.g., Marketplace) may still experience intermittent loading problems.
We do not yet have an ETA for full mitigation, but we will provide another update within 30 minutes, once the deployment has completed.
Customers may also consider implementing failover strategies using Azure Traffic Manager to redirect traffic from Azure Front Door to their origin servers as an interim measure.
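For readers unfamiliar with the failover idea in that last recommendation, the logic can be sketched in a few lines of Python. This is only an illustration of priority-based failover in general, not Azure Traffic Manager's API or Microsoft's implementation: the `Endpoint` class, the `pick_endpoint` function and the endpoint names are all hypothetical, and the health check is passed in as a plain function.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Endpoint:
    name: str      # hypothetical label, not a real Azure resource name
    priority: int  # lower number means "try this one first"


def pick_endpoint(
    endpoints: Iterable[Endpoint],
    is_healthy: Callable[[Endpoint], bool],
) -> Optional[Endpoint]:
    """Return the healthy endpoint with the lowest priority number, or None."""
    for endpoint in sorted(endpoints, key=lambda e: e.priority):
        if is_healthy(endpoint):
            return endpoint
    # Nothing passed its health check, so there is nowhere to send traffic.
    return None
```

With a front-door endpoint at priority 1 and an origin server at priority 2, traffic goes to the front door while it is healthy and falls back to the origin when it is not, which is the interim behaviour the status update describes.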
What exactly are DNS problems?
Microsoft’s update outlines the mechanics of the issue. It says that the domain name system, or DNS, is the service that translates human-readable internet addresses into the machine-readable IP addresses that connect browsers and apps with websites and underlying web services. The company warned that DNS errors disrupt this translation process, interrupting the connection, and noted that because so many sites and services run on Microsoft’s cloud, a DNS failure can have a far-reaching impact.
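The translation step described above can be seen from any machine with Python installed. The sketch below uses the standard library's `socket.getaddrinfo` to ask the local resolver for a hostname's IP addresses; the `resolve` helper and its `lookup` parameter are hypothetical names chosen for illustration, and the code says nothing about how Azure's own DNS is built.

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses for hostname.

    Returns an empty list when the name cannot be resolved, which is
    roughly what an app sees during a DNS failure.
    """
    try:
        results = lookup(hostname, None)
    except socket.gaierror:
        # The resolver could not translate the name into an address.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first item of sockaddr is the IP address as a string.
    return sorted({result[4][0] for result in results})
```

Calling `resolve("example.com")` returns whatever addresses the local resolver hands back at that moment. When the lookup fails, the function returns an empty list, and a browser or app in that position has no address to connect to even if the server itself is running.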







