
Unraveling the Reasons Behind the Major Outage in Microsoft 365, Teams & Outlook: An In-Depth Analysis by ZDNet

Microsoft says an update on a router was behind a huge multi-hour outage affecting the Microsoft Wide Area Network (WAN) that made Azure, Microsoft 365 apps, and Power Platform inaccessible to customers across the globe last week.
The multi-hour outage last Wednesday impacted Microsoft Teams, Exchange Online, Outlook, SharePoint Online, OneDrive for Business, Microsoft Graph, Power BI, M365 Admin Portal, Microsoft Intune, Microsoft Defender for Cloud Apps, and Microsoft Defender for Identity.
Prior to the outage, Microsoft had warned customers that a planned update might cause latency or timeouts from 07:05 UTC on Wednesday when customers attempted to connect to Azure resources in Public Azure regions, Microsoft 365, and Power BI. But as workers in Europe started the day, the update caused more than latency issues and started impacting network devices across the Microsoft WAN, which dropped connections between services in data centers as well as connections on ExpressRoute, Microsoft’s private network for customers to transfer data between data centers.
Microsoft says in its preliminary post-incident review that most regions and services had recovered by 09:00 UTC on Wednesday, but full recovery did not come until 12:43 UTC the same day, 25 January. The outage also affected Azure Government cloud services that were dependent on Azure public cloud, according to Microsoft.
“We determined that a change made to the Microsoft Wide Area Network (WAN) impacted connectivity between clients on the internet to Azure, connectivity across regions, as well as cross-premises connectivity via ExpressRoute,” Microsoft says in its report first spotted by Bleeping Computer.
“As part of a planned change to update the IP address on a WAN router, a command given to the router caused it to send messages to all other routers in the WAN, which resulted in all of them recomputing their adjacency and forwarding tables. During this re-computation process, the routers were unable to correctly forward packets traversing them. The command that caused the issue has different behaviors on different network devices, and the command had not been vetted using our full qualification process on the router on which it was executed.”
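The flooding behavior Microsoft describes can be illustrated with a toy model. The sketch below is not Microsoft's WAN or its routing protocol, which the report does not name. It assumes a hypothetical five-router topology and a simple link-state-style scheme in which an update from one router reaches every other router, and each one rebuilds its forwarding table with Dijkstra's algorithm. All router names and link costs are made up.

```python
from collections import deque
import heapq

# Hypothetical five-router topology with link costs; not Microsoft's real WAN.
TOPOLOGY = {
    "r1": {"r2": 1, "r3": 4},
    "r2": {"r1": 1, "r3": 1, "r4": 5},
    "r3": {"r1": 4, "r2": 1, "r5": 2},
    "r4": {"r2": 5, "r5": 1},
    "r5": {"r3": 2, "r4": 1},
}


def flood(topology, origin):
    """Return routers in the order a flooded update from `origin` reaches them."""
    seen = [origin]
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in topology[node]:
            if neighbor not in seen:
                seen.append(neighbor)
                queue.append(neighbor)
    return seen


def forwarding_table(topology, source):
    """Run Dijkstra from `source` and map each destination to its next hop."""
    done = set()
    next_hop = {}
    # Heap entries are (path cost, router, first hop on the path from source).
    heap = [(0, source, "")]
    while heap:
        cost, node, first = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node != source:
            next_hop[node] = first
        for neighbor, weight in topology[node].items():
            if neighbor not in done:
                heapq.heappush(heap, (cost + weight, neighbor, first or neighbor))
    return next_hop


if __name__ == "__main__":
    # One change announced by r1 reaches every router, and each one recomputes.
    reached = flood(TOPOLOGY, "r1")
    tables = {router: forwarding_table(TOPOLOGY, router) for router in reached}
    print(f"{len(reached)} of {len(TOPOLOGY)} routers recomputed their tables")
    print("r1 next hops:", tables["r1"])
```

Even in this small graph, a single change at `r1` makes all five routers rebuild their tables. In the incident, Microsoft says routers could not forward packets correctly while that recomputation was in progress.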
Microsoft’s monitoring systems detected domain name service (DNS) and WAN issues at 07:12 UTC. At 08:20 UTC, while automatic recovery was already under way, engineers reviewing recent changes identified the “problematic command” behind the issues.
“Due to the WAN impact, our automated systems for maintaining the health of the WAN were paused, including the systems for identifying and removing unhealthy devices, and the traffic engineering system for optimizing the flow of data across the network,” Microsoft said.
“Due to the pause in these systems, some paths in the network experienced increased packet loss from 09:35 UTC until those systems were manually restarted, restoring the WAN to optimal operating conditions. This recovery was completed at 12:43 UTC.”
Microsoft says it has now “blocked highly impactful commands from getting executed on the devices” to mitigate future occurrences. It’s also now requiring all command execution on network devices to follow safe change guidelines.
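Microsoft has not published how this block is implemented. As a rough illustration only, the sketch below shows one way a pre-execution check could refuse high-impact commands on device models where they have not been qualified. The command prefixes, model names, and the `check_command` helper are all hypothetical and do not reflect any real vendor CLI or Microsoft tooling.

```python
from dataclasses import dataclass

# Hypothetical command prefixes treated as high-impact (not real CLI syntax).
HIGH_IMPACT_PREFIXES = ("clear routing", "set interface address")

# Hypothetical (device model, command prefix) pairs that passed qualification.
QUALIFIED = {("model-a", "set interface address")}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def check_command(device_model: str, command: str) -> Decision:
    """Allow a command unless it is high-impact and unqualified on this model."""
    # Normalize case and whitespace so trivial variations cannot bypass the check.
    normalized = " ".join(command.lower().split())
    for prefix in HIGH_IMPACT_PREFIXES:
        if normalized.startswith(prefix):
            if (device_model, prefix) in QUALIFIED:
                return Decision(True, f"'{prefix}' is qualified on {device_model}")
            return Decision(False, f"'{prefix}' is not qualified on {device_model}")
    return Decision(True, "not classified as high-impact")


if __name__ == "__main__":
    for model, cmd in [
        ("model-a", "set interface address 10.0.0.1"),
        ("model-b", "set interface address 10.0.0.1"),
        ("model-b", "show version"),
    ]:
        decision = check_command(model, cmd)
        print(model, "|", cmd, "|", decision.allowed, "|", decision.reason)
```

Keying the check on both the device model and the command reflects Microsoft's point that the same command can behave differently on different network devices.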
Microsoft plans to publish a final post-incident report within the next two weeks.
- Title: Unraveling the Reasons Behind the Major Outage in Microsoft 365, Teams & Outlook: An In-Depth Analysis by ZDNet
- Author: Donald
- Created at : 2025-01-01 22:08:19
- Updated at : 2025-01-05 20:56:10
- Link: https://some-tips.techidaily.com/unraveling-the-reasons-behind-the-major-outage-in-microsoft-365-teams-and-outlook-an-in-depth-analysis-by-zdnet/
- License: This work is licensed under CC BY-NC-SA 4.0.