Azure outage disconnects thousands from Outlook and Teams around the world
Azure, the Microsoft cloud that delivers business-critical services like Outlook and Teams, suffered a significant outage on January 25th.
The company has not released even vague numbers for connections that went down, but the outage was reported in the Americas, Europe, Asia Pacific, the Middle East and Africa. The UK alone reported more than 5,000 cases of outage, and both India and Japan also reported thousands of instances.
The outage was initially reported to Downdetector, the website that logs and monitors events when major websites or servers go down, and the number of cases rose relatively rapidly.
Microsoft eventually announced its belief that a network issue was responsible – in particular, a connectivity issue with the Microsoft Wide Area Network (WAN), which connects Microsoft clients to the Azure servers.
However many instances of outage there actually were, the one thing that can be said with cast-iron confidence is that the situation could have been significantly worse – Azure has 15 million corporate clients around the world, and over 500 million active clients. The point is that an outage of the connectivity between Azure and its clients can cause a domino effect within individual businesses and supply chains, leaving businesses paralyzed and their standard communication channels defunct for as long as the outage goes on.
Ironically, in Microsoft-dependent businesses, Teams often works as a backup comms system in the event that anything goes wrong with Outlook. But obviously, something as simple as a WAN failure can knock out both connections to Azure simultaneously.
Such a failure can affect – and seems to have affected – the connectivity between data centers and Azure, as part of the domino effect that a seemingly simple outage can cause. That’s a much bigger concern, because data centers can be absolutely business-critical. An outage is less serious than, for instance, data theft, because the connectivity issue will always eventually be resolved, but the length of the disconnection can still significantly impact those businesses.
Rollin’, rollin’, rollin’.
Microsoft subsequently announced a rollback of a network change that it says was responsible for the outage, and Downdetector acknowledged that the numbers of outage reports had turned from a flood to a trickle.
One fact that’s at least probably irrelevant is that China is the only main market which reported no outages in the January incident. It’s probably irrelevant because a Microsoft rollback seems to have eventually fixed the issue, suggesting a simple WAN connectivity issue, rather than anything with geopolitical implications.
But, while the Azure disconnection issue is incredibly unlikely to make businesses rethink their policies on going cloud-native, it serves as an example of just one of the things that can happen if your entire business is dependent on a single cloud infrastructure and has no cloud-free infrastructure to use in times of need.
Ironically perhaps, Azure boosted Microsoft’s fiscal second-quarter earnings just a day before the outage. Despite the market fear that cloud spending might well be significantly cut in the next fiscal year as clients look to find savings in their budgets to help ride out an uncertain and unfriendly economic climate, Azure – part of Microsoft’s intelligent cloud business – was valued as being worth $22 billion, a welcome shot in the economic arm as Microsoft rode the news cycle caused by its announcement of plans to axe 10,000 jobs worldwide this year.
Azure is one of the leading players in the cloud market, and it increased its share of that market to 30% across 2022. Given that researchers predict that, potential belt-tightening notwithstanding, companies around the world are likely to grow their spending on cloud services by 20% in 2023 to around $590 billion, that 30% in the hands of a single operator amounts to a crucial sector of Microsoft’s ongoing profitability.
Not the only vulnerable cloud.
And while the January outage was a serious event with potentially significant consequences for businesses and data centers, it’s worth remembering that Microsoft is in no sense alone in suffering an outage on its cloud servers.
There were at least 15 major outages by leading cloud service players across the course of 2022, caused by everything from configuration errors to deliberate physical attacks. Most analysts in the cloud computing space are entirely open about the fact that, while cloud services are more open to outage than non-cloud services, the outages and their impacts tend to be lesser than when non-cloud services are interrupted, because the types of event that affect non-cloud services tend to be of a higher order of significance and potential permanence.
That said, most of 2022’s outages were relatively short in duration – between two and five hours – which means the January 2023 outage might well get to stand out from the crowd simply for the response lag between mass reporting of the issue and the solution quashing the majority of the subsequent connectivity issues.
The degree to which Microsoft will feel the impact of the outage probably depends on the kind of year the other major cloud players have across the rest of 2023. Watch this space.