mood
Posted March 15, 2021

Microsoft 365 outage knocks down Teams, Exchange Online

An Azure Active Directory outage is preventing users from logging into Microsoft 365, Microsoft Teams, Exchange Online, Forms, Xbox Live, and Yammer.

Starting at approximately 3:34 PM EST, users began reporting that they were unable to log in to their Microsoft 365 accounts or Microsoft Teams, or to access other Microsoft apps.

It appears @Microsoft365 is having a few issues at the moment; currently unable to access Forms. Hopefully back up soon for my @syscouts quiz on tree recognition
- James Garnett (@jamesmgarnett), March 15, 2021

MicrosoftTeams is down? not able to connect to any meeting MSFT365Status https://t.co/GSa7qv1IZE #MicrosoftTeams #Microsoft #Microsoft365 Translated using #MicrosoftFlow
- Daniel Villamizar -Microsoft Azure MVP (@CSA_DVillamizar), March 15, 2021

The outage is also affecting Microsoft's own sites, such as the Tech Community website, where users are unable to log in.

"As a result of the issues currently facing Azure AAD, we are currently experiencing problems on the Microsoft Tech Community with login and authentication. This will result in users being unable to login and users already logged in getting unexpected errors as sessions timeout," posted a Microsoft Tech Community manager.

Microsoft has acknowledged the outage in Microsoft 365 incident report MO244568, which states that the outage initially impacted Microsoft Teams but is now affecting other services.

"Initial reports indicate that primary impact is to Microsoft Teams; however, other services including Exchange Online and Yammer are also impacted."

"We're investigating a potential issue and checking for impact to your organization. We'll provide an update within 30 minutes," the outage report states.

Microsoft has confirmed that the widespread outages affecting its online services are the result of an Azure Active Directory (AAD) configuration issue.
This issue is preventing users from authenticating to Microsoft 365, Exchange Online, Microsoft Teams, or any other service relying on AAD.

"Starting at approximately 19:15 UTC on 15 Mar 2021, a subset of customers may experience issues authenticating into Microsoft services, including Microsoft Teams, Office and/or Dynamics, Xbox Live, and the Azure Portal," reads the Azure status page.

Microsoft is sharing updates on its Microsoft 365 Status Twitter account. The status updates posted so far are listed below:

3/15/21 3:40 PM EST: "We're investigating an issue for access to multiple M365 services. Please visit the admin center post MO244568 for more information. We'll provide additional information here as it becomes available."

3/15/21 4:04 PM EST: "We've confirmed that this issue could be affecting users worldwide. Additional information can be found at http://status.office.com, or if available, under MO244568 in the admin center."

3/15/21 4:11 PM EST: "We've identified an issue with a recent change to an authentication system. We're rolling back the update to mitigate impact, which we expect will take approximately 15 minutes. Additional information can be found at http://status.office.com or under MO244568 if available."

3/15/21 4:44 PM EST: "The process to roll back the change is taking longer than expected. We'll provide an ETA as soon as one becomes available. Additional information can be found at http://status.office.com or under MO244568 if available."

3/15/21 5:01 PM EST: "We've identified the underlying cause of the problem and are taking steps to mitigate impact. We'll provide an updated ETA on resolution as soon as one is available. Additional information can be found at https://status.office.com or under MO244568 if available."

3/15/21 5:17 PM EST: "We are currently rolling out a mitigation worldwide. Customers should begin seeing recovery at this time, and we anticipate full remediation within 60 minutes. Additional information can be found at http://status.office.com or under MO244568 if available."

Update 3/15/21 4:48 PM: The outage is confirmed to be caused by an Azure Active Directory issue. Article updated with this information.

This is a developing story.

Source: Microsoft 365 outage knocks down Teams, Exchange Online
mood
Posted March 17, 2021

Microsoft blames crypto key rotation snafu for 365 outage

Teams, Exchange Online, and other services were knocked offline for more than 14 hours

Microsoft has blamed a key rotation issue for a large-scale 365 outage that affected many of its services on Monday and Tuesday.

The outage, which took down Teams, Exchange Online, and other 365 services, kicked in at around 19:00 UTC on Monday and was only resolved more than 14 hours later, at around 09:25 on Tuesday.

Problems in the periodic rotation of cryptographic keys caused authentication checks to fail for any application that relied on Azure Active Directory, and those failures persisted overnight until engineers were able to apply a fix.

In a status update, Microsoft explained that the authentication problems arose because a key marked for retention had erroneously been deleted by the system. This caused particular problems because the key was needed to manage a migration project, as the company explained:

The preliminary analysis of this incident shows that an error occurred in the rotation of keys used to support Azure AD’s use of OpenID and other identity standard protocols for cryptographic signing operations. As part of standard security hygiene, an automated system on a time-based schedule removes keys that are no longer in use. Over the last few weeks, a particular key was marked as “retain” for longer than normal to support a complex cross-cloud migration. This exposed a bug where the automation incorrectly ignored that “retain” state, leading it to remove that particular key.

Azure Admin Portal, Teams, Exchange, Azure Key Vault, SharePoint, and Storage were all affected to a greater or lesser extent by the problem.

Growing pains

Security vendor Venafi warned that outages of this nature are likely to become more common as digital transformation accelerates, thus heightening the importance of key rotation.
Michael Thelander, director of machine identity strategy at Venafi, commented: “Poorly orchestrated key rotation is the Achilles heel of modern digital transformation efforts; this oversight is capable of bringing down entire applications and services in an instant.

“Keys and certificates have numerous ‘states’ that guide their automation and orchestration processes. They also have hard-coded expirations.

“‘Retain’ is a tag that tells the system, ‘This key may be retired or expired, but the system needs to keep it to enable any overlap between dynamic processes’.

“If the ‘retain’ tag is overlooked and the keys are deleted before replacements are ready – and this all happens in microseconds – systems fail,” he added.

Thelander concluded: “Unfortunately, these kinds of outages will only continue until organizations adopt an enterprise-wide approach to managing the machine identities these keys and certificates represent.

“Digital transformation is not going to slow down, and this requires automation of keys and certificates found in workloads, containers, and across cloud environments as well as those in on-prem environments.”

Source: Microsoft blames crypto key rotation snafu for 365 outage
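
For anyone wondering how an ignored "retain" flag can break sign-ins, here is a minimal Python sketch of the failure mode described in the article. It is not Microsoft's code: the `SigningKey` class, the `purge_unused_keys` function, the `honor_retain` switch, and the 30-day idle limit are all hypothetical names and values made up for illustration, and HMAC is used only as a simple stand-in for the asymmetric signing that OpenID-style tokens normally use.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SigningKey:
    key_id: str
    secret: bytes
    last_used: datetime
    retain: bool = False  # hypothetical flag: keep the key even if it looks unused


def purge_unused_keys(keys, now, max_idle, honor_retain=True):
    """Return the keys that survive a time-based cleanup pass."""
    kept = {}
    for key_id, key in keys.items():
        idle_too_long = now - key.last_used > max_idle
        protected = honor_retain and key.retain
        if idle_too_long and not protected:
            continue  # this key is dropped by the cleanup
        kept[key_id] = key
    return kept


def sign(key, message):
    # HMAC stands in for real token signing (an assumption of this sketch).
    return hmac.new(key.secret, message, hashlib.sha256).hexdigest()


def verify(keys, key_id, message, signature):
    key = keys.get(key_id)
    if key is None:
        return False  # the signing key is gone, so validation cannot succeed
    return hmac.compare_digest(sign(key, message), signature)


now = datetime(2021, 3, 15, 19, 0, tzinfo=timezone.utc)
keys = {
    "current": SigningKey("current", b"k1", now),
    "migration": SigningKey("migration", b"k2", now - timedelta(days=60), retain=True),
    "stale": SigningKey("stale", b"k3", now - timedelta(days=60)),
}
token = b"user=alice"
signature = sign(keys["migration"], token)

# Cleanup that respects the retain flag versus one that ignores it.
correct = purge_unused_keys(keys, now, timedelta(days=30))
buggy = purge_unused_keys(keys, now, timedelta(days=30), honor_retain=False)

print(sorted(correct), verify(correct, "migration", token, signature))
print(sorted(buggy), verify(buggy, "migration", token, signature))
```

With the flag honored, the old "migration" key survives the purge and tokens signed with it still validate. With the flag ignored, the key is removed along with the genuinely stale one and every token signed with it fails validation, which is roughly the chain of events the status update describes.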