  • Here's Why a Vital Amazon Web Services Region Went Down on Dec. 7

    aum

    • 984 views
    • 3 minutes



    Amazon shares the results of its investigation into the 'service disruption' that AWS (and its customers) experienced on Dec. 7.


    Amazon has explained why a vital Amazon Web Services (AWS) region, US-East-1, experienced what the company describes as a "service disruption" for about seven hours on Dec. 7.


    The problems with US-East-1 affected many people's ability to connect to streaming platforms like Netflix, Disney+, and Amazon Prime Video; games like Valorant, League of Legends, and PUBG; apps like Tinder, Venmo, and Coinbase; and many other services that rely on AWS.


    The sheer popularity of those services makes it relatively easy to tell when AWS is having problems—just try to stream a video, play a game, or use a mobile app connected to the nigh-ubiquitous platform. But it can be much more difficult to figure out why AWS is down.


    Here's what Amazon says caused US-East-1's woes:


    "At 7:30 AM PST, an automated activity to scale capacity of one of the AWS services hosted in the main AWS network triggered an unexpected behavior from a large number of clients inside the internal network. This resulted in a large surge of connection activity that overwhelmed the networking devices between the internal network and the main AWS network, resulting in delays for communication between these networks. These delays increased latency and errors for services communicating between these networks, resulting in even more connection attempts and retries. This led to persistent congestion and performance issues on the devices connecting the two networks."
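The feedback loop in that quote, where errors trigger retries and the retries add to the congestion, is the kind of problem that client-side exponential backoff with jitter is commonly used to dampen. The sketch below is purely illustrative and is not taken from Amazon's summary: the function name `call_with_backoff`, its parameters, and the choice of `ConnectionError` as the retryable failure are all assumptions made for the example.

```python
import random
import time


def call_with_backoff(operation, attempts=5, base=0.5, cap=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call operation(), retrying on ConnectionError with jittered backoff.

    Hypothetical helper for illustration only. The wait before retry n is
    drawn uniformly from [0, min(cap, base * 2**n)), so clients that failed
    at the same moment do not all reconnect at the same moment.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            # Out of retries: surface the failure instead of looping forever.
            if attempt == attempts - 1:
                raise
            # The ceiling doubles on each failure, up to the cap.
            ceiling = min(cap, base * 2 ** attempt)
            # Randomize the wait to spread retries out over time.
            sleep(rng() * ceiling)
```

A caller would wrap whatever opens the connection, for example `call_with_backoff(lambda: connect_to_service())`, where `connect_to_service` is a placeholder for the real call. The `sleep` and `rng` parameters exist only so the timing can be replaced in tests.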

     

    The company also says that the congestion "immediately impacted the availability of real-time monitoring data for our internal operations teams, which impaired their ability to find the source of congestion and resolve it," and that it also hampered the company's ability to explain the issue to AWS customers.

     

    AWS is a sprawling platform that offers a broad range of products used by many companies to serve a variety of purposes. It's a wonder that it doesn't experience major outages more often—and that it was able to recover from this particular disruption as quickly as it did.

     

    However, the incident still highlights the inherent risk of so many companies relying on AWS, especially since the nature of the network means that problems with the platform can hinder efforts to solve those same problems. (And that's when only a single region is involved!)

     

    Amazon even acknowledges that relying too much on just one AWS region can be a problem:

     

    "Our Support Contact Center also relies on the internal AWS network, so the ability to create support cases was impacted from 7:33 AM until 2:25 PM PST. We have been working on several enhancements to our Support Services to ensure we can more reliably and quickly communicate with customers during operational issues. We expect to release a new version of our Service Health Dashboard early next year that will make it easier to understand service impact and a new support system architecture that actively runs across multiple AWS regions to ensure we do not have delays in communicating with customers."

     

    More information about what caused the disruption to US-East-1, how Amazon's responding to the issue, and which services were affected can be found in the company's summary.

     

    Source

