How a disruption in AWS’s US-East region triggered global downtime, and what every business must learn about cloud dependency, resilience and incident response.
An AWS outage in the US-EAST-1 region created widespread disruptions on a global scale. More than 1,000 companies worldwide were affected, according to Downdetector. AWS acts as a backbone for hosting the websites and applications of major companies across the world, so when a significant failure occurred on October 20th, the portion of the internet depending on AWS as its sole provider went down with it, and several companies experienced wide-scale downtime.
Global services and features depending on US-EAST-1: AWS mentions that features outside US-EAST-1 (global tables, IAM updates, etc.) may still rely on US-EAST-1 endpoints. So even if a client is in another region, if it uses a global feature that routes through US-EAST-1, it might fail. This underscores how a regional fault can have global consequences.
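One common mitigation for this kind of regional dependency is to give the client an ordered list of regions and fall through to the next one when a call fails. The snippet below is a minimal sketch of that idea, not AWS's recommended implementation: `call_with_region_fallback`, `AllRegionsFailed`, and `read_profile` are hypothetical names, and it assumes the data is already replicated to the secondary region.

```python
from typing import Callable, Dict, Sequence, TypeVar

T = TypeVar("T")


class AllRegionsFailed(Exception):
    """Raised when every region in the preference list has failed."""


def call_with_region_fallback(
    operation: Callable[[str], T], regions: Sequence[str]
) -> T:
    """Try `operation` against each region in order and return the first success.

    `operation` is a hypothetical callable that takes a region name and
    performs one request against that region.
    """
    errors: Dict[str, Exception] = {}
    for region in regions:
        try:
            return operation(region)
        except Exception as exc:  # a real client would catch narrower error types
            errors[region] = exc
    raise AllRegionsFailed(errors)


# Usage: simulate US-EAST-1 being down while US-WEST-2 is healthy.
def read_profile(region: str) -> str:
    if region == "us-east-1":
        raise ConnectionError("simulated regional outage")
    return f"profile served from {region}"


print(call_with_region_fallback(read_profile, ["us-east-1", "us-west-2"]))
# prints "profile served from us-west-2"
```

Note the limit of this approach: it only helps for calls that are truly regional. If the feature itself routes through a US-EAST-1 endpoint, as AWS describes for IAM updates and DynamoDB Global Tables, switching the client's region does not remove the dependency.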
What makes this outage stand out is the sheer scale of the ripple effect: because so many companies host critical infrastructure (websites, databases, game services, mobile backends, APIs) on AWS in that region, a fault in one part of its stack immediately cascaded into hundreds of services being slow, failing, or completely unavailable.
Because AWS provides both underlying compute & storage (via EC2, S3, DynamoDB etc.) and ancillary services (identity, messaging, databases, global APIs), when one of its core services falters, many dependent systems falter together.
In the financial sector, applications such as ICE Mortgage Technology Encompass had a total outage. Trading platforms such as Robinhood and Coinbase reported service issues.
For gaming and social, platforms such as Snapchat, Roblox, and Fortnite reported mass outages. Disney+, Reddit, and other major platforms experienced issues.
In the productivity space, Canva experienced issues. Ring experienced a total outage. Amazon Alexa services were reported down or not functioning. Signal reported outages. Asana, Jira, O365, and Flickr also reported service issues. Zillow had a total outage.
According to AWS’s public status updates, the incident was characterized by “increased error rates and latencies for multiple AWS Services in the US-EAST-1 Region.”
"We have identified a potential root cause for error rates for the DynamoDB APIs in the US-EAST-1 Region. Based on our investigation, the issue appears to be related to DNS resolution of the DynamoDB API endpoint in US-EAST-1. We are working on multiple parallel paths to accelerate recovery. This issue also affects other AWS Services in the US-EAST-1 Region. Global services or features that rely on US-EAST-1 endpoints such as IAM updates and DynamoDB Global tables may also be experiencing issues. During this time, customers may be unable to create or update Support Cases. We recommend customers continue to retry any failed requests. We will continue to provide updates as we have more information to share, or by 2:45 AM."
The AWS update strongly suggests they believe the initial fault was in DNS resolution for a critical API endpoint. As we always say in the IT space, "it IS always DNS"...
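Two client-side habits follow from that update: check whether the endpoint even resolves, and retry failed requests without hammering a service that is trying to recover. The sketch below shows both under stated assumptions. `can_resolve` and `retry_with_backoff` are hypothetical helpers, the delay values are illustrative, and the backoff uses the widely used "full jitter" variant rather than anything AWS prescribes in the quoted update. The AWS SDKs ship their own retry logic, so treat this as an illustration of the idea only.

```python
import random
import socket
import time


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves to at least one address."""
    try:
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        # Name resolution failed, which is the symptom described in the update.
        return False


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=8.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached.

    Waits a random time between 0 and a capped exponential delay after
    each failure, so many clients do not all retry at the same instant.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError, socket.gaierror):
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)


# Usage: the regional DynamoDB endpoint named in the AWS update.
if __name__ == "__main__":
    print(can_resolve("dynamodb.us-east-1.amazonaws.com"))
```

The jitter matters during a large incident: if every affected client retries on the same fixed schedule, the synchronized retries can themselves slow the recovery.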
TL;DR: US-EAST-1 (Northern Virginia) reported a failure affecting companies globally on the AWS platform, such as Venmo, Signal, Fortnite, and Snapchat.