What you need to know
In the early hours of Monday, October 20, an outage at Amazon Web Services (AWS) disrupted large parts of the internet. Platforms and services that rely on AWS were affected, including Reddit, Fortnite, Snapchat, Canva, and Apple TV. Government services around the globe were reportedly hit as well.
The issues began surfacing around midnight ET, and the outage peaked around 3 AM ET. Downdetector logged more than 13,000 user reports of disruptions between 4 AM and noon ET. This is one of the largest internet outages since last year's CrowdStrike incident, which had similarly far-reaching effects on banks and airports.
According to various sources, the outage was traced to a flaw in an internal system that monitors the health of network load balancers within AWS’s EC2 network. The flaw increased error rates and latency, which led to API errors across numerous AWS services.
The outage's wide reach has renewed debate about how heavily essential internet functions rely on a single cloud provider. It underscores the risk of depending on one company for critical operations and strengthens the case for businesses to adopt more robust multi-region or multi-cloud strategies.
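For readers wondering what a multi-region strategy can look like in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of how any affected company or AWS itself handles failover: the region names, the `call_with_failover` helper, and the `simulated_request` function are all hypothetical, and the "request" is simulated rather than sent over a network.

```python
from typing import Callable, Dict, Sequence, Tuple


class AllRegionsFailed(Exception):
    """Raised when every region in the list has been tried and failed."""


def call_with_failover(
    regions: Sequence[str],
    request: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each region in order and return (region, response) from the first success.

    `request` is any callable that takes a region name and either returns a
    response or raises ConnectionError / TimeoutError when that region is down.
    """
    errors: Dict[str, Exception] = {}
    for region in regions:
        try:
            return region, request(region)
        except (ConnectionError, TimeoutError) as exc:
            # Remember why this region failed, then fall through to the next one.
            errors[region] = exc
    raise AllRegionsFailed(errors)


def simulated_request(region: str) -> str:
    """Hypothetical stand-in for a real service call: 'region-a' is down."""
    if region == "region-a":
        raise ConnectionError("region-a is unreachable")
    return f"ok from {region}"


if __name__ == "__main__":
    served_by, response = call_with_failover(["region-a", "region-b"], simulated_request)
    print(served_by, response)
```

The ordering of the region list is the design choice that matters here: the first entry acts as the primary and later entries are fallbacks. A real deployment would also need data replicated across regions and health checks that do not depend on the failing region, which this sketch leaves out.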
Elon Musk, owner of X, used the outage to take a swipe at AWS, stating, “Messages on X chat are fully encrypted with no advertising hooks or strange AWS dependencies.” The comment reflects a growing view among industry leaders that diversifying cloud providers is necessary to mitigate the risks of such outages.