When the cloud bursts

  • Estimates of the insurable loss resulting from the outage reached up to US$581 million.
  • AWS leads the market with 32% share, followed by Microsoft Azure (23%) and Google Cloud (13%).
  • AWS began as an internal project for Amazon’s e-commerce business and later became a major cloud service provider.

Above: Photograph by neiezhmakov/DepositPhotos

BitDepth 1535 for November 03, 2025

On October 19, Amazon Web Services (AWS) experienced a failure that took down many websites and web services globally.

The company resolved the issue some 15 hours later and soon after posted a deeply technical explanation of the meltdown.

“The root cause of this issue was a latent race condition in the DynamoDB DNS management system that resulted in an incorrect empty DNS (Domain Name System) record for the service’s regional endpoint that the automation failed to repair,” Amazon stated.

Two programs competed to write the same DNS entry at the same time, and the clash left the database service’s regional endpoint with an empty DNS record, all of its IP addresses accidentally deleted.

Imagine two students trying to write an answer to the same problem on a whiteboard, each erasing the other’s efforts in the process. Finally, imagine the resulting scramble wiping the whiteboard completely and you have a sense of what happened.
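For readers who think in code, the whiteboard scramble can be sketched as a toy simulation. This is an illustration of a generic check-then-act race, not Amazon's software: the function names, the numbered "plans" and the addresses are all invented, and the interleaving is stepped through by hand so the result is repeatable.

```python
# Toy model of a check-then-act race. Hypothetical names throughout;
# this is not Amazon's code, only an illustration of the failure pattern.

def apply_plan(record, plans, plan_id, guard=False):
    """Write a plan's addresses into the DNS record.

    With guard=True the writer re-checks, at the moment of writing,
    that it is not replacing a newer plan with an older one.
    """
    if guard and record["plan"] is not None and plan_id < record["plan"]:
        return False  # stale plan, refuse to overwrite
    record["plan"] = plan_id
    record["ips"] = list(plans[plan_id])
    return True


def clean_up(record, plans, newest):
    """Delete every plan older than `newest`, along with its addresses."""
    for plan_id in [p for p in plans if p < newest]:
        del plans[plan_id]
        if record["plan"] == plan_id:
            record["ips"] = []  # the live record loses all its addresses


def simulate(guard):
    plans = {1: ["192.0.2.10"], 2: ["192.0.2.20"]}
    record = {"plan": None, "ips": []}
    # Writer A decides to apply plan 1, then stalls before writing.
    # Writer B applies the newer plan 2 in the meantime.
    apply_plan(record, plans, 2, guard)
    # Writer A wakes up and writes its now-stale plan 1.
    apply_plan(record, plans, 1, guard)
    # Writer B tidies up, deleting plans older than the one it applied.
    clean_up(record, plans, newest=2)
    return record["ips"]


print(simulate(guard=False))  # unguarded interleaving
print(simulate(guard=True))   # guarded interleaving
```

Without the guard, the record ends up with no addresses at all. With the guard, the late writer is turned away and the newer addresses survive. Real systems use sturdier tools such as locks or versioned writes, but the principle is the same: check again at the moment of writing, not before.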

DNS records match the text name of a website address with the numbers of its actual digital address. Without them, websites and services cannot be found.

Reports of the number of vendors affected by the cloud outage at the company’s first data centre in northern Virginia vary between 2,000 and 70,000.
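A name lookup can be pictured as a simple table, as in the sketch below. It is a deliberately tiny stand-in for the real system, and the hostnames and addresses are invented for illustration (reserved example names and documentation addresses).

```python
# Toy name lookup. The hostnames and addresses are invented for illustration
# (example.com names, RFC 5737 documentation addresses).

RECORDS = {
    "shop.example.com": ["192.0.2.44"],
    "db.example.com": [],  # an empty record, like the one left by the outage
}


def resolve(hostname):
    """Return the addresses for a hostname, or raise if there are none."""
    addresses = RECORDS.get(hostname, [])
    if not addresses:
        raise LookupError(f"no address found for {hostname}")
    return addresses
```

Calling `resolve("shop.example.com")` returns an address to connect to, while `resolve("db.example.com")` raises an error. The database behind that name may be running perfectly well, but nothing can reach it.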

Direct customers will be covered by Amazon’s Service Level Agreements (SLAs) through compensatory service credits according to each agreement. These are usually based on a small fraction of the customer’s monthly bill.

Customers of vendors reselling Amazon cloud capacity and services will have to negotiate according to SLAs with their providers.

Estimates of the insurable loss resulting from the outage have run as high as US$581 million over the 15 hours of downtime.

Why were so many affected by a DNS failure at a single cloud provider?

Amazon is one of three major data hyperscalers, providers of hardware capacity and infrastructure on a global scale that are the backbone of the commercial internet.

AWS commands a leading share of that market at 32 per cent, followed by Microsoft Azure with 23 per cent and Google Cloud with 13 per cent.

The logos you never see. Amazon Web Services, Microsoft Azure and Google Cloud Services.

As a business segment, AWS accounts for 18 per cent of the company’s total revenue, but its operating profit is significantly higher than that of the e-commerce business.

That reliance on a single vendor for internet presence was addressed in posts on Bluesky by Meredith Whittaker, president of Signal.

“The extent of the concentration of power in the hands of a few hyperscalers is way less widely understood than I’d assumed,” Whittaker wrote.

“Which bodes poorly for our ability to craft reality-based strategies capable of contesting this concentration and solving the real problem. This isn’t ‘renting a server.’ It’s leasing access to a whole sprawling, capital-intensive, technically-capable system that must be just as available in Cairo as in Capetown, just as functional in Bangkok as Berlin.”

“Infrastructure like AWS is not something that Signal, or almost anyone else, could afford to just ‘spin up.’ Which is why nearly everyone that manages a real-time service – from Signal, to X, to Palantir, to Mastodon – rely at least in part on services provisioned by these companies.”

On October 29, Azure experienced an eight-hour outage of some of its services, the result of an “internal configuration change.”

Companies requiring resilience at that scale can’t easily implement standard ICT practices like redundant hardware without staggering investment in plant, and may not have the margins to maintain a warm standby system on another cloud provider’s service.
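In code, the idea of a standby on a second provider looks deceptively simple. The sketch below is hypothetical: the provider names and handler functions are invented, and the outage is simulated by raising an error.

```python
# Hypothetical failover sketch; provider names and functions are invented.

def call_with_failover(request, providers):
    """Try each provider in order and return the first successful response."""
    errors = []
    for name, handler in providers:
        try:
            return name, handler(request)
        except ConnectionError as exc:
            errors.append(f"{name}: {exc}")
    raise ConnectionError("all providers failed: " + "; ".join(errors))


def primary_cloud(request):
    raise ConnectionError("regional endpoint unreachable")  # simulated outage


def standby_cloud(request):
    return f"handled {request}"


print(call_with_failover("order-42", [("primary", primary_cloud),
                                      ("standby", standby_cloud)]))
```

Here the request is quietly served by the standby when the primary fails. The few lines of routing logic are the cheap part. The expense lies in keeping a second provider provisioned, paid for and supplied with current data so that it is ready when needed.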

Hyperscalers typically operate networks of hundreds of data centres with millions of servers distributed globally. Amazon alone operates 135 of these energy-demanding facilities.

Their marketing pitch has always been customer flexibility: buy the capacity needed, then scale up or down on demand.

The hyperscaler service model evolved to respond to internal business needs rather than external market demand.

AWS began as an internal project to provide uniform backend services for Amazon’s e-commerce business as it grew in both size and geographic scope.

Having built a world-scale data management system, it was a short jump from there to selling the service to other companies, which admittedly required significant persuasion in the early days.

Most companies kept servers under their own control, and early efforts to persuade IT managers to outsource their hardware were generally met with scepticism.

AWS was introduced in 2002 and by 2010, Microsoft was offering a competing service, Azure. Google introduced its App Engine in 2008, but formally entered the cloud service market in 2011.

Ideally, network infrastructure should be distributed and redundant. Technically, hyperscalers tick those boxes, but they also gather shared services in fewer real-world locations than the internet’s creators originally envisioned.

Compared to that widely scattered design, today’s hyperscalers bundle an uncomfortable number of eggs in far fewer baskets.


