CRITICAL·ai-built

Amazon.com — 6-hour outage from AI-assisted deploy

An AI-assisted code deploy at Amazon triggered a regression that degraded Amazon.com for approximately six hours. An estimated 6.3 million orders were lost during the window.

Victim: Amazon

What happened

March 2026 public incident. A code change suggested by an AI coding tool and approved by a human reviewer was deployed to production; the change introduced a regression in a core request-handling path on Amazon.com. Full restoration took approximately six hours because the rollback had to be followed by downstream cache-warming.

Timeline

  1. Deploy goes out with AI-suggested change.

  2. Error-rate alerts fire; Amazon.com begins degrading.

  3. Rollback initiated; cache-warming required before full restoration.

  4. Service fully restored after six-hour degradation.
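
Steps 3 and 4 hinge on cold downstream caches: rolling back the code is fast, but traffic can only be re-admitted as caches warm. A minimal sketch of that gating, with invented function names and thresholds (nothing here reflects Amazon's actual tooling):

```python
# Illustrative only: why rollback alone does not restore service when
# downstream caches are cold. Traffic is ramped back in proportion to
# cache hit rate; the 0.90 threshold is an assumption for this sketch.

def traffic_share(cache_hit_rate: float, min_hit_rate: float = 0.90) -> float:
    """Fraction of traffic to admit after rollback, gated on cache warmth."""
    if cache_hit_rate >= min_hit_rate:
        return 1.0  # caches warm enough: restore full traffic
    # Below the threshold, ramp linearly so cold caches are not overwhelmed.
    return max(0.0, cache_hit_rate / min_hit_rate)
```

Under this model, a cache sitting at half the target hit rate would receive only half of normal traffic, which is why "rollback initiated" and "service fully restored" can be hours apart.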

Root cause

The AI-suggested change introduced a subtle request-routing regression that pre-merge tests did not catch. The canary rollout did detect the issue, but the rollback took longer than the automated-rollback window allowed.
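
The failure mode described above can be sketched as a timing check: a canary can flag a bad deploy and still miss the automated-rollback window if the full revert (including cache re-warming) is too slow. All names, fields, and the 15-minute window below are invented for illustration:

```python
from dataclasses import dataclass

# Hypothetical sketch, not Amazon's deploy tooling: a regression detected
# in-window can still fall out of automated rollback if the revert is slow.

@dataclass
class CanaryResult:
    detected_at_min: float        # minutes after deploy when alerts fired
    rollback_duration_min: float  # time to revert and re-warm caches

AUTO_ROLLBACK_WINDOW_MIN = 15.0   # assumed automated-rollback window

def rollback_mode(result: CanaryResult) -> str:
    """'automatic' if the full revert fits inside the window, else 'manual'."""
    total = result.detected_at_min + result.rollback_duration_min
    return "automatic" if total <= AUTO_ROLLBACK_WINDOW_MIN else "manual"

# A quick revert stays automatic; cache-warming pushes it past the window.
fast = CanaryResult(detected_at_min=3, rollback_duration_min=5)
slow = CanaryResult(detected_at_min=3, rollback_duration_min=180)
```

The design point: an automated-rollback window has to be sized against worst-case revert time, not just detection time.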

Impact

  • ~6 hours of Amazon.com degradation
  • Estimated 6.3 million orders lost
  • Reputation hit; category-level incident in the AI-code-review conversation

Would Securie have caught it?

Possibly, depending on the nature of the regression. Securie's intent-graph validation flags changes that alter declared module intents — if the AI-suggested change modified the purpose of a request-handling module, Securie would hold the merge until explicit approval.
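 
The kind of intent-level gate described above can be sketched as a registry check: each module declares its intents, and a change carrying an undeclared intent holds the merge for explicit approval. The registry, intent tags, and decision strings below are invented for illustration and are not Securie's actual API:

```python
# Hypothetical intent-gating sketch in the spirit of the text; every name
# here (registry, tags, decision strings) is an assumption, not Securie.

DECLARED_INTENTS: dict[str, set[str]] = {
    "request_router": {"route-by-path", "no-side-effects"},
}

def merge_decision(module: str, change_intents: set[str]) -> str:
    """Hold the merge if the change introduces intents the module
    never declared; unknown modules are held by default."""
    declared = DECLARED_INTENTS.get(module)
    if declared is None:
        return "hold-for-approval"
    return "hold-for-approval" if change_intents - declared else "auto-merge"
```

A change that stays within a module's declared purpose merges automatically; one that quietly adds, say, a cache write to a routing module is held, which is the class of regression unit tests tend to miss.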

Lessons

  • Canary rollout + fast rollback is a must for any AI-influenced deploy
  • Pre-merge test coverage for AI-suggested changes should be HIGHER than for human changes, not lower
  • Intent-level review catches what unit tests miss
