INC-2419 · Checkout latency & 5xx errors · resolved
SEV 1 · CRITICAL · Resolved · INC-2419

Elevated checkout latency & 5xx errors

Opened 14:18 UTC · May 9
Resolved 14:42 UTC · May 9
Duration 24m
Users impacted
~3,400
Carts
1,240 affected · 318 abandoned
Revenue
$18.4k at risk · $4.2k confirmed loss
Regions
us-east-1 · us-east-2
AI ROOT-CAUSE ANALYSIS
completed in 2.4s · 5 signals correlated · 94% confidence

An N+1 query in finalizeOrder() exhausted the orders-db connection pool, cascading into checkout-svc 5xx and Lambda retries.

Deploy correlation
web-api@7d3f2c1 deployed 14:12; first 5xx burst 14:18 (Δ 6m).
STRONG
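
The intervals in this report can be re-derived from its timestamps. The sketch below does that arithmetic; the helper names `toMinutes` and `minutesBetween` are illustrative and not part of any incident tooling, and it assumes all times fall on the same UTC day.

```typescript
// Convert an "HH:MM" clock time to minutes since midnight.
function toMinutes(hhmm: string): number {
  const [h, m] = hhmm.split(":").map(Number);
  return h * 60 + m;
}

// Minutes elapsed between two clock times on the same day.
function minutesBetween(start: string, end: string): number {
  return toMinutes(end) - toMinutes(start);
}

// Timestamps as stated in this report (May 9, UTC).
const incidentTimeline = {
  deployed: "14:12",
  opened: "14:18", // first 5xx burst
  rolledBack: "14:34",
  resolved: "14:42",
};

const deployToFirst5xx = minutesBetween(incidentTimeline.deployed, incidentTimeline.opened); // 6
const openedToRollback = minutesBetween(incidentTimeline.opened, incidentTimeline.rolledBack); // 16
const openedToResolved = minutesBetween(incidentTimeline.opened, incidentTimeline.resolved); // 24
```

The 6-minute and 24-minute results match the Δ and Duration shown in the report; the 16 minutes from first 5xx burst to rollback is derived from the same timestamps.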
Code diff
PR #2847 introduced a loop over line_items without preloading; .map(async …) issues one query per row.
STRONG
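
A minimal sketch of the pattern described in the code-diff signal, not the actual contents of PR #2847. `FakeOrdersDb` is a hypothetical in-memory stand-in that only counts round trips, and `loadPerRow` and `loadBatched` are illustrative names. The batched lookup stands in for any preloading strategy that fetches all rows in one query.

```typescript
type LineItem = { id: number; orderId: number; sku: string };

// Hypothetical stand-in for orders-db: counts how many queries are issued.
class FakeOrdersDb {
  queryCount = 0;
  rows: LineItem[];

  constructor(rows: LineItem[]) {
    this.rows = rows;
  }

  // One round trip per call, like SELECT ... WHERE id = $1.
  async lineItemById(id: number): Promise<LineItem | undefined> {
    this.queryCount += 1;
    return this.rows.find((r) => r.id === id);
  }

  // One round trip for the whole set, like SELECT ... WHERE id = ANY($1).
  async lineItemsByIds(ids: number[]): Promise<LineItem[]> {
    this.queryCount += 1;
    const wanted = new Set(ids);
    return this.rows.filter((r) => wanted.has(r.id));
  }
}

// N+1 shape: .map(async ...) starts one query per line item, all at once.
async function loadPerRow(db: FakeOrdersDb, ids: number[]): Promise<LineItem[]> {
  const found = await Promise.all(ids.map(async (id) => db.lineItemById(id)));
  return found.filter((r): r is LineItem => r !== undefined);
}

// Preloaded shape: a single query regardless of how many line items exist.
async function loadBatched(db: FakeOrdersDb, ids: number[]): Promise<LineItem[]> {
  return db.lineItemsByIds(ids);
}
```

Because `.map(async …)` starts every callback immediately, an order with N line items requests N connections at roughly the same moment, which is how a loop like this can drain a pool faster than a sequential loop would.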
DB telemetry
orders.line_items query rate 18 qps → 612 qps. Pool wait time p99 6ms → 2.1s.
STRONG
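
For scale, the two telemetry changes can be expressed as fold increases. This is plain arithmetic on the figures above; `foldIncrease` is an illustrative helper, and 2.1 s is written as 2100 ms so both pool-wait values share a unit.

```typescript
// Ratio of the incident value to the baseline value.
function foldIncrease(before: number, after: number): number {
  if (before <= 0) throw new RangeError("baseline must be positive");
  return after / before;
}

const queryRateFold = foldIncrease(18, 612); // qps: 34x
const poolWaitFold = foldIncrease(6, 2100); // ms: 350x
```

Pool wait grew about ten times more than query rate, which is the kind of non-linear jump expected once a pool is saturated and requests start queueing.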
Customer impact
Drop in /v2/checkout/finalize 2xx coincides exactly with deploy window.
Auto-applied: Rolled back web-api → a91be4f at 14:34 UTC. Forward fix tracked in PR #2853 — preload line_items via JOIN; add pool-saturation alert at 70%.
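
One possible shape for the pool-saturation check mentioned in the forward fix, as a sketch only. The `PoolStats` type, the function names, and the choice to fire at or above the threshold are all assumptions; the report states only the 70% figure, not the pool size or the alerting mechanism.

```typescript
// Assumed snapshot of a connection pool: connections checked out vs. the cap.
type PoolStats = { inUse: number; max: number };

// Fraction of the pool currently in use, from 0 to 1.
function poolUtilization(stats: PoolStats): number {
  if (stats.max <= 0) throw new RangeError("pool max must be positive");
  return stats.inUse / stats.max;
}

// True when utilization reaches the threshold (0.7 = the 70% in the forward fix).
function shouldAlertOnSaturation(stats: PoolStats, threshold = 0.7): boolean {
  return poolUtilization(stats) >= threshold;
}
```

With a hypothetical pool cap of 20, `shouldAlertOnSaturation({ inUse: 15, max: 20 })` is true and `shouldAlertOnSaturation({ inUse: 13, max: 20 })` is false. Alerting below full saturation leaves room to react before pool wait time climbs.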
Affected services
checkout-svc · payments · us-east-1 · 910 rpm
orders-db · data · us-east-1 · Postgres 15
web-api · platform · us-east-1 · 2.4k rpm
billing-worker · payments · us-east-1 · Lambda
Latency during incident
checkout p95 · ms · peak 4,820
Time axis: 14:00 · 14:18 ↑ deploy · 14:34 ↓ rollback · 15:00