Goldsky - Elevated latency on Edge RPC – Incident details

Elevated latency on Edge RPC

Resolved
Major outage
Started about 18 hours agoLasted about 11 hours

Affected

Edge

Major outage from 7:00 PM to 6:00 AM

Abstract

Major outage from 7:00 PM to 6:00 AM

Arbitrum One

Major outage from 7:00 PM to 6:00 AM

Arbitrum Sepolia

Major outage from 7:00 PM to 6:00 AM

Avalanche

Major outage from 7:00 PM to 6:00 AM

Base

Major outage from 7:00 PM to 6:00 AM

Updates
  • Resolved
    Resolved

    We've deployed configuration mitigations that significantly reduce auth-DB load and have a follow-up fix in flight for the underlying connector behavior. The fleet is currently stable and we are continuing to monitor.

  • Investigating
    Investigating

    Between approximately May 12th 20:00 UTC and May 13th 07:00 UTC today, customers in ams, cdg, iad, and sjc may have seen elevated latency and intermittent failures on Edge RPC requests. The root cause was a reconnect cascade between our edge fleet and the auth database, triggered by a brief upstream blip and amplified by a latent bug in our connection-handling code. Traffic was largely served throughout via our auth fail-open path; some requests timed out at peak.