Subgraph APIs Degraded

Resolved
Degraded performance
Started 11 months ago, lasted about 4 hours

Affected

API
Dashboard
Updates
  • Resolved
    Resolved

    The AWS team has fixed the root cause, and we have confirmed stability in our underlying systems.

    As a recap:

    From 11:18 AM to 12:46 PM, 3% to 5% of requests to our Subgraph APIs returned errors (5xx) or timed out, during intermittent periods lasting up to 3 minutes at a time. The impact was concentrated on a number of specific endpoints, so certain customers may have seen up to 50% of their requests erroring or timing out.

    After 12:46 PM, under 0.01% of requests were affected, and customer impact was greatly reduced. Failed API requests across our whole system dropped to single digits in any 5-minute window, but failures still occurred intermittently.

    By 3:40 PM, we no longer saw any errors.

  • Monitoring
    Monitoring

    As of 1:50 PM PT, only a small fraction of requests (<0.01%) are failing, compared to 5% at the peak of the instability.

    The team is still working with AWS on additional mitigations to eliminate the remaining failing requests.

  • Identified
    Update

    The issue is still occurring intermittently, specifically for Subgraph APIs.

    Customers using Mirror to push data to their own databases, as well as Goldsky-hosted cross-chain APIs, are unaffected and should not see any downtime.

  • Identified
    Identified

    We've identified the AWS service in question and are actively working with AWS to fix the issue.

    This issue only affects the query layer for some subgraphs, as well as our dashboard. Indexing is not affected.

  • Investigating
    Investigating

    AWS is experiencing a rolling outage, which is causing intermittent timeouts for some subgraphs.