Cosmos: Per-Partition Circuit Breaker should track 403:3 (WriteForbidden) errors #46328

@tvaron3

Description

The per-partition circuit breaker (PPCB) should track 403:3 (WriteForbidden) errors. Currently, PPCB records failures generically and does not account for status code 403 with sub-status 3, which indicates a write-region failover in single-writer accounts.

Context

  • 403:3 (SubStatusCodes.WRITE_FORBIDDEN) signals that writes are forbidden on the current region, typically during a write-region failover.
  • The existing endpoint discovery retry policy already handles 403:3 for region failover (_endpoint_discovery_retry_policy.py L100), but PPCB does not currently factor this error into its partition health tracking.
  • Incorporating 403:3 into PPCB would allow the circuit breaker to more accurately reflect partition health during region transitions and avoid routing requests to regions that are returning write-forbidden errors.
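To make the proposal concrete, here is a minimal sketch of how a 403:3 response could be classified before it reaches the PPCB. The constants are local stand-ins for the status and sub-status values described above, and `is_write_forbidden` and `should_record_failure` are hypothetical helper names, not existing SDK functions.

```python
from typing import Optional

# Local stand-ins for the values discussed in this issue (assumption:
# defined here so the sketch is self-contained, not imported from the SDK).
FORBIDDEN = 403
WRITE_FORBIDDEN = 3


def is_write_forbidden(status_code: int, sub_status: Optional[int]) -> bool:
    """Return True only for the 403:3 (WriteForbidden) combination."""
    return status_code == FORBIDDEN and sub_status == WRITE_FORBIDDEN


def should_record_failure(
    status_code: int, sub_status: Optional[int], already_tracked: bool
) -> bool:
    """Hypothetical gate: keep whatever PPCB already tracks, plus 403:3.

    `already_tracked` represents the breaker's existing decision for this
    error; how that decision is made today is outside this sketch.
    """
    return already_tracked or is_write_forbidden(status_code, sub_status)
```

Other 403 sub-statuses are deliberately left out, so a plain authorization failure would not affect partition health.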

Expected Behavior

PPCB should recognize 403:3 as a trackable failure so that partition health state is updated accordingly when a region begins returning WriteForbidden errors. This would allow PPCB to proactively route requests away from affected regions/partitions during failover scenarios.
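The expected behavior could look roughly like the following sketch. It is not the SDK's PPCB implementation: `PartitionHealthTracker`, its methods, and the consecutive-failure threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class PartitionHealthTracker:
    """Hypothetical per-partition, per-region failure counter."""

    def __init__(self, failure_threshold: int = 3) -> None:
        # Assumed policy: a (partition, region) pair is unhealthy once it has
        # accumulated this many 403:3 failures without a success in between.
        self.failure_threshold = failure_threshold
        self._failures: Dict[Tuple[str, str], int] = defaultdict(int)

    def record_failure(
        self, pk_range_id: str, region: str, status_code: int, sub_status: Optional[int]
    ) -> None:
        # Only 403:3 (WriteForbidden) is counted in this sketch.
        if status_code == 403 and sub_status == 3:
            self._failures[(pk_range_id, region)] += 1

    def record_success(self, pk_range_id: str, region: str) -> None:
        # A success clears the failure count for that partition and region.
        self._failures.pop((pk_range_id, region), None)

    def is_healthy(self, pk_range_id: str, region: str) -> bool:
        return self._failures.get((pk_range_id, region), 0) < self.failure_threshold

    def pick_region(self, pk_range_id: str, preferred_regions: List[str]) -> Optional[str]:
        # Return the first preferred region still considered healthy, if any.
        for region in preferred_regions:
            if self.is_healthy(pk_range_id, region):
                return region
        return None
```

With a threshold of 2, two WriteForbidden responses for partition "0" in one region would make `pick_region` skip that region for partition "0" only, while other partitions keep their original routing.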

Metadata

Labels: Client, Cosmos
