How it works
Cross-org detection
The engine watches upstream outcomes per provider. It declares an outage only when,
inside a rolling window, the failure count, the number of distinct affected orgs, and
the failure rate all cross their thresholds, so a 10-second blip or one bad prompt
never trips it.
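The three-condition check can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the class name `OutageDetector`, the window length, and every threshold value are assumptions made up for the example, and outcomes are assumed to arrive in time order.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class OutageDetector:
    """Rolling-window detector for one provider (all defaults are hypothetical)."""
    window_seconds: float = 120.0
    min_failures: int = 20
    min_distinct_orgs: int = 3
    min_failure_rate: float = 0.5
    _events: deque = field(default_factory=deque)  # (timestamp, org_id, ok)

    def record(self, timestamp: float, org_id: str, ok: bool) -> None:
        """Store one upstream outcome; callers are assumed to record in time order."""
        self._events.append((timestamp, org_id, ok))

    def is_outage(self, now: float) -> bool:
        # Drop outcomes that have aged out of the rolling window.
        while self._events and self._events[0][0] < now - self.window_seconds:
            self._events.popleft()
        if not self._events:
            return False
        failures = [e for e in self._events if not e[2]]
        failed_orgs = {org for _, org, ok in self._events if not ok}
        failure_rate = len(failures) / len(self._events)
        # All three thresholds must be crossed at the same time.
        return (
            len(failures) >= self.min_failures
            and len(failed_orgs) >= self.min_distinct_orgs
            and failure_rate >= self.min_failure_rate
        )
```

Requiring several distinct orgs is what keeps a single customer's bad prompt from tripping the detector, and the failure-count floor is what filters out a brief blip with only a handful of errors.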
Substitute to a healthy model
Affected requests reroute to the best healthy model of the same (or higher) tier from a
different provider. The substitute is billed at its own price.
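As a rough sketch of the selection rule above: filter to healthy models from a different provider at the same or a higher tier, then pick one. The `Model` type, the integer tier encoding, and the `score` field are hypothetical, and since "best" is not defined here, the tie-break (closest tier first, then highest score) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    tier: int      # assumed encoding: a higher number means a higher tier
    score: float   # hypothetical quality ranking; higher is better
    healthy: bool


def pick_substitute(requested: Model, catalog: list[Model]) -> Optional[Model]:
    """Return a healthy same-or-higher-tier model from another provider, if any."""
    candidates = [
        m for m in catalog
        if m.healthy
        and m.provider != requested.provider
        and m.tier >= requested.tier
    ]
    if not candidates:
        return None
    # Assumed ranking: stay as close to the requested tier as possible,
    # then prefer the higher score within that tier.
    return min(candidates, key=lambda m: (m.tier, -m.score))
```

Because the substitute is billed at its own price, preferring the closest tier in this sketch also keeps the cost of a rerouted call near that of the original.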
Policy
Set your policy on the Outage Shield page:

| Setting | Default | Meaning |
|---|---|---|
| `on_outage` | `substitute` | `substitute` reroutes to a healthy model; `fail` surfaces the provider error (for workloads that must have the exact pinned model). |
| BYOK → pooled | off | Rerouting a BYOK call to pooled spends your Orbitrage credits, so it’s opt-in. Off = a BYOK call fails during its provider’s outage. |
| Slack alerts | on | Notify your workspace on trip + recovery. |
Sending the `x-orbitrage-outage-override: fail` header on a request forces a hard-fail
for that one call, regardless of policy.
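One way to read how these settings combine for a call whose provider is in an outage is sketched below. The function name and parameters are hypothetical, and the evaluation order is an assumption; only the setting values and the override header name come from the text above.

```python
from typing import Mapping

OVERRIDE_HEADER = "x-orbitrage-outage-override"


def resolve_outage_action(
    on_outage: str,
    byok_to_pooled: bool,
    is_byok_call: bool,
    request_headers: Mapping[str, str],
) -> str:
    """Return "substitute" or "fail" for a call hitting a provider outage."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    headers = {k.lower(): v.strip().lower() for k, v in request_headers.items()}
    # The per-call override wins regardless of the saved policy.
    if headers.get(OVERRIDE_HEADER) == "fail":
        return "fail"
    # Org-level policy: surface the provider error instead of rerouting.
    if on_outage == "fail":
        return "fail"
    # A BYOK call can only reroute if spending pooled credits was opted into.
    if is_byok_call and not byok_to_pooled:
        return "fail"
    return "substitute"
```

For example, with the default policy a pooled call resolves to `"substitute"`, while a BYOK call resolves to `"fail"` until BYOK → pooled is switched on.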
Transparency
Substitution is never silent. Every rerouted call is flagged:

| Header | Value |
|---|---|
| `X-ScaleASAP-Failover` | `1` when the call was rerouted. |
| `X-ScaleASAP-Failover-From` | The provider we rerouted away from. |
Rerouted calls also carry the fields `failover_triggered`, `failover_from_provider`,
and `failover_substituted_model`, and active provider incidents show on the Outage Shield
page. The originally requested model is preserved as `primary_model_attempted`.
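A client can check the response headers to see whether a call was rerouted. The helper below is a small sketch: `failover_info` is a hypothetical name, and because the text does not say whether the flag is absent or set to another value on normal calls, anything other than `1` is treated as "not rerouted".

```python
from typing import Mapping, Optional


def failover_info(response_headers: Mapping[str, str]) -> Optional[str]:
    """Return the provider rerouted away from, or None if the call was not rerouted."""
    # Header names are case-insensitive, so compare in lower case.
    headers = {k.lower(): v for k, v in response_headers.items()}
    if headers.get("x-scaleasap-failover", "").strip() != "1":
        return None
    # Fall back to a placeholder if the second header is unexpectedly missing.
    return headers.get("x-scaleasap-failover-from", "unknown")
```

Logging this value alongside the model that actually served the call makes it easier to reconcile billing, since the substitute is charged at its own price.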