How it works

Orbitrage runs as two planes over one regional data tier:

Data plane — the router engine that carries your production LLM traffic.
Control plane — the dashboard and analytics.

Keeping them separate means dashboard queries never slow your LLM calls, and the LLM path stays thin.

The pieces

SDK

A ~5-line wrapper (orbitrage) that points your OpenAI-compatible client at the engine and tags every request with a trace id. No background threads, no span exporters.
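As a rough illustration of how little the wrapper has to do, the sketch below builds the two things described above: client settings that point at the engine, and a per-request trace header. It is not the actual `orbitrage` package. The header name `X-Trace-Id` and the helper names are assumptions made for this example.

```python
import uuid
from typing import Optional

ENGINE_BASE_URL = "https://api.orbitrage.ai/v1"  # engine edge named on this page
TRACE_HEADER = "X-Trace-Id"  # hypothetical header name, not confirmed by the docs


def client_settings(api_key: str) -> dict:
    """Settings an OpenAI-compatible client would need to target the engine."""
    return {"base_url": ENGINE_BASE_URL, "api_key": api_key}


def trace_headers(trace_id: Optional[str] = None) -> dict:
    """Per-request headers carrying a trace id (a fresh one if none is given)."""
    return {TRACE_HEADER: trace_id or uuid.uuid4().hex}


settings = client_settings("orb_example_key")
headers = trace_headers()
```

With the official `openai` Python package, these values would plausibly map onto the client's `base_url` and `api_key` arguments and the per-call `extra_headers` argument. Nothing here starts a thread or exports spans, which matches the description above.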

Router engine

The public, OpenAI-compatible edge at api.orbitrage.ai/v1. For each request it authenticates your key, gates on credits, optionally swaps in your BYOK provider key, scores the prompt, and selects a tier and model. It then proxies to the provider through a queue, per-provider concurrency caps, and circuit breakers, and records one telemetry row off the hot path. All of this runs in one service, so there is no extra hop.
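To make the circuit-breaker idea concrete, here is a minimal sketch of a per-provider breaker. It is a generic illustration of the pattern, not the engine's implementation. The class name, the failure threshold, and the cooldown value are all assumptions.

```python
import time


class CircuitBreaker:
    """Opens after N consecutive failures, then allows a trial call after a cooldown."""

    def __init__(self, failure_threshold=3, cooldown_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # hypothetical default
        self.cooldown_s = cooldown_s                # hypothetical default
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        """Return True if a call to this provider may proceed."""
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Open: block until the cooldown has elapsed, then permit a trial call.
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self) -> None:
        """A successful call closes the breaker and resets the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Count a failure and (re)open the breaker once the threshold is reached."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

A router could keep one breaker per provider and skip any provider whose `allow()` returns False, moving on to the next candidate. Concurrency caps would be a separate mechanism, for example a bounded semaphore per provider.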

Dashboard + Intelligence

The control plane: multi-level analytics, the Intelligence layer (anomalies + trajectories), the Ask Analytics assistant, and account/billing.

Data tier

Postgres with row-level security for per-org isolation — the canonical store the engine writes and the dashboard reads.

A request, step by step

Authenticate

The engine resolves your orb_ key (SHA-256 prefix lookup, cached ~5 min) to a user, org, and workflow. Invalid or revoked keys get 401.
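The lookup above can be sketched as follows. This is one plausible reading of "SHA-256 prefix lookup": the key is hashed, a short prefix of the digest is used as the index, and the full digest is then compared in constant time. The in-memory table, the prefix length, and the function names are assumptions for illustration only.

```python
import hashlib
import hmac
import time

CACHE_TTL_S = 300   # "cached ~5 min" from the text above
PREFIX_LEN = 12     # hypothetical digest-prefix length used as the index

KEY_TABLE = {}      # hypothetical stand-in for the key table: digest prefix -> row
_cache = {}         # full digest -> (cached_at, identity)


def digest(api_key: str) -> str:
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def register_key(api_key, user, org, workflow):
    """Store only the digest of the key, never the key itself."""
    d = digest(api_key)
    KEY_TABLE[d[:PREFIX_LEN]] = {
        "digest": d, "user": user, "org": org, "workflow": workflow, "revoked": False,
    }


def resolve_key(api_key, now=None):
    """Return (status, identity): 200 with user/org/workflow, or 401 with None."""
    now = time.monotonic() if now is None else now
    if not api_key.startswith("orb_"):
        return 401, None
    d = digest(api_key)
    hit = _cache.get(d)
    if hit is not None and now - hit[0] < CACHE_TTL_S:
        return 200, hit[1]  # served from cache, no table lookup
    row = KEY_TABLE.get(d[:PREFIX_LEN])
    if row is None or row["revoked"] or not hmac.compare_digest(row["digest"], d):
        return 401, None
    identity = {"user": row["user"], "org": row["org"], "workflow": row["workflow"]}
    _cache[d] = (now, identity)
    return 200, identity
```

One consequence of caching in this sketch is that a revoked key can keep resolving until its cache entry expires, so the TTL bounds how long revocation takes to propagate.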

Gate on credits

The org’s balance is checked against a short-TTL cache. If credits are exhausted, the call returns 402 before any provider is touched.
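A credit gate backed by a short-TTL cache might look like the sketch below. The class name, the 5-second TTL, and the `fetch_balance` callback are assumptions; the docs only state that the cache is short-lived and that exhausted credits return 402.

```python
import time


class CreditGate:
    """Checks an org's balance, re-reading the source of truth at most once per TTL."""

    def __init__(self, fetch_balance, ttl_s=5.0, clock=time.monotonic):
        self.fetch_balance = fetch_balance  # hypothetical callback: org_id -> balance
        self.ttl_s = ttl_s                  # hypothetical TTL value
        self.clock = clock
        self._cache = {}                    # org_id -> (cached_at, balance)

    def balance(self, org_id):
        now = self.clock()
        hit = self._cache.get(org_id)
        if hit is not None and now - hit[0] < self.ttl_s:
            return hit[1]
        value = self.fetch_balance(org_id)
        self._cache[org_id] = (now, value)
        return value

    def check(self, org_id) -> int:
        """Return 200 if the org has credits left, else 402 (no provider is called)."""
        return 200 if self.balance(org_id) > 0 else 402
```

The short TTL trades a small window of staleness for keeping a database read off most requests.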

Resolve BYOK

If a saved, enabled provider key matches the requested model, the engine decrypts it (AES-256-GCM) and forwards the call to the real provider endpoint. Your provider account is billed and Orbitrage charges $0. Closed-weight frontier models (claude-*, gpt-*, gemini-*, grok-*) are BYOK-only: with no enabled key, the call stops here with 403 byok_key_required rather than falling back to pooled credits. See BYOK.
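The decision described above can be sketched as a small function. It covers only the branching (BYOK, 403, or pooled credits) and leaves out the AES-256-GCM decryption. The shape of a saved key record (`provider`, `model_prefix`, `enabled`) and the return format are assumptions for this example.

```python
# Model-name prefixes listed above as BYOK-only.
FRONTIER_PREFIXES = ("claude-", "gpt-", "gemini-", "grok-")


def resolve_byok(model: str, saved_keys: list) -> dict:
    """Decide how a request for `model` is billed, given the org's saved provider keys."""
    for key in saved_keys:
        # Hypothetical matching rule: a key applies when the model name
        # starts with the key's model prefix and the key is enabled.
        if key["enabled"] and model.startswith(key["model_prefix"]):
            return {"status": 200, "billing": "byok", "provider": key["provider"]}
    if model.startswith(FRONTIER_PREFIXES):
        # Frontier models never fall back to pooled credits.
        return {"status": 403, "error": "byok_key_required"}
    return {"status": 200, "billing": "pooled"}
```

For example, a request for a `claude-*` model with a saved but disabled key would return the 403 branch, while a model outside the frontier list with no saved key would continue on pooled credits.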

Route

The engine scores the prompt, applies any capability ceiling, adjusts tier thresholds by the operator dial, picks the cheapest capable model, and proxies — with a fallback chain if the primary fails.
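The routing step above can be sketched end to end as follows. Everything concrete here is hypothetical: the tier names, thresholds, catalog, prices, and the toy length-based scorer are stand-ins, and the real scoring logic is not described on this page. The sketch assumes a positive dial raises thresholds (favoring cheaper tiers) and that the capability ceiling caps both the selected tier and the fallback chain.

```python
TIERS = ["small", "medium", "large"]                # hypothetical tier names
BASE_THRESHOLDS = {"medium": 0.4, "large": 0.75}    # hypothetical score cutoffs
CATALOG = [                                         # hypothetical models and prices
    {"name": "model-s", "tier": "small", "cost": 0.1},
    {"name": "model-m1", "tier": "medium", "cost": 0.5},
    {"name": "model-m2", "tier": "medium", "cost": 0.7},
    {"name": "model-l", "tier": "large", "cost": 3.0},
]


def score_prompt(prompt: str) -> float:
    """Toy heuristic: longer prompts score higher, capped at 1.0."""
    return min(len(prompt.split()) / 200.0, 1.0)


def select_tier(score, dial=0.0, ceiling=None):
    """Pick the highest tier whose dial-adjusted threshold the score meets, then clamp."""
    tier = "small"
    for name in ("medium", "large"):
        if score >= BASE_THRESHOLDS[name] + dial:
            tier = name
    if ceiling is not None and TIERS.index(tier) > TIERS.index(ceiling):
        tier = ceiling
    return tier


def candidates(tier, ceiling=None):
    """Capable models (selected tier up to the ceiling), cheapest first."""
    lo = TIERS.index(tier)
    hi = TIERS.index(ceiling) if ceiling is not None else len(TIERS) - 1
    capable = [m for m in CATALOG if lo <= TIERS.index(m["tier"]) <= hi]
    return sorted(capable, key=lambda m: m["cost"])


def route(prompt, call_provider, dial=0.0, ceiling=None):
    """Try the cheapest capable model first, then walk the fallback chain."""
    tier = select_tier(score_prompt(prompt), dial, ceiling)
    for model in candidates(tier, ceiling):
        try:
            return model["name"], call_provider(model["name"], prompt)
        except RuntimeError:
            continue  # primary failed: fall through to the next candidate
    raise RuntimeError("all candidate models failed")
```

Sorting the capable set by cost gives both the primary choice and the fallback order in one pass, which keeps the selection logic easy to reason about.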

Stream + record

The response streams straight back to your SDK as clean OpenAI frames. The routing decision, token counts, latencies, and cost are written as one row to the data tier. The write is fire-and-forget and off the hot path, so it adds no latency to your call.

One source of truth

Earlier SDK versions exported OpenTelemetry spans to a separate ingest endpoint. The current architecture is simpler: the engine is the single source of truth. Every request terminates at the engine, so there’s nothing extra to export — the engine writes the canonical routing_steps record itself. (Legacy OTLP span endpoints now return 410 Gone.)

This is why the SDK is so thin and dependency-free: it doesn’t collect or ship telemetry. It only points your client at the engine and adds trace headers. See Observability.

Infrastructure

The engine and the dashboard run in the same Azure Container Apps environment in East US 2, sharing one regional data tier. The engine is the only public data-plane surface; provider calls run in-region, so the only network legs are client → engine and engine → provider.

| Component | Exposure | Role |
| --- | --- | --- |
| Router engine | Public (`api.orbitrage.ai`) | Auth, credit gate, BYOK, scoring, routing, provider fan-out, telemetry |
| Dashboard | Public (`app.orbitrage.ai`) | Analytics, Intelligence, account |
| Data tier | Private | Postgres + RLS, per-org isolation |

Multi-tenant isolation

Every telemetry row is stamped with the org_id resolved from your API key, and row-level security ensures one org can never read another’s data. The same boundary holds across the dashboard, the Ask Analytics assistant (which pins org_id server-side — the model never sees it), and the MCP server.
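The server-side pinning used by the Ask Analytics assistant can be sketched as below. The idea is that any `org_id` present in model-generated query arguments is discarded and replaced with the value resolved from the authenticated session. The function names, the argument shape, and the sample rows are assumptions for illustration. In the real system, Postgres row-level security enforces the same boundary at the database layer.

```python
ROWS = [  # hypothetical telemetry rows belonging to two orgs
    {"org_id": "org_a", "model": "model-s", "cost": 0.1},
    {"org_id": "org_b", "model": "model-s", "cost": 0.2},
    {"org_id": "org_a", "model": "model-l", "cost": 3.0},
]


def pin_org(model_args: dict, session_org_id: str) -> dict:
    """Drop any org_id supplied by the model; the server-side value always wins."""
    args = {k: v for k, v in model_args.items() if k != "org_id"}
    args["org_id"] = session_org_id
    return args


def run_query(args: dict) -> list:
    """Return rows for the pinned org, optionally filtered by model name."""
    return [
        row for row in ROWS
        if row["org_id"] == args["org_id"]
        and ("model" not in args or row["model"] == args["model"])
    ]


# Even if generated arguments name another org, the query stays scoped to the session's org.
rows = run_query(pin_org({"org_id": "org_b", "model": "model-s"}, "org_a"))
```

Because the filter value never comes from the model, a prompt cannot steer a query into another org's data in this sketch.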
