- Data plane — carries your production LLM traffic.
- Control plane — serves the dashboard and analytics.
The pieces
SDK
A ~5-line wrapper (orbitrage) that points your OpenAI-compatible client at
the gateway and tags every request with a trace id. No background threads,
no span exporters.
Gateway
The public, OpenAI-compatible edge at api.orbitrage.ai/v1. Authenticates
your key, gates on credits, optionally swaps in your BYOK provider key,
forwards to the engine, and records one telemetry row.
Router engine
Private service that scores the prompt, selects a tier and model, and proxies
to the provider with a queue, per-provider concurrency caps, and circuit
breakers. Never reachable from the public internet.
Dashboard + Intelligence
The control plane: multi-level analytics, the Intelligence layer (anomalies +
trajectories), the Ask Analytics assistant, and account/billing.
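The circuit breakers mentioned for the router engine can be illustrated with a minimal sketch. This is not the engine's actual implementation: the class name, the failure threshold, and the cooldown are assumptions chosen for illustration. It opens after a run of consecutive failures and lets a probe request through once the cooldown has elapsed.

```python
import time


class CircuitBreaker:
    """Illustrative breaker: opens after `max_failures` consecutive failures.

    All names and defaults here are hypothetical, not the engine's real values.
    """

    def __init__(self, max_failures=3, cooldown_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        """Return True if a request may be sent to the provider."""
        if self.opened_at is None:
            return True
        # Half-open: allow a probe once the cooldown has elapsed.
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self):
        """A successful call closes the circuit and resets the counter."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        """Count a failure; open (or re-open) the circuit at the threshold."""
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()
```

One breaker per provider would let the engine skip a failing provider and move to the next entry in its fallback chain while the breaker is open.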
A request, step by step
Authenticate
The gateway resolves your orb_ key (SHA-256 prefix lookup, cached ~5 min)
to a user, org, and workflow. Invalid or revoked keys get 401.
Gate on credits
The org’s balance is checked against a short-TTL cache. If credits are
exhausted, the call returns 402 before any provider is touched.
Resolve BYOK
If a saved provider key matches the requested model, the gateway decrypts it
(AES-256-GCM) and forwards it, so your provider account is billed instead of
pooled credits.
Route
The engine scores the prompt, applies any capability ceiling, adjusts tier
thresholds by the operator dial, picks the cheapest capable model, and proxies
— with a fallback chain if the primary fails.
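The steps above can be sketched end to end in a few lines. This is a simplified illustration under stated assumptions, not the gateway's or engine's code: the in-memory `KEYS` and `CREDITS` tables, the model catalog, the threshold values, and the way the dial shifts thresholds are all hypothetical. It also uses a full-hash lookup in place of the prefix lookup and skips the BYOK step.

```python
import hashlib

# Hypothetical in-memory stand-ins for the key store and the credit cache.
KEYS = {hashlib.sha256(b"orb_demo").hexdigest(): {"org": "org_1", "revoked": False}}
CREDITS = {"org_1": 5.0}

# Hypothetical catalog: (model name, highest tier it can serve, relative price).
MODELS = [("small", 1, 0.1), ("medium", 2, 0.5), ("large", 3, 2.0)]


def authenticate(api_key):
    """Resolve a key to its record, or None if it is unknown or revoked."""
    record = KEYS.get(hashlib.sha256(api_key.encode()).hexdigest())
    if record is None or record["revoked"]:
        return None
    return record


def pick_tier(score, dial=0.0, thresholds=(0.3, 0.7)):
    """Map a prompt score in [0, 1] to tier 1-3.

    Assumption: a positive dial raises both thresholds, favoring cheaper tiers.
    """
    low, high = (t + dial for t in thresholds)
    if score < low:
        return 1
    if score < high:
        return 2
    return 3


def route(score, dial=0.0, ceiling=3):
    """Return capable models, cheapest first; the tail is the fallback chain."""
    tier = min(pick_tier(score, dial), ceiling)  # assumption: ceiling caps the tier
    capable = [m for m in MODELS if tier <= m[1] <= ceiling]
    return [name for name, _, _ in sorted(capable, key=lambda m: m[2])]


def handle(api_key, score, dial=0.0, ceiling=3):
    """Authenticate, gate on credits, then route. Returns (status, model chain)."""
    record = authenticate(api_key)
    if record is None:
        return 401, []
    if CREDITS.get(record["org"], 0.0) <= 0:
        return 402, []  # rejected before any provider is touched
    return 200, route(score, dial, ceiling)
```

The ordering is the point of the sketch: the 401 and 402 checks return before `route` runs, so a rejected call never reaches a provider.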
Two sources of truth, unified
Earlier SDK versions exported OpenTelemetry spans to a separate ingest endpoint. The current architecture is simpler: the proxy is the single source of truth. Every byte of every request crosses the gateway, so there’s nothing extra to export; the gateway writes the canonical routing_steps record itself. (Legacy
OTLP span endpoints now return 410 Gone.)
This is why the SDK is so thin and dependency-free: it doesn’t collect or ship
telemetry. It only points your client at the gateway and adds trace headers.
See Observability.
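To make the "thin SDK" idea concrete, here is a rough sketch of what such a wrapper could do. It is not the orbitrage package's code: the function name and the `X-Trace-Id` header name are hypothetical, and only the base URL comes from this page. It builds the keyword arguments an OpenAI-compatible client constructor would need.

```python
import uuid

GATEWAY_BASE_URL = "https://api.orbitrage.ai/v1"  # gateway address from this page
TRACE_HEADER = "X-Trace-Id"  # hypothetical header name, for illustration only


def gateway_client_kwargs(api_key, trace_id=None):
    """Build constructor kwargs that point an OpenAI-compatible client at the gateway."""
    return {
        "base_url": GATEWAY_BASE_URL,
        "api_key": api_key,
        # A fresh id is generated when the caller does not supply one.
        "default_headers": {TRACE_HEADER: trace_id or uuid.uuid4().hex},
    }
```

With the `openai` Python package installed, this could be used as `OpenAI(**gateway_client_kwargs("orb_your_key"))`. Note that `default_headers` is set once per client; to tag each request with its own id, a wrapper would pass the header through `extra_headers` on each call instead.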
Infrastructure
The gateway and engine run in the same Azure Container Apps environment in East US 2, so the gateway → engine hop stays inside the VNet (~1–5 ms) with no public-internet round-trip. The engine has internal ingress only and verifies a signed edge header, so it can’t be called directly to bypass auth or billing.

| Component | Exposure | Role |
|---|---|---|
| Gateway | Public (api.orbitrage.ai) | Auth, credit gate, BYOK, telemetry |
| Router engine | Private (VNet only) | Scoring, routing, provider fan-out |
| Dashboard | Public (app.orbitrage.ai) | Analytics, Intelligence, account |
| Data tier | Private | Postgres + RLS, per-org isolation |
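The signed edge header the engine verifies could work along the following lines. The actual header format and signing scheme are not described here, so this is only one plausible design: an HMAC-SHA256 over a timestamp and the org id, with a shared secret and a short freshness window. The function names, the `ts.mac` layout, and the 30-second window are all assumptions.

```python
import hashlib
import hmac
import time


def sign_edge(secret, org_id, now=None):
    """Gateway side: produce a header value of the form '<timestamp>.<hex mac>'."""
    ts = str(int(now if now is not None else time.time()))
    mac = hmac.new(secret, f"{ts}.{org_id}".encode(), hashlib.sha256).hexdigest()
    return f"{ts}.{mac}"


def verify_edge(secret, org_id, header, now=None, max_age_s=30):
    """Engine side: accept only fresh headers signed with the shared secret."""
    try:
        ts, mac = header.split(".", 1)
        age = (now if now is not None else time.time()) - int(ts)
    except ValueError:
        return False  # malformed header
    expected = hmac.new(secret, f"{ts}.{org_id}".encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    fresh = 0 <= age <= max_age_s
    return fresh and hmac.compare_digest(mac.encode(), expected.encode())
```

Binding the org id into the signed payload means a header minted for one org cannot be replayed for another, and the timestamp limits how long a captured header stays valid.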
Multi-tenant isolation
Every telemetry row is stamped with the org_id resolved from your API key, and
row-level security ensures one org can never read another’s data. The same
boundary holds across the dashboard, the Ask Analytics assistant (which pins
org_id server-side — the model never sees it), and the MCP server.
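Pinning org_id server-side can be pictured with a small sketch. It uses SQLite and an application-level filter purely as a stand-in; the real data tier is Postgres with row-level security, and the table, column, and function names below are hypothetical. The idea shown is that the org filter comes from the authenticated session, never from arguments the model supplies.

```python
import sqlite3


def query_telemetry(conn, session_org_id, model_args):
    """Run an analytics query scoped to the caller's org.

    `model_args` is whatever the assistant asked for; any org_id inside it
    is ignored, because the filter is taken from the session alone.
    """
    limit = int(model_args.get("limit", 10))
    rows = conn.execute(
        "SELECT org_id, model FROM telemetry WHERE org_id = ? ORDER BY rowid LIMIT ?",
        (session_org_id, limit),
    )
    return rows.fetchall()
```

Because the org value is bound as a query parameter on the server, a prompt that tries to name a different org has no field through which to do so.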