Base URL:
https://api.orbitrage.ai/v1 · Auth: Authorization: Bearer orb_…model field: pass auto to route, or a concrete id to pin.
Request
auto (or router / default / orbitrage) to let Orbitrage pick the
cheapest capable model, or any model id (e.g. claude-sonnet-4-6, gpt-5.4,
DeepSeek-V4-Flash) to pin one. See Models.The conversation, in OpenAI format (
role + content). Supports text, image
content blocks (for vision models), and tool/assistant messages.Stream the response as server-sent events.
Tool/function definitions, OpenAI format. Routed to a tool-capable model.
Standard sampling controls (
temperature, top_p, max_tokens, etc.) pass
through to the provider.Example
Response
A standard OpenAIchat.completion object. The model that actually served the
request is in the model field; the routing decision, cost, and savings are
recorded to your telemetry and visible in the dashboard.
Streaming
Withstream: true you receive the usual OpenAI chat.completion.chunk events.
Orbitrage transforms the stream so your SDK gets a clean OpenAI feed:
- internal routing/summary events are captured and stripped (the SDK never sees unknown event types), and
- provider frames with empty
choices(e.g. content-filter pre-frames) are rewritten so SDKs don’t crash onchoices[0].
X-Orbitrage-Overhead-Ms and X-Orbitrage-BYOK from the response headers
to see routing overhead and whether your own key was used.