Models

Orbitrage’s catalog has two access modes, and every chat model is in exactly one of them:

Access mode	Which models	Who bills you
Served by Orbitrage (pooled)	Every open-weight model — the Amazon Bedrock catalog, Fireworks, Cerebras, and the open Azure AI Foundry deployments (MiniMax / Kimi / DeepSeek)	Orbitrage credits, at the upstream price + a 2.5% infra fee
BYOK-only	Every closed-weight frontier chat model: `claude-`, `gpt-` (except `gpt-oss-`), `gemini-`, `grok-*`	Your provider, directly, at your rate. Orbitrage charges $0

BYOK models are still fully first-class — same routing, tracing, cost analytics, guardrails and Tools Gateway — they just need an enabled provider key. Calling one without a key returns 403 byok_key_required; there is no silent pooled fallback. See BYOK.

Provider brand does not determine access mode. OpenAI’s gpt-oss-* models (including gpt-oss-safeguard-*) and Google’s gemma-* models are open-weight — Orbitrage serves and bills those normally. Only the closed frontier lines listed above are BYOK.

Let the router pick (model: "auto") or pin any model by id. Auto routing only ever selects models your organization can actually reach, so it never returns byok_key_required.

Open-source models are addressed by their bare id — e.g. gemma-3-27b-it, not google.gemma-3-27b-it. Orbitrage maps it to the upstream id for you.

The catalog below is representative. The live, authoritative list for your account is always GET https://api.orbitrage.ai/v1/models — see List models. Each entry carries a byok_required flag so you can tell the two modes apart programmatically.

Pricing & tiers

Prices are USD per 1M tokens (input / output).

For served models, that’s the upstream price; your credits are debited at that price plus the 2.5% infra fee.
For BYOK models, that’s the provider’s own list rate — shown for comparison only. Your provider bills you (possibly less, at your negotiated rate); Orbitrage debits nothing.

Frontier — BYOK only

Requires an enabled Anthropic key.

Model	Provider	Input	Output	Vision	Context
`claude-fable-5`	Anthropic	$10.00	$50.00	✓	200K
`claude-opus-4-8`	Anthropic	$5.00	$25.00	✓	200K
`claude-opus-4-7`	Anthropic	$5.00	$25.00	✓	200K
`claude-sonnet-5`	Anthropic	$2.00	$10.00	✓	1M
`claude-sonnet-4-6`	Anthropic	$3.00	$15.00	✓	200K

Requires an enabled OpenAI key.

Model	Provider	Input	Output	Vision	Context
`gpt-5.5`	OpenAI	$5.00	$20.00	✓	400K
`gpt-5.4`	OpenAI	$1.25	$5.00	✓	400K
`gpt-5.3-chat`	OpenAI	$1.50	$6.00	✓	400K
`gpt-5.4-nano`	OpenAI	$0.20	$1.25	✓	256K
`gpt-5.4-mini`	OpenAI	$0.10	$0.40	✓	256K
`gpt-5-nano`	OpenAI	$0.10	$0.40	✓	—
`gpt-4o`	OpenAI	$2.50	$10.00	✓	128K
`gpt-4o-mini`	OpenAI	$0.165	$0.66	✓	—

The full BYOK list also includes gpt-5.5-pro, gpt-5.4-pro, gpt-5.3, gpt-5.3-codex, gpt-5.2, gpt-5.2-codex, gpt-5.1, gpt-5, gpt-5-mini, gpt-chat-latest, gpt-4-turbo, gpt-4, gpt-3.5-turbo, o1, o1-mini, o3, o3-mini and o4-mini. Requires an enabled Google key: the gemini-* family. Requires an enabled xAI key: the grok-* family (grok-4, grok-4.3, grok-4-fast, grok-3 variants, …). claude-sonnet-5 is priced at its introductory rate of

2 /

10 per 1M tokens through Aug 31, 2026 (standard

3 /

15 after). Like Opus 4.7+/Fable 5 it uses adaptive thinking and ignores temperature/top_p — Orbitrage strips them for you. claude-fable-5 is Anthropic’s top-tier model (above Opus) — the highest-quality option in the catalog, used for the hardest reasoning and agentic tasks.

High — served by Orbitrage

Model	Provider	Input	Output	Vision	Context
`Kimi-K2.6`	Azure Foundry	$0.95	$4.00	✓	200K
`DeepSeek-V3.2`	Azure Foundry	$0.40	$1.10	—	131K
`glm-5.2`	Fireworks	$1.40	$4.40	—	1M

Mid — served by Orbitrage

Model	Provider	Input	Output	Vision	Context
`FW-MiniMax-M2.5`	Azure Foundry	$0.30	$1.20	—	262K
`DeepSeek-V4-Flash`	Azure Foundry	$0.14	$0.28	—	131K
`minimax-m3`	Fireworks	$0.30	$1.20	✓	512K

Basic — served by Orbitrage

Model	Provider	Input	Output	Vision	Context
`gpt-oss-20b`	Bedrock	$0.07	$0.30	—	128K
`nemotron-nano-9b-v2`	Bedrock	$0.06	$0.23	—	128K
`glm-4.7-flash`	Bedrock	$0.07	$0.40	—	203K
`gemma-3-4b-it`	Bedrock	$0.04	$0.08	✓	128K

Open source (Amazon Bedrock, served by Orbitrage)

A broad open-weights catalog, billed to your Orbitrage credits with the same 2.5% infra fee. Address each by its bare id (the engine adds the vendor prefix upstream). Vision models accept image content blocks.

Model	Vendor	Input	Output	Vision	Context
`gemma-3-27b-it`	Google	$0.23	$0.38	✓	128K
`gemma-3-12b-it`	Google	$0.09	$0.29	✓	128K
`gemma-3-4b-it`	Google	$0.04	$0.08	✓	128K
`qwen3-vl-235b-a22b-instruct`	Qwen	$0.53	$2.66	✓	256K
`qwen3-coder-next`	Qwen	$0.50	$1.20	—	256K
`qwen3-coder-480b-a35b-instruct`	Qwen	$0.45	$1.80	—	128K
`qwen3-coder-30b-a3b-instruct`	Qwen	$0.15	$0.60	—	256K
`qwen3-next-80b-a3b-instruct`	Qwen	$0.14	$1.20	—	256K
`qwen3-235b-a22b-2507`	Qwen	$0.22	$0.88	—	256K
`qwen3-32b`	Qwen	$0.15	$0.60	—	32K
`deepseek-v3.2`	DeepSeek	$0.62	$1.85	—	164K
`deepseek-v3.1`	DeepSeek	$0.58	$1.68	—	128K
`glm-5.2`	Z.AI	$1.40	$4.40	—	1M
`glm-5`	Z.AI	$1.00	$3.20	—	200K
`glm-4.7`	Z.AI	$0.60	$2.20	—	203K
`glm-4.7-flash`	Z.AI	$0.07	$0.40	—	203K
`glm-4.6`	Z.AI	$0.60	$2.20	—	203K
`kimi-k2.7-code`	Moonshot	$0.95	$4.00	✓	262K
`kimi-k2.5`	Moonshot	$0.60	$3.00	✓	256K
`kimi-k2-thinking`	Moonshot	$0.60	$2.50	—	256K
`minimax-m3`	MiniMax	$0.30	$1.20	✓	512K
`minimax-m2.5`	MiniMax	$0.30	$1.20	—	196K
`minimax-m2.1`	MiniMax	$0.30	$1.20	—	196K
`minimax-m2`	MiniMax	$0.30	$1.20	—	1M
`nemotron-super-3-120b`	NVIDIA	$0.15	$0.65	—	256K
`nemotron-nano-3-30b`	NVIDIA	$0.06	$0.24	—	256K
`nemotron-nano-12b-v2`	NVIDIA	$0.20	$0.60	✓	128K
`nemotron-nano-9b-v2`	NVIDIA	$0.06	$0.23	—	128K
`mistral-large-3-675b-instruct`	Mistral	$0.50	$1.50	✓	256K
`devstral-2-123b`	Mistral	$0.40	$2.00	—	256K
`magistral-small-2509`	Mistral	$0.50	$1.50	✓	128K
`ministral-3-14b-instruct`	Mistral	$0.20	$0.20	✓	128K
`ministral-3-8b-instruct`	Mistral	$0.15	$0.15	✓	128K
`ministral-3-3b-instruct`	Mistral	$0.10	$0.10	✓	128K
`voxtral-small-24b-2507`	Mistral	$0.10	$0.30	—	32K
`voxtral-mini-3b-2507`	Mistral	$0.04	$0.04	—	32K
`palmyra-vision-7b`	Writer	$0.15	$0.60	✓	4K
`gpt-oss-120b`	OpenAI	$0.15	$0.60	—	128K
`gpt-oss-20b`	OpenAI	$0.07	$0.30	—	128K
`gpt-oss-safeguard-120b`	OpenAI	$0.15	$0.60	—	128K
`gpt-oss-safeguard-20b`	OpenAI	$0.07	$0.20	—	128K

Ultra-fast (Cerebras — Pro plan only)

Cerebras runs open models on wafer-scale hardware at ~1,000–3,000 tokens/sec — near-instant responses. These are addressed by a distinct -cerebras id so they never collide with the standard (Bedrock) versions.

Cerebras models require the Pro plan. A free-plan request for one returns 403 pro_plan_required; on the Models page they’re marked Pro only and can’t be enabled. They’re also explicit-pin only — model: "auto" never routes to them.

Model	Backing model	Speed	Input	Output	Vision	Context
`gpt-oss-120b-cerebras`	GPT-OSS 120B	~3000 tok/s	$0.50	$1.38	—	128K
`glm-4.7-cerebras`	Z.AI GLM 4.7	~1000 tok/s	$1.20	$4.40	—	128K
`gemma-4-31b-cerebras`	Gemma 4 31B	~1850 tok/s	$0.70	$1.10	✓	128K

Image

Model	Provider	Billing
`gpt-image-2`	Azure	Per image

Audio (managed, included)

Speech-to-text and text-to-speech are offered as a managed service on Deepgram — no BYOK needed. See Audio.

Model	Provider	Type	Billing
`nova-3`	Deepgram	Speech-to-text	Per minute of audio
`nova-3-multilingual`	Deepgram	Speech-to-text	Per minute of audio
`nova-2`	Deepgram	Speech-to-text	Per minute of audio
`aura-2-thalia-en`	Deepgram	Text-to-speech	Per 1,000 characters

The full catalog also includes the DeepSeek R1 reasoning model and the whole open-source Bedrock catalog above (served), plus the remaining GPT-5, Gemini and Grok variants (BYOK). Query /v1/models for everything enabled on your account — each entry’s byok_required flag tells you which mode it’s in.

Cost, savings, and the infra fee

Every call records four cost figures, so you can see exactly where your money goes:

Field	Meaning
Cost (`cost_usd`)	What Orbitrage bills you — upstream price plus the 2.5% infra fee. Always `0` on a BYOK call.
Provider cost	The raw upstream price Orbitrage paid (internal). `0` on a BYOK call — your provider billed you, not us.
Baseline cost	What the same call would have cost on a frontier baseline (`claude-sonnet-4-6`).
Saved (`saved_usd`)	`baseline − cost` — the routing savings, never negative.

The baseline lets the dashboard answer “how much did routing save me?” by comparing every routed call against a single frontier model.

cost_usd = 0 covers the LLM tokens only. If the turn invoked the managed Tools Gateway, those tool calls ran on Orbitrage’s pooled tool keys and are billed normally — so a BYOK turn’s cost is exactly its tool cost.

Vision & multimodal

Vision-capable models accept image content blocks in the standard OpenAI format, and the router prefers one automatically when your prompt contains images. The engine also exposes image generation/editing and audio (transcription, translation, speech) endpoints — see the API reference.

Pinning vs. aliases

Pin a model by passing its exact id — no scoring, straight to that model.
Route by passing auto (or router / default / orbitrage).
Model ids match flexibly — claude-sonnet-4.6 and claude-sonnet-4-6 resolve to the same model, and family names (e.g. claude-opus, grok-4) resolve to the appropriate provider and tier.
Pinning a BYOK model without an enabled key for its vendor fails fast with 403 byok_key_required rather than quietly routing somewhere else. Pin an open-weight id, or use auto, if you’d rather never see that error.

Get Started

Core Concepts

SDKs

Integrations

Examples

Dashboard

Platform

Account & Billing

Pricing & tiers

Frontier — BYOK only

High — served by Orbitrage

Mid — served by Orbitrage

Basic — served by Orbitrage

Open source (Amazon Bedrock, served by Orbitrage)

Ultra-fast (Cerebras — Pro plan only)

Image

Audio (managed, included)

Cost, savings, and the infra fee

Vision & multimodal

Pinning vs. aliases

​Pricing & tiers

​Frontier — BYOK only

​High — served by Orbitrage

​Mid — served by Orbitrage

​Basic — served by Orbitrage

​Open source (Amazon Bedrock, served by Orbitrage)

​Ultra-fast (Cerebras — Pro plan only)

​Image

​Audio (managed, included)

​Cost, savings, and the infra fee

​Vision & multimodal

​Pinning vs. aliases

Pricing & tiers

Frontier — BYOK only

High — served by Orbitrage

Mid — served by Orbitrage

Basic — served by Orbitrage

Open source (Amazon Bedrock, served by Orbitrage)

Ultra-fast (Cerebras — Pro plan only)

Image

Audio (managed, included)

Cost, savings, and the infra fee

Vision & multimodal

Pinning vs. aliases