Set stream: true as usual (stream=True in the Python SDK). Orbitrage streams the
response straight through and records the call once the stream finishes.

import os, orbitrage
orbitrage.init(os.environ["ORBITRAGE_API_KEY"], user_id="customer_42")

from openai import OpenAI
stream = OpenAI().chat.completions.create(
    model="grok-4-fast",
    messages=[{"role": "user", "content": "Count to five."}],
    stream=True,
)
# Print each content delta as it arrives; some chunks carry no content.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Orbitrage strips its internal routing events from the stream and rewrites
empty-choices frames, so SDKs never crash mid-stream. Your code receives only
the normal OpenAI chunks.
Streamed calls also record first-token latency and inter-token latency (p50/p95),
visible in the call’s details in the dashboard.
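
To make these metrics concrete, here is a minimal client-side sketch of how first-token latency and inter-token p50/p95 could be computed from a stream of text deltas. It is an illustration only: the helper names `measure_stream` and `percentile` are hypothetical, the nearest-rank percentile method is an assumption, and Orbitrage's own measurement (taken on its side of the connection) may differ in both method and values.

```python
import math
import time
from typing import Callable, Iterable, Optional


def percentile(values: list[float], pct: float) -> Optional[float]:
    """Nearest-rank percentile (an assumed method); None for an empty list."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def measure_stream(
    deltas: Iterable[Optional[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume text deltas and time their arrival (hypothetical helper)."""
    start = clock()
    arrivals = []
    for delta in deltas:
        if delta:  # skip chunks that carry no content
            arrivals.append(clock())
    # Gaps between consecutive non-empty deltas, in seconds.
    gaps = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
    return {
        "first_token_latency": arrivals[0] - start if arrivals else None,
        "inter_token_p50": percentile(gaps, 50),
        "inter_token_p95": percentile(gaps, 95),
    }
```

With the stream from the example above, the deltas could be supplied as a generator, for instance `measure_stream(chunk.choices[0].delta.content for chunk in stream)`. Note that this timer starts when `measure_stream` is called, so for a meaningful first-token latency the request should be issued as close to that call as possible.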