Set stream to true as usual (stream=True in the Python SDK). Orbitrage passes the response through to your code as it arrives and records the call once the stream finishes.
import os, orbitrage
orbitrage.init(os.environ["ORBITRAGE_API_KEY"], user_id="customer_42")

from openai import OpenAI
stream = OpenAI().chat.completions.create(
    model="grok-4-fast",
    messages=[{"role": "user", "content": "Count to five."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
Orbitrage strips its internal routing events from the stream and rewrites frames that have an empty choices list, so SDKs do not crash mid-stream. Your code receives only standard OpenAI chunks.
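To make the idea concrete, here is a minimal sketch of that kind of frame normalization, written over plain dictionaries. It is not Orbitrage's implementation: the marker key x_internal_event, the function normalize_frames, and the placeholder choice it inserts are all hypothetical, chosen only to illustrate dropping internal frames and padding empty choices lists.

```python
from typing import Any, Dict, Iterable, Iterator

# Hypothetical marker for an internal routing event. The real wire format
# is not described in this document; this key is an assumption.
INTERNAL_KEY = "x_internal_event"


def normalize_frames(frames: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Drop internal frames and pad frames whose choices list is empty."""
    for frame in frames:
        if INTERNAL_KEY in frame:
            # Internal event: never forwarded to the client.
            continue
        if not frame.get("choices"):
            # Assumed rewrite: add one placeholder choice with an empty delta
            # so that code indexing choices[0] keeps working.
            placeholder = {"index": 0, "delta": {}, "finish_reason": None}
            frame = {**frame, "choices": [placeholder]}
        yield frame


raw = [
    {INTERNAL_KEY: "route_selected"},
    {"choices": [{"index": 0, "delta": {"content": "1"}, "finish_reason": None}]},
    {"choices": [], "usage": {"total_tokens": 7}},
]
cleaned = list(normalize_frames(raw))
```

In this sketch the internal frame disappears, the content frame is untouched, and the usage-only frame keeps its usage data while gaining a placeholder choice.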
Streamed calls also record first-token latency and inter-token latency (p50 and p95), which are shown in the call’s details in the dashboard.
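For readers who want to see what these metrics mean, the sketch below computes them from client-side timestamps. It assumes first-token latency is the time from sending the request to the first content chunk, and that inter-token gaps are the differences between consecutive chunk arrival times. The function names and the nearest-rank percentile method are illustrative choices; how Orbitrage computes its own figures may differ.

```python
import math
from typing import Dict, List, Optional


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def stream_latency_stats(
    request_start: float, token_times: List[float]
) -> Dict[str, Optional[float]]:
    """Summarize a stream from the request time and per-chunk arrival times."""
    if not token_times:
        raise ValueError("at least one chunk timestamp is required")
    # Gaps between consecutive chunks; empty if only one chunk arrived.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    return {
        "first_token": token_times[0] - request_start,
        "inter_token_p50": percentile(gaps, 50) if gaps else None,
        "inter_token_p95": percentile(gaps, 95) if gaps else None,
    }


# Example timestamps in seconds, relative to the moment the request was sent.
stats = stream_latency_stats(0.0, [0.30, 0.35, 0.40, 0.50, 0.90])
single = stream_latency_stats(0.0, [0.2])
```

To collect real timestamps, record time.perf_counter() just before creating the stream and again each time the loop sees a non-empty delta.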