Commit 62a9111

committed
Update docs surrounding OAI compliant endpoints. Add section on newly supported responses endpoints.
1 parent 167ed0f commit 62a9111

4 files changed

Lines changed: 214 additions & 14 deletions

docs/docs.json

Lines changed: 4 additions & 2 deletions
@@ -52,7 +52,8 @@
5252
"icon": "code",
5353
"pages": [
5454
"http-reference/examples/openai-compliant/chatCompletionsExample",
55-
"http-reference/examples/openai-compliant/completionsExample"
55+
"http-reference/examples/openai-compliant/completionsExample",
56+
"http-reference/examples/openai-compliant/responsesExample"
5657
]
5758
},
5859
{
@@ -165,7 +166,8 @@
165166
"group": "Open AI Completions",
166167
"pages": [
167168
"http-reference/examples/openai-compliant/chatCompletionsExample",
168-
"http-reference/examples/openai-compliant/completionsExample"
169+
"http-reference/examples/openai-compliant/completionsExample",
170+
"http-reference/examples/openai-compliant/responsesExample"
169171
]
170172
},
171173
{

docs/http-reference/examples/openai-compliant/chatCompletionsExample.mdx

Lines changed: 1 addition & 5 deletions
@@ -11,11 +11,7 @@ To specify a provider, prefix the model with the provider, e.g. `gpt-4` should b
1111

1212
We provide access to models from `openai`, `mistral`, and `google`.
1313

14-
You will need to supply a header `provider-key` in order to make requests to `anthropic`, and `cohere` models.
15-
16-
e.g. If you are trying to run `anthropic/claude-sonnet-4-5`, `provider-key` will be an Anthropic key.
17-
18-
For unlimited rate limits you will need to supply a header `provider-key`.
14+
You will need to supply a header `provider-key` in order to make requests to `cohere` models.
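The header rule above can be sketched as a small request-header builder. This is only an illustration: `build_headers` is a hypothetical helper, not part of any SDK, the header names mirror the curl examples in this commit, and `cohere/some-model` in the usage lines is a placeholder model id.

```python
from typing import Dict, Optional


def build_headers(bytez_key: str, model: str,
                  provider_key: Optional[str] = None) -> Dict[str, str]:
    """Build HTTP headers for a request to a provider-prefixed model.

    Hypothetical helper: header names follow the curl examples in these docs.
    """
    headers = {
        "Authorization": bytez_key,
        "Content-Type": "application/json",
    }
    # Per the docs, `cohere` models require the caller's own provider key.
    if model.startswith("cohere/") and not provider_key:
        raise ValueError("cohere models require a provider-key header")
    if provider_key:
        headers["provider-key"] = provider_key
    return headers


# Usage: the key strings are placeholders, not real credentials.
openai_headers = build_headers("BYTEZ_KEY", "openai/gpt-5.1")
cohere_headers = build_headers("BYTEZ_KEY", "cohere/some-model", "PROVIDER_KEY")
```

Failing early on a missing key keeps the error local instead of waiting for the server to reject the request.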
1915

2016
**NOTE:** Logprobs are supported for all models!
2117

docs/http-reference/examples/openai-compliant/completionsExample.mdx

Lines changed: 1 addition & 7 deletions
@@ -11,13 +11,7 @@ To specify a provider, prefix the model with the provider, e.g. `davinci-002` sh
1111

1212
We provide access to models from `openai`, `mistral`, and `google`.
1313

14-
You will need to supply a header `provider-key` in order to make requests to `anthropic`, and `cohere` models.
15-
16-
For unlimited rate limits you will need to supply a header `provider-key`.
17-
18-
e.g. If you are trying to run `openai/davinci-002` with unlimited rate limits, `provider-key` will be an Open AI key.
19-
20-
If it were an `anthropic` model it would be an Anthropic key.
14+
You will need to supply a header `provider-key` in order to make requests to `cohere` models.
2115

2216
**NOTE:** Logprobs are supported for all models!
2317

Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
1+
---
2+
title: 'Responses'
3+
description: Use the OpenAI-compatible Responses endpoint via OpenAI clients, supporting streaming, tool calling, and reasoning ("thinking") parameters.
4+
icon: 'robot'
5+
mode: 'wide'
6+
---
7+
8+
Provides the Responses endpoint for the supported closed-source providers: `openai` and `anthropic`.
9+
10+
The Responses API is the unified successor to Chat Completions: you send `input` (text, images, files, tool outputs, etc.) and receive a `response` object that can contain messages, tool calls, and (for reasoning models) reasoning items.
11+
12+
**Note:** tool calls are not yet supported for `anthropic` models.
13+
14+
To specify a provider, prefix the model with the provider. For example, `gpt-5.1` should be passed as `openai/gpt-5.1`.
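A minimal sketch of the prefixing rule, assuming only the two providers named on this page. `qualify_model` and `SUPPORTED_PROVIDERS` are hypothetical names used for illustration and are not part of the OpenAI client or any Bytez SDK.

```python
# Hypothetical helper: builds the "provider/model" id described above.
SUPPORTED_PROVIDERS = {"openai", "anthropic"}  # providers named on this page


def qualify_model(provider: str, model: str) -> str:
    """Return the provider-prefixed model id, e.g. 'openai/gpt-5.1'."""
    provider = provider.strip().lower()
    model = model.strip()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    if not model:
        raise ValueError("model name must not be empty")
    # Avoid double-prefixing when the caller already passed "openai/gpt-5.1".
    if model.startswith(provider + "/"):
        return model
    return f"{provider}/{model}"


model_id = qualify_model("openai", "gpt-5.1")
```

The resulting string is what would be passed as the `model` argument in the examples that follow.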
15+
16+
## Thinking (Reasoning) parameters
17+
18+
Some OpenAI reasoning models (e.g. `openai/gpt-5.x`, `openai/o3`, `openai/o4-mini`) support the `reasoning` object:
19+
20+
- `reasoning.effort`: `"none" | "low" | "medium" | "high" | ...` (model-dependent)
21+
- `reasoning.summary`: `"none" | "auto" | "detailed"` (optional)
22+
23+
All Anthropic models should support thinking.
24+
25+
⚠️ `max_output_tokens` limits **reasoning tokens + visible output tokens**, so if you increase `reasoning.effort`, consider raising `max_output_tokens`.
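Because the cap covers reasoning and visible tokens together, one way to budget is to add an explicit reasoning allowance on top of the visible-output target. The sketch below only assembles a request body. `build_responses_body` and its `reasoning_allowance` parameter are hypothetical, and the numbers are illustrative rather than recommended values.

```python
import json


def build_responses_body(model: str, user_text: str, effort: str = "medium",
                         summary: str = "auto", visible_tokens: int = 300,
                         reasoning_allowance: int = 0) -> dict:
    """Assemble a JSON-serialisable body for the Responses endpoint.

    `reasoning_allowance` is a hypothetical knob: extra tokens reserved for
    reasoning so that the visible answer is less likely to be cut short.
    """
    if visible_tokens <= 0 or reasoning_allowance < 0:
        raise ValueError("token budgets must be positive")
    return {
        "model": model,
        "input": [{"role": "user", "content": user_text}],
        "reasoning": {"effort": effort, "summary": summary},
        # One shared cap: reasoning tokens + visible output tokens.
        "max_output_tokens": visible_tokens + reasoning_allowance,
    }


body = build_responses_body(
    "openai/gpt-5.1",
    "What is the capital of England?",
    effort="high",
    visible_tokens=300,
    reasoning_allowance=700,  # illustrative figure, not a measured requirement
)
payload = json.dumps(body)
```

The right allowance depends on the model and effort level, so it is worth tuning against real responses.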
26+
27+
<AccordionGroup>
28+
<Accordion defaultOpen="false" title="Basic usage (Closed Source + Thinking)">
29+
30+
<CodeGroup>
31+
```javascript javascript
32+
import OpenAI from "openai";
33+
34+
const client = new OpenAI({
35+
apiKey: "BYTEZ_KEY",
36+
baseURL: "https://api.bytez.com/models/v2/openai/v1"
37+
});
38+
39+
const response = await client.responses.create({
40+
model: "openai/gpt-5.1",
41+
input: [
42+
{ role: "system", content: "You are a friendly chatbot" },
43+
{ role: "user", content: "Hello bot, what is the capital of England?" }
44+
],
45+
46+
// Thinking controls (OpenAI reasoning models only)
47+
reasoning: {
48+
effort: "medium", // try: "none" | "low" | "medium" | "high" (model-dependent)
49+
summary: "auto" // "none" | "auto" | "detailed"
50+
},
51+
52+
// Caps reasoning + visible output tokens together
53+
max_output_tokens: 300
54+
});
55+
56+
console.log("Answer:", response.output_text);
57+
58+
// Optional: read a reasoning summary item (if requested & returned)
59+
const reasoningItem = response.output?.find((it) => it.type === "reasoning");
60+
if (reasoningItem?.summary?.length) {
61+
console.log("Reasoning summary:", reasoningItem.summary.map(s => s.text).join("\n"));
62+
}
63+
```
64+
```python python
65+
from openai import OpenAI
66+
67+
client = OpenAI(
68+
api_key="BYTEZ_KEY",
69+
base_url="https://api.bytez.com/models/v2/openai/v1"
70+
)
71+
72+
response = client.responses.create(
73+
model="openai/gpt-5.1",
74+
input=[
75+
{"role": "system", "content": "You are a friendly chatbot"},
76+
{"role": "user", "content": "Hello bot, what is the capital of England?"},
77+
],
78+
reasoning={
79+
"effort": "medium",
80+
"summary": "auto",
81+
},
82+
max_output_tokens=300,
83+
84+
)
85+
86+
print("Answer:", response.output_text)
87+
88+
reasoning_items = [it for it in response.output if it.type == "reasoning"]
89+
if reasoning_items and getattr(reasoning_items[0], "summary", None):
90+
    # Join every summary part, matching the JavaScript example
    print("Reasoning summary:", "\n".join(s.text for s in reasoning_items[0].summary))
91+
```
92+
```bash http
93+
curl -X POST 'https://api.bytez.com/models/v2/openai/v1/responses' \
94+
-H 'Authorization: BYTEZ_KEY' \
95+
-H 'provider-key: PROVIDER_KEY' \
96+
-H 'Content-Type: application/json' \
97+
--data '{
98+
"model": "openai/gpt-5.1",
99+
"input": [
100+
{"role": "system", "content": "You are a friendly chatbot"},
101+
{"role": "user", "content": "Hello bot, what is the capital of England?"}
102+
],
103+
"reasoning": { "effort": "medium", "summary": "auto" },
104+
"max_output_tokens": 300
105+
}'
106+
```
107+
</CodeGroup>
108+
109+
</Accordion>
110+
111+
<Accordion defaultOpen="false" title="Streaming (Closed Source + Thinking + Reasoning Summary Events)">
112+
113+
<CodeGroup>
114+
```javascript javascript
115+
import OpenAI from "openai";
116+
117+
const client = new OpenAI({
118+
apiKey: "BYTEZ_KEY",
119+
baseURL: "https://api.bytez.com/models/v2/openai/v1"
120+
});
121+
122+
const stream = await client.responses.create({
123+
model: "openai/gpt-5.1",
124+
input: [
125+
{ role: "system", content: "You are a friendly chatbot" },
126+
{ role: "user", content: "Hello bot, what is the capital of England?" }
127+
],
128+
reasoning: { effort: "medium", summary: "auto" },
129+
max_output_tokens: 400,
130+
stream: true
131+
});
132+
133+
let text = "";
134+
let reasoningSummary = "";
135+
136+
for await (const event of stream) {
137+
if (event.type === "response.output_text.delta") {
138+
text += event.delta;
139+
process.stdout.write(event.delta);
140+
}
141+
142+
// Optional: stream reasoning summary text (if enabled by reasoning.summary)
143+
if (event.type === "response.reasoning_summary_text.delta") {
144+
reasoningSummary += event.delta;
145+
}
146+
147+
if (event.type === "response.completed") break;
148+
}
149+
150+
console.log("\n\nFinal:", { text });
151+
if (reasoningSummary) console.log("\nReasoning summary:\n", reasoningSummary);
152+
```
153+
```python python
154+
from openai import OpenAI
155+
156+
client = OpenAI(
157+
api_key="BYTEZ_KEY",
158+
base_url="https://api.bytez.com/models/v2/openai/v1"
159+
)
160+
161+
stream = client.responses.create(
162+
model="openai/gpt-5.1",
163+
input=[
164+
{"role": "system", "content": "You are a friendly chatbot"},
165+
{"role": "user", "content": "Hello bot, what is the capital of England?"},
166+
],
167+
reasoning={"effort": "medium", "summary": "auto"},
168+
max_output_tokens=400,
169+
stream=True,
170+
)
171+
172+
text = ""
173+
reasoning_summary = ""
174+
175+
for event in stream:
176+
    if event.type == "response.output_text.delta":
177+
        text += event.delta
178+
        print(event.delta, end="", flush=True)
179+
    elif event.type == "response.reasoning_summary_text.delta":
180+
        reasoning_summary += event.delta
181+
    elif event.type == "response.completed":
182+
        break
183+
184+
print("\n\nFinal:", {"text": text})
185+
if reasoning_summary:
186+
    print("\nReasoning summary:\n", reasoning_summary)
187+
```
188+
```bash http
189+
curl -N -X POST 'https://api.bytez.com/models/v2/openai/v1/responses' \
190+
-H 'Authorization: BYTEZ_KEY' \
191+
-H 'provider-key: PROVIDER_KEY' \
192+
-H 'Content-Type: application/json' \
193+
--data '{
194+
"model": "openai/gpt-5.1",
195+
"input": [
196+
{"role": "system", "content": "You are a friendly chatbot"},
197+
{"role": "user", "content": "Hello bot, what is the capital of England?"}
198+
],
199+
"reasoning": { "effort": "medium", "summary": "auto" },
200+
"max_output_tokens": 400,
201+
"stream": true
202+
}'
203+
```
204+
</CodeGroup>
205+
206+
</Accordion>
207+
208+
</AccordionGroup>
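The event handling in the streaming examples can be exercised without a network call by folding a list of stand-in events. This is a sketch under stated assumptions: `collect_stream` is a hypothetical helper, and the `SimpleNamespace` objects only imitate the `type` and `delta` attributes that the examples read from real stream events.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Fold streamed events into (answer_text, reasoning_summary)."""
    text_parts, summary_parts = [], []
    for event in events:
        if event.type == "response.output_text.delta":
            text_parts.append(event.delta)
        elif event.type == "response.reasoning_summary_text.delta":
            summary_parts.append(event.delta)
        elif event.type == "response.completed":
            break  # stop reading once the response is complete
    return "".join(text_parts), "".join(summary_parts)


# Stand-in events for illustration only; real events come from the client stream.
fake_events = [
    SimpleNamespace(type="response.reasoning_summary_text.delta", delta="Recall the capital."),
    SimpleNamespace(type="response.output_text.delta", delta="Lon"),
    SimpleNamespace(type="response.output_text.delta", delta="don"),
    SimpleNamespace(type="response.completed"),
    SimpleNamespace(type="response.output_text.delta", delta="ignored"),
]

answer, summary = collect_stream(fake_events)
```

Keeping the fold separate from the network call makes the accumulation logic easy to unit test.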
