Skip to content

Commit 49377db

Browse files
authored
fix(providers): unify OPENAI_TIMEOUT_MS + AGENTMEMORY_LLM_TIMEOUT_MS (#446) (#453)
* fix(providers): unify OPENAI_TIMEOUT_MS + AGENTMEMORY_LLM_TIMEOUT_MS (#446) v0.9.17 shipped OPENAI_TIMEOUT_MS scoped to the OpenAI LLM provider (inline AbortController, default 60s). PR #379 then shipped AGENTMEMORY_LLM_TIMEOUT_MS in the shared src/providers/_fetch.ts helper used by every other raw-fetch provider (Gemini, OpenRouter, MiniMax, OpenAI/Cohere/Voyage/OpenRouter embedding). Two env vars, same value, different names — ops confusion. Unify on the global name while keeping back-compat: 1. OPENAI_TIMEOUT_MS — OpenAI-scoped alias, takes precedence 2. AGENTMEMORY_LLM_TIMEOUT_MS — global fall-back across providers 3. 60 000 ms default The OpenAI LLM provider now routes through the shared fetchWithTimeout helper, dropping ~30 lines of duplicate AbortController + clearTimeout plumbing. Existing users with OPENAI_TIMEOUT_MS set keep the exact v0.9.17 behaviour; new users setting AGENTMEMORY_LLM_TIMEOUT_MS get the OpenAI LLM path covered too. README + .env.example now document AGENTMEMORY_LLM_TIMEOUT_MS as the canonical name and note OPENAI_TIMEOUT_MS as the OpenAI-scoped alias. 4 new precedence tests in test/fetch-timeout.test.ts cover all four env-var combinations. * test(providers): strict parse for OPENAI_TIMEOUT_MS env (CodeRabbit) CodeRabbit caught parseInt("30ms", 10) silently returning 30 in the timeout-resolve path. Real users hitting this would think they bound the call to 30ms when the regex would have rejected it. parsePositiveInt now rejects anything that isn't pure digits via /^\d+$/ (after trim). parseInt's lenience on trailing units / underscores / signs is gone — those fall back to the 60s default instead of masquerading as an aggressive bound. New regression test covers "30ms", "1_000", "60s", "30abc", "-30", "0". Whitespace padding (e.g. " 30 ") is still accepted — that's normal env-var handling. 992/992 tests pass on the worktree.
1 parent 00df540 commit 49377db

4 files changed

Lines changed: 148 additions & 27 deletions

File tree

.env.example

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,12 @@
4545

4646
# MAX_TOKENS=4096 # Cap LLM completion tokens for compression / summarise calls
4747

48+
# Outbound LLM / embedding timeout — shared across every raw-fetch provider
49+
# (Gemini, OpenRouter, MiniMax, OpenAI LLM, and OpenAI/Cohere/Voyage/OpenRouter
50+
# embedding). The OpenAI LLM path also honours the OpenAI-scoped
51+
# OPENAI_TIMEOUT_MS alias for back-compat with v0.9.17 (precedence).
52+
# AGENTMEMORY_LLM_TIMEOUT_MS=60000 # Default: 60 000 ms (60 s)
53+
4854
# Opt-in Claude-subscription fallback (spawns @anthropic-ai/claude-agent-sdk
4955
# child sessions). Off by default — the agent-sdk fallback can trigger
5056
# Stop-hook recursion (#149 follow-up) when invoked from inside Claude Code.

README.md

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1058,7 +1058,10 @@ Create `~/.agentmemory/.env`:
10581058
# # api-key header + api-version query param.
10591059
# OPENAI_API_VERSION=2024-08-01-preview # Optional: Azure api-version query param
10601060
# OPENAI_MODEL=gpt-4o-mini # Optional: default model
1061-
# OPENAI_TIMEOUT_MS=60000 # Optional: outbound fetch timeout (default 60s)
1061+
# OPENAI_TIMEOUT_MS=60000 # Optional: OpenAI-scoped alias for the outbound fetch
1062+
# # timeout. Takes precedence over AGENTMEMORY_LLM_TIMEOUT_MS
1063+
# # for back-compat with v0.9.17. New configs should
1064+
# # prefer the global AGENTMEMORY_LLM_TIMEOUT_MS below.
10621065
# OPENAI_REASONING_EFFORT=none # Optional: "low" | "medium" | "high" | "none"
10631066
# # Honored only by OpenAI's reasoning models (o1, o3,
10641067
# # gpt-*-reasoning) and providers that mirror that
@@ -1083,7 +1086,11 @@ Create `~/.agentmemory/.env`:
10831086
# Outbound LLM / embedding timeout
10841087
# AGENTMEMORY_LLM_TIMEOUT_MS=60000 # Default: 60 000 ms (60 s). Applies to every
10851088
# raw-fetch provider (Gemini, OpenRouter, MiniMax,
1086-
# OpenAI/Cohere/Voyage/OpenRouter embedding).
1089+
# OpenAI LLM, OpenAI/Cohere/Voyage/OpenRouter
1090+
# embedding). For the OpenAI LLM path, the
1091+
# OpenAI-scoped OPENAI_TIMEOUT_MS alias (above)
1092+
# takes precedence when set, for back-compat
1093+
# with v0.9.17.
10871094
# Increase for slow networks or large batch calls;
10881095
# decrease to fail-fast on rate-limit holds.
10891096

src/providers/openai.ts

Lines changed: 52 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
import type { MemoryProvider } from "../types.js";
22
import { getEnvVar } from "../config.js";
3+
import { fetchWithTimeout } from "./_fetch.js";
34

45
const DEFAULT_BASE_URL = "https://api.openai.com";
56
const DEFAULT_MODEL = "gpt-4o-mini";
@@ -25,7 +26,12 @@ const DEFAULT_AZURE_API_VERSION = "2024-08-01-preview";
2526
* Azure: https://<resource>.openai.azure.com/openai/deployments/<deployment>
2627
* OPENAI_MODEL — model name (default: gpt-4o-mini)
2728
* OPENAI_API_VERSION — Azure api-version query param (default: 2024-08-01-preview)
28-
* OPENAI_TIMEOUT_MS — outbound fetch timeout in ms (default: 60000)
29+
* OPENAI_TIMEOUT_MS — outbound fetch timeout in ms (OpenAI-scoped alias,
30+
* takes precedence over AGENTMEMORY_LLM_TIMEOUT_MS
31+
* for back-compat with the v0.9.17 shipping name).
32+
* AGENTMEMORY_LLM_TIMEOUT_MS — outbound fetch timeout in ms shared across all
33+
* raw-fetch LLM + embedding providers. Used when
34+
* OPENAI_TIMEOUT_MS is not set. Default: 60000.
2935
* MAX_TOKENS — max output tokens (default: from config or 4096)
3036
* OPENAI_REASONING_EFFORT — "low" | "medium" | "high" | "none"
3137
* Passthrough for reasoning models (e.g. Ollama Cloud
@@ -54,7 +60,7 @@ export class OpenAIProvider implements MemoryProvider {
5460
DEFAULT_BASE_URL
5561
).replace(/\/+$/, "");
5662
this.reasoningEffort = getEnvVar("OPENAI_REASONING_EFFORT") || undefined;
57-
this.timeoutMs = parseTimeout(getEnvVar("OPENAI_TIMEOUT_MS"));
63+
this.timeoutMs = resolveTimeout();
5864
this.azureApiVersion =
5965
getEnvVar("OPENAI_API_VERSION") || DEFAULT_AZURE_API_VERSION;
6066
this.isAzure = detectAzure(this.baseUrl);
@@ -107,33 +113,31 @@ export class OpenAIProvider implements MemoryProvider {
107113
body.reasoning_effort = this.reasoningEffort;
108114
}
109115

110-
// Bound the request with an AbortController so a hung provider
111-
// can't stall the worker. The other raw-fetch providers
112-
// (anthropic, gemini, openrouter, minimax) have the same gap
113-
// tracked in a follow-up issue; this PR fixes it for the new
114-
// surface only.
115-
const ac = new AbortController();
116-
const t = setTimeout(() => ac.abort(), this.timeoutMs);
116+
// Bound the request via the shared fetchWithTimeout helper, which
117+
// owns the AbortController + clearTimeout cleanup for every raw-fetch
118+
// provider (minimax, openrouter, gemini, openrouter-embed, etc.).
119+
// OPENAI_TIMEOUT_MS keeps its v0.9.17 meaning (OpenAI-scoped alias,
120+
// takes precedence); when unset we fall through to
121+
// AGENTMEMORY_LLM_TIMEOUT_MS and finally the 60s default. See #446.
117122
let response: Response;
118123
try {
119-
response = await fetch(url, {
120-
method: "POST",
121-
headers: this.buildHeaders(),
122-
body: JSON.stringify(body),
123-
signal: ac.signal,
124-
});
124+
response = await fetchWithTimeout(
125+
url,
126+
{
127+
method: "POST",
128+
headers: this.buildHeaders(),
129+
body: JSON.stringify(body),
130+
},
131+
this.timeoutMs,
132+
);
125133
} catch (err) {
126-
const aborted =
127-
ac.signal.aborted ||
128-
(err instanceof Error && err.name === "AbortError");
134+
const aborted = err instanceof Error && err.name === "AbortError";
129135
if (aborted) {
130136
throw new Error(
131-
`OpenAI API request timed out after ${this.timeoutMs}ms — set OPENAI_TIMEOUT_MS to raise the bound or check the provider status.`,
137+
`OpenAI API request timed out after ${this.timeoutMs}ms — set OPENAI_TIMEOUT_MS (or AGENTMEMORY_LLM_TIMEOUT_MS) to raise the bound or check the provider status.`,
132138
);
133139
}
134140
throw err;
135-
} finally {
136-
clearTimeout(t);
137141
}
138142

139143
if (!response.ok) {
@@ -160,10 +164,33 @@ export class OpenAIProvider implements MemoryProvider {
160164
}
161165
}
162166

163-
function parseTimeout(raw: string | null | undefined): number {
164-
if (!raw) return DEFAULT_TIMEOUT_MS;
165-
const n = parseInt(raw, 10);
166-
return Number.isFinite(n) && n > 0 ? n : DEFAULT_TIMEOUT_MS;
167+
// Resolves the outbound-fetch timeout for the OpenAI LLM path.
168+
// Precedence (preserving v0.9.17 behaviour):
169+
// 1. OPENAI_TIMEOUT_MS — OpenAI-scoped alias (back-compat)
170+
// 2. AGENTMEMORY_LLM_TIMEOUT_MS — global LLM/embedding timeout (#446)
171+
// 3. 60 000 ms default
172+
function resolveTimeout(): number {
173+
const openaiRaw = getEnvVar("OPENAI_TIMEOUT_MS");
174+
const openai = parsePositiveInt(openaiRaw);
175+
if (openai !== undefined) return openai;
176+
177+
const globalRaw = getEnvVar("AGENTMEMORY_LLM_TIMEOUT_MS");
178+
const globalMs = parsePositiveInt(globalRaw);
179+
if (globalMs !== undefined) return globalMs;
180+
181+
return DEFAULT_TIMEOUT_MS;
182+
}
183+
184+
function parsePositiveInt(raw: string | null | undefined): number | undefined {
185+
if (!raw) return undefined;
186+
const trimmed = raw.trim();
187+
// Reject malformed values like "30ms" or "1_000" — parseInt would
188+
// silently return 30 / 1, swallowing user typos as valid timeouts.
189+
// The regex enforces pure digits (no sign, no trailing units, no
190+
// separators) before we hand off to Number.
191+
if (!/^\d+$/.test(trimmed)) return undefined;
192+
const n = Number(trimmed);
193+
return Number.isFinite(n) && n > 0 ? n : undefined;
167194
}
168195

169196
function detectAzure(baseUrl: string): boolean {

test/fetch-timeout.test.ts

Lines changed: 81 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@ import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
22
import { fetchWithTimeout } from "../src/providers/_fetch.js";
33
import { MinimaxProvider } from "../src/providers/minimax.js";
44
import { OpenRouterProvider } from "../src/providers/openrouter.js";
5+
import { OpenAIProvider } from "../src/providers/openai.js";
56
import { GeminiEmbeddingProvider } from "../src/providers/embedding/gemini.js";
67
import { OpenAIEmbeddingProvider } from "../src/providers/embedding/openai.js";
78
import { CohereEmbeddingProvider } from "../src/providers/embedding/cohere.js";
@@ -195,3 +196,83 @@ describe("Provider hang regression — OpenRouterEmbeddingProvider", () => {
195196
await expect(provider.embedBatch(["hello"])).rejects.toThrow();
196197
});
197198
});
199+
200+
// ─────────────────────────────────────────────────────────────
201+
// #446 — OpenAI LLM provider env-var precedence
202+
//
203+
// v0.9.17 shipped OPENAI_TIMEOUT_MS (OpenAI-scoped). PR #379 then
204+
// shipped AGENTMEMORY_LLM_TIMEOUT_MS (shared). The provider now
205+
// honours both: OPENAI_TIMEOUT_MS wins for back-compat, with
206+
// AGENTMEMORY_LLM_TIMEOUT_MS as the global fall-back.
207+
// ─────────────────────────────────────────────────────────────
208+
describe("OpenAIProvider timeout env precedence (#446)", () => {
209+
beforeEach(() => {
210+
delete process.env["OPENAI_TIMEOUT_MS"];
211+
delete process.env["AGENTMEMORY_LLM_TIMEOUT_MS"];
212+
vi.spyOn(globalThis, "fetch").mockImplementation(hangingFetch as typeof fetch);
213+
});
214+
afterEach(() => {
215+
vi.restoreAllMocks();
216+
delete process.env["OPENAI_TIMEOUT_MS"];
217+
delete process.env["AGENTMEMORY_LLM_TIMEOUT_MS"];
218+
});
219+
220+
it("OPENAI_TIMEOUT_MS alone aborts the OpenAI LLM call", async () => {
221+
process.env["OPENAI_TIMEOUT_MS"] = "30";
222+
const provider = new OpenAIProvider("test-key", "gpt-4o-mini", 1024);
223+
await expect(provider.compress("system", "user")).rejects.toThrow(
224+
/timed out after 30ms/,
225+
);
226+
});
227+
228+
it("AGENTMEMORY_LLM_TIMEOUT_MS alone aborts the OpenAI LLM call", async () => {
229+
process.env["AGENTMEMORY_LLM_TIMEOUT_MS"] = "30";
230+
const provider = new OpenAIProvider("test-key", "gpt-4o-mini", 1024);
231+
await expect(provider.compress("system", "user")).rejects.toThrow(
232+
/timed out after 30ms/,
233+
);
234+
});
235+
236+
it("OPENAI_TIMEOUT_MS wins when both are set (back-compat)", async () => {
237+
process.env["OPENAI_TIMEOUT_MS"] = "30";
238+
// Set the global to a much larger value — if precedence is wrong,
239+
// we'd time out at 5000ms and the test would hang past the 5s
240+
// vitest default. We assert the message ms to lock the precedence.
241+
process.env["AGENTMEMORY_LLM_TIMEOUT_MS"] = "5000";
242+
const provider = new OpenAIProvider("test-key", "gpt-4o-mini", 1024);
243+
await expect(provider.compress("system", "user")).rejects.toThrow(
244+
/timed out after 30ms/,
245+
);
246+
});
247+
248+
it("falls back to the 60 000 ms default when neither is set", () => {
249+
// We don't actually wait 60s — the provider stores timeoutMs at
250+
// construction. Construct, then assert the bound via the error
251+
// message after the hang aborts at a tiny pre-set value.
252+
const provider = new OpenAIProvider("test-key", "gpt-4o-mini", 1024);
253+
// Access the resolved timeout via the constructed field name. The
254+
// class keeps `timeoutMs` private; reaching in via the index
255+
// access keeps the test on the public observed behaviour: the ms
256+
// value reported in the timeout error message must be 60000.
257+
const ms = (provider as unknown as { timeoutMs: number }).timeoutMs;
258+
expect(ms).toBe(60_000);
259+
});
260+
261+
it("rejects malformed env values like '30ms' or '1_000' (CodeRabbit catch)", () => {
262+
// parseInt would have silently returned 30 / 1 for these typos —
263+
// strict parse now rejects them and the provider falls back to
264+
// the 60 000 ms default so a malformed env doesn't masquerade as
265+
// an aggressive bound.
266+
// Whitespace-only padding (" 30 ") is legitimate env-var handling — we
267+
// trim before validating. The cases below are real typos parseInt would
268+
// silently swallow.
269+
for (const bad of ["30ms", "1_000", "60s", "30abc", "-30", "0"]) {
270+
process.env["OPENAI_TIMEOUT_MS"] = bad;
271+
const provider = new OpenAIProvider("test-key", "gpt-4o-mini", 1024);
272+
const ms = (provider as unknown as { timeoutMs: number }).timeoutMs;
273+
expect(ms).toBe(60_000);
274+
delete process.env["OPENAI_TIMEOUT_MS"];
275+
}
276+
});
277+
});
278+

0 commit comments

Comments
 (0)