Commit 06fd2bc

whoabuddy and claude authored
feat(openrouter): add response validation and harden error handling (#81)
* feat(openrouter): add runtime response validation at service boundary

  Add three TypeScript assertion functions that validate OpenRouter API
  responses before they propagate to consumers, replacing silent type casts
  with explicit runtime checks:

  - validateModelsResponse(): ensures .data is an array with .id (string)
    and .pricing.prompt/.completion (strings) on each model
  - validateChatResponse(): ensures .choices is an array and .usage (if
    present) has numeric token fields
  - validateStreamChunk(): ensures .usage (if present) has numeric
    prompt_tokens, completion_tokens, total_tokens fields

  All three validators throw OpenRouterError(message, 502) on failure so
  callers get a clear upstream error rather than undefined access issues.
  Validators are applied immediately after each response.json() cast in
  getModels(), createChatCompletion(), and createUsageCapturingStream().
  Stream chunk validation failures are caught and logged as warnings to
  avoid breaking active streams.

  OpenRouterError class moved above validators to resolve declaration
  ordering (validators reference the class in their throw statements).

  Co-Authored-By: Claude <noreply@anthropic.com>

* fix(openrouter): guard response.data and model.pricing in list-models endpoint

  Add belt-and-suspenders Array.isArray check on response.data and per-model
  pricing type guards inside the .map() callback. Invalid models are
  filtered out (partial data) instead of crashing the endpoint, making the
  listing resilient to any future validator contract changes.

  Co-Authored-By: Claude <noreply@anthropic.com>

* fix(openrouter): validate non-empty choices before returning chat completion

  Guard response.choices with Array.isArray and length > 0 check after the
  service call. An empty choices array from OpenRouter now returns a
  structured 502 error instead of passing through a useless response. Also
  logs a warning when the first choice has unexpectedly empty content.

  Co-Authored-By: Claude <noreply@anthropic.com>

* fix(model-cache): add per-model pricing guards in doRefresh cache population

  Guard modelsResponse.data with Array.isArray before iterating, and add
  explicit type checks on each model.pricing before parseFloat(). A single
  malformed model now skips with a debug log rather than aborting the entire
  cache refresh. This makes the cache resilient to future validator contract
  changes.

  Co-Authored-By: Claude <noreply@anthropic.com>

* feat(model-cache): add getCacheStatus, getSimilarModels, and degraded flag

  Add getCacheStatus() to expose observable cache state (warm/cold/degraded)
  so callers and operators can distinguish between a healthy cache, a
  never-populated cold start, and a failed-fetch degraded state.

  Add getSimilarModels() for use in invalid-model error responses — returns
  up to N model IDs from the registry that share a provider prefix with the
  requested model, so agents get actionable correction hints.

  Update ModelLookupResult to carry an optional degraded flag when the
  registry is empty after a refresh attempt, allowing consumers to log and
  surface the degraded state rather than silently allowing the request.

  Co-Authored-By: Claude <noreply@anthropic.com>

* feat(chat): add post-payment model validation with similar model suggestions

  The x402 middleware rejects invalid models pre-payment, but when the cache
  is cold/degraded at middleware time the check is skipped. Adding a second
  lookupModel() call in the chat handler catches this case once the cache
  has been populated, returning a 400 with code: "invalid_model", the
  rejected model ID, and up to 3 provider-prefix-matched suggestions from
  the live registry so clients get actionable correction hints.

  Co-Authored-By: Claude <noreply@anthropic.com>

* fix(x402): include model ID in invalid_model rejection and log degraded state

  The pre-payment model rejection now includes the model field alongside the
  error message and code so clients and logs always have the rejected model
  ID without having to reconstruct it from the request body. When
  lookupModel returns valid:true/degraded:true the middleware now logs a
  warning instead of silently passing through, giving operators visibility
  when the model registry cache was unavailable at validation time.

  Co-Authored-By: Claude <noreply@anthropic.com>

* test(openrouter): add unit tests for response validation helpers

  Adds tests/openrouter-validation.unit.test.ts covering all three validator
  functions (validateModelsResponse, validateChatResponse,
  validateStreamChunk). Tests verify both happy paths and every error
  branch, ensuring the service-boundary validators reject malformed
  OpenRouter responses with OpenRouterError status 502.

  Closes #80

  Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(openrouter): simplify validation code and remove redundant guards

  Extract shared usage validation helper, remove belt-and-suspenders guards
  that duplicate Phase 1 validator guarantees, and reduce test boilerplate
  with a shared assertion helper. Net -138 lines, same coverage.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review feedback from Copilot and Arc

  - Add null/non-object element guard in validateModelsResponse
  - Rewrite expectOpenRouterError to invoke fn once (not twice)
  - Add null/non-object element test cases for models validator
  - Remove redundant per-model pricing guard in doRefresh (strict validator
    handles it)
  - Add clarifying comment on getSimilarModels fallback behavior
  - Add unit tests for getCacheStatus and getSimilarModels via test helpers
  - Export _seedCacheForTesting/_resetCacheForTesting for test isolation

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
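The commit message describes three assertion-style validators that fail with OpenRouterError(message, 502). Below is a minimal standalone sketch of the validateModelsResponse pattern; the class shape and exact field checks are modeled on the message above, not copied from the actual implementation, and may differ in detail.

```typescript
// Sketch of the service-boundary validation pattern described in the commit
// message. OpenRouterError and the field checks are assumptions based on that
// description, not the real source.
class OpenRouterError extends Error {
  constructor(message: string, public status: number) {
    super(message);
    this.name = "OpenRouterError";
  }
}

interface ModelsResponse {
  data: Array<{ id: string; pricing: { prompt: string; completion: string } }>;
}

// Assertion function: narrows `body` to ModelsResponse or throws a 502 error
// so callers see a clear upstream failure instead of undefined access.
function validateModelsResponse(body: unknown): asserts body is ModelsResponse {
  const b = body as { data?: unknown } | null;
  if (!Array.isArray(b?.data)) {
    throw new OpenRouterError("OpenRouter /models response missing data array", 502);
  }
  for (const entry of b.data as unknown[]) {
    // Null/non-object element guard (added in the PR review feedback commit).
    if (entry === null || typeof entry !== "object") {
      throw new OpenRouterError("OpenRouter /models entry is not an object", 502);
    }
    const model = entry as Record<string, unknown>;
    if (typeof model.id !== "string") {
      throw new OpenRouterError("OpenRouter /models entry missing string id", 502);
    }
    const pricing = model.pricing as { prompt?: unknown; completion?: unknown } | undefined;
    if (typeof pricing?.prompt !== "string" || typeof pricing?.completion !== "string") {
      throw new OpenRouterError("OpenRouter /models entry missing string pricing fields", 502);
    }
  }
}
```

Applied immediately after the `response.json()` cast, an assertion function like this lets downstream code use `response.data` without per-call-site guards.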
1 parent a984d93 · commit 06fd2bc

7 files changed

Lines changed: 741 additions & 26 deletions

src/endpoints/inference/openrouter/chat.ts

Lines changed: 39 additions & 0 deletions
@@ -8,6 +8,7 @@
 import { BaseEndpoint } from "../../base";
 import { OpenRouterClient, OpenRouterError } from "../../../services/openrouter";
 import { logPnL } from "../../../services/pricing";
+import { lookupModel, getSimilarModels } from "../../../services/model-cache";
 import { tokenTypeParam } from "../../schema";
 import type { AppContext, ChatCompletionRequest, UsageRecord } from "../../../types";

@@ -128,6 +129,27 @@ export class OpenRouterChat extends BaseEndpoint {
       return this.errorResponse(c, "model and messages are required", 400);
     }

+    // Belt-and-suspenders model validation: middleware handles the primary rejection
+    // pre-payment, but validate again here in case the cache was degraded at that time
+    // and has since been populated, or in case the middleware was bypassed.
+    const modelResult = await lookupModel(request.model, c.env.OPENROUTER_API_KEY, log);
+    if (modelResult.valid && modelResult.degraded) {
+      log.warn("Model cache degraded at chat handler — cannot confirm model validity", {
+        model: request.model,
+      });
+    } else if (!modelResult.valid) {
+      const suggestions = getSimilarModels(request.model, 3);
+      return c.json(
+        {
+          error: modelResult.error,
+          code: "invalid_model",
+          model: request.model,
+          ...(suggestions.length > 0 ? { suggestions } : {}),
+        },
+        400
+      );
+    }
+
     const client = new OpenRouterClient(c.env.OPENROUTER_API_KEY, log);

     try {

@@ -189,6 +211,23 @@ export class OpenRouterChat extends BaseEndpoint {
       const { response, usage } = await client.createChatCompletion(request);
       const durationMs = Date.now() - startTime;

+      // validateChatResponse guarantees .choices is an array, but it may be empty.
+      if (response.choices.length === 0) {
+        log.error("OpenRouter returned no choices", {
+          model: response.model || request.model,
+        });
+        return this.errorResponse(c, "OpenRouter returned no choices", 502);
+      }
+
+      // Log a warning if the first choice has empty content (valid but unexpected).
+      const firstChoice = response.choices[0];
+      if (firstChoice?.message?.content === "") {
+        log.warn("OpenRouter returned empty content in first choice", {
+          model: response.model || request.model,
+          finishReason: firstChoice.finish_reason,
+        });
+      }
+
       // Log PnL
       if (x402.priceEstimate) {
         logPnL(
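The handler's invalid-model payload conditionally spreads `suggestions` so the key is absent when there are none. A small sketch of that payload shape, using a hypothetical helper name (buildInvalidModelBody does not appear in the diff):

```typescript
// Illustrative helper mirroring the error payload built in the chat handler
// above: `suggestions` is included only when non-empty, via conditional spread.
interface InvalidModelBody {
  error: string;
  code: "invalid_model";
  model: string;
  suggestions?: string[];
}

function buildInvalidModelBody(
  error: string,
  model: string,
  suggestions: string[]
): InvalidModelBody {
  return {
    error,
    code: "invalid_model",
    model,
    // Conditional spread: omit the key entirely rather than send an empty array.
    ...(suggestions.length > 0 ? { suggestions } : {}),
  };
}
```

Omitting the key (instead of sending `suggestions: []`) keeps the contract simple for clients: if the field is present, it always has actionable hints.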

src/endpoints/inference/openrouter/list-models.ts

Lines changed: 2 additions & 0 deletions
@@ -62,6 +62,8 @@ export class OpenRouterListModels extends FreeEndpoint {
     const client = new OpenRouterClient(c.env.OPENROUTER_API_KEY, log);

     try {
+      // getModels() runs validateModelsResponse() which guarantees .data is an
+      // array and every model has .id (string) and .pricing with string fields.
       const response = await client.getModels();

       const models = response.data.map((model) => ({

src/middleware/x402.ts

Lines changed: 10 additions & 2 deletions
@@ -248,8 +248,16 @@ export function x402Middleware(
     // Pre-payment model validation: reject unknown models before issuing 402
     if (c.env.OPENROUTER_API_KEY) {
       const modelResult = await lookupModel(chatRequest.model, c.env.OPENROUTER_API_KEY, log);
-      if (!modelResult.valid) {
-        return c.json({ error: modelResult.error, code: "invalid_model" }, 400);
+      if (modelResult.valid && modelResult.degraded) {
+        // Cache was empty after refresh attempt — allow the request but warn operators
+        log.warn("Model cache degraded at middleware — skipping pre-payment model validation", {
+          model: chatRequest.model,
+        });
+      } else if (!modelResult.valid) {
+        return c.json(
+          { error: modelResult.error, code: "invalid_model", model: chatRequest.model },
+          400
+        );
       }
       // Use live registry pricing if available, otherwise fall through to hardcoded table
       priceEstimate = estimateChatPayment(chatRequest, tokenType, log, modelResult.pricing);

src/services/model-cache.ts

Lines changed: 124 additions & 4 deletions
@@ -26,9 +26,24 @@ const FETCH_TIMEOUT_MS = 3_000;

 /** Discriminated union result from lookupModel */
 export type ModelLookupResult =
-  | { valid: true; pricing?: ModelPricing }
+  | { valid: true; pricing?: ModelPricing; degraded?: true }
   | { valid: false; error: string };

+/** Cache state reported by getCacheStatus() */
+export type CacheState = "warm" | "cold" | "degraded";
+
+/** Cache status returned by getCacheStatus() */
+export interface CacheStatus {
+  /** warm = populated and fresh; cold = never fetched or empty; degraded = last fetch failed */
+  state: CacheState;
+  /** Number of models currently in the registry */
+  modelCount: number;
+  /** Timestamp (ms since epoch) of the last successful fetch, or null if never fetched */
+  lastRefreshed: number | null;
+  /** Timestamp (ms since epoch) of the last failed fetch attempt, or null if no failures */
+  lastFailedAt: number | null;
+}
+
 // =============================================================================
 // Module-level Cache
 // =============================================================================

@@ -91,6 +106,8 @@ async function doRefresh(apiKey: string, logger: Logger): Promise<void> {

   try {
     const client = new OpenRouterClient(apiKey, logger);
+    // getModels() runs validateModelsResponse() which guarantees .data is an
+    // array and every model has .id (string) and .pricing with string fields.
     const modelsResponse = await client.getModels(controller.signal);

     modelRegistry.clear();

@@ -127,6 +144,78 @@
 // Public API
 // =============================================================================

+/**
+ * Returns the current cache state without triggering a refresh.
+ *
+ * States:
+ *   "warm"     — registry populated and TTL not expired
+ *   "cold"     — never fetched successfully (fetchedAt is null) or registry is empty with no prior failure
+ *   "degraded" — last fetch attempt failed and registry may be empty or stale
+ */
+export function getCacheStatus(): CacheStatus {
+  const modelCount = modelRegistry.size;
+
+  let state: CacheState;
+
+  if (lastFailedAt !== null && (modelCount === 0 || (fetchedAt !== null && Date.now() - fetchedAt > CACHE_TTL_MS))) {
+    state = "degraded";
+  } else if (fetchedAt !== null && modelCount > 0 && Date.now() - fetchedAt <= CACHE_TTL_MS) {
+    state = "warm";
+  } else {
+    state = "cold";
+  }
+
+  return {
+    state,
+    modelCount,
+    lastRefreshed: fetchedAt,
+    lastFailedAt,
+  };
+}
+
+/**
+ * Find model IDs in the registry that are similar to the given model ID.
+ *
+ * Similarity strategy:
+ *   1. If modelId contains "/", try to match other models with the same provider prefix.
+ *   2. If no prefix matches found, fall back to lexicographic prefix match on the full ID.
+ *   3. Returns at most maxResults results.
+ */
+export function getSimilarModels(modelId: string, maxResults = 3): string[] {
+  if (modelRegistry.size === 0) {
+    return [];
+  }
+
+  const allModels = Array.from(modelRegistry.keys()).sort();
+
+  // Try provider prefix match (e.g., "openai/" from "openai/gpt-4o")
+  const slashIdx = modelId.indexOf("/");
+  if (slashIdx !== -1) {
+    const providerPrefix = modelId.slice(0, slashIdx + 1);
+    const providerMatches = allModels.filter(
+      (id) => id.startsWith(providerPrefix) && id !== modelId
+    );
+    if (providerMatches.length > 0) {
+      return providerMatches.slice(0, maxResults);
+    }
+  }
+
+  // Fall back: full-string prefix match (e.g., "gpt" matches "gpt-4")
+  const prefixLen = Math.max(3, Math.floor(modelId.length / 2));
+  const prefix = modelId.slice(0, prefixLen).toLowerCase();
+  const prefixMatches = allModels.filter(
+    (id) => id.toLowerCase().startsWith(prefix) && id !== modelId
+  );
+  if (prefixMatches.length > 0) {
+    return prefixMatches.slice(0, maxResults);
+  }
+
+  // No structural match — return first maxResults models as fallback hints.
+  // These may be unrelated; callers should treat them as "here are some valid models"
+  // rather than "here are close matches."
+  return allModels.filter((id) => id !== modelId).slice(0, maxResults);
+}
+
 /**
  * Look up a model by ID, refreshing the cache if stale.
  *

@@ -147,10 +236,10 @@ export async function lookupModel(
     await refreshCache(apiKey, logger);
   }

-  // If the cache is still empty (e.g., fetch failed), be permissive
+  // If the cache is still empty (e.g., fetch failed), be permissive but signal degraded state
   if (modelRegistry.size === 0) {
-    logger.debug("Model cache empty after refresh attempt -- allowing request", { modelId });
-    return { valid: true };
+    logger.debug("Model cache empty after refresh attempt -- allowing request (degraded)", { modelId });
+    return { valid: true, degraded: true };
   }

   const cached = modelRegistry.get(modelId);

@@ -164,3 +253,34 @@

   return { valid: true, pricing: cached };
 }
+
+// =============================================================================
+// Test Helpers (not part of public API)
+// =============================================================================
+
+/**
+ * Seed the model registry with test data and reset internal state.
+ * Exported for unit tests only — not intended for production use.
+ */
+export function _seedCacheForTesting(
+  models: Array<{ id: string; pricing: ModelPricing }>,
+  options?: { simulateFailure?: boolean }
+): void {
+  modelRegistry.clear();
+  for (const m of models) {
+    modelRegistry.set(m.id, m.pricing);
+  }
+  fetchedAt = models.length > 0 ? Date.now() : null;
+  lastFailedAt = options?.simulateFailure ? Date.now() : null;
+}
+
+/**
+ * Reset the cache to its initial empty state.
+ * Exported for unit tests only — not intended for production use.
+ */
+export function _resetCacheForTesting(): void {
+  modelRegistry.clear();
+  fetchedAt = null;
+  lastFailedAt = null;
+  inflightRefresh = null;
+}
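The state decision in getCacheStatus() can be restated as a pure function over (modelCount, fetchedAt, lastFailedAt). A standalone sketch, with an assumed CACHE_TTL_MS value (the real constant is defined elsewhere in model-cache.ts and may differ):

```typescript
// Standalone restatement of the getCacheStatus() state decision shown in the
// diff above. CACHE_TTL_MS is an assumed value for illustration only.
const CACHE_TTL_MS = 5 * 60_000;

type CacheState = "warm" | "cold" | "degraded";

function cacheState(
  modelCount: number,
  fetchedAt: number | null,
  lastFailedAt: number | null,
  now: number
): CacheState {
  // degraded: a fetch has failed, and the registry is empty or the last
  // successful fetch is older than the TTL.
  if (
    lastFailedAt !== null &&
    (modelCount === 0 || (fetchedAt !== null && now - fetchedAt > CACHE_TTL_MS))
  ) {
    return "degraded";
  }
  // warm: populated and the last successful fetch is within the TTL.
  if (fetchedAt !== null && modelCount > 0 && now - fetchedAt <= CACHE_TTL_MS) {
    return "warm";
  }
  // cold: never fetched successfully, or empty with no recorded failure.
  return "cold";
}
```

Note the precedence: a recorded failure combined with an empty or stale registry wins over everything else, so operators see "degraded" rather than "cold" after a failed refresh.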
