feat(api-proxy): add middle-power model fallback with stale-cache recovery#3607

Merged
lpcox merged 2 commits into main from copilot/feat-api-proxy-middle-power-fallback
May 22, 2026

Contributor

Copilot AI commented May 22, 2026

Model resolution in api-proxy could fail when an alias matched nothing, the requested model was unavailable, or the model cache was stale, causing upstream 4xx loops and repeated retries. This change adds a default safety-net fallback that deterministically selects an available “middle-power” model and makes fallback behavior configurable via AWF config.

  • Model resolution fallback policy

    • Adds middle-power median selection when normal resolution paths are exhausted.
    • Introduces provider-aware capability tier sorting:
      • Anthropic: opus > sonnet > haiku
      • OpenAI/Copilot: gpt-5.x > gpt-4.x > gpt-3.5
      • Unknown families/providers: lexicographic median
    • Supports per-alias override with extended alias shape:
      • legacy: "alias": ["provider/pattern"]
      • extended: "alias": { "patterns": [...], "fallback": false }
  • Stale cache / availability recovery

    • On unresolved selection (or fallback activation), the proxy now performs an on-demand provider model refresh before the final rewrite decision.
    • Body transforms were updated to support async execution so refresh can occur inline.
    • Ensures final selection prefers currently available provider models instead of stale cache entries.
  • Structured fallback observability

    • Adds explicit structured events:
      • model_fallback_activated (warn)
      • model_fallback_candidates (debug)
      • model_fallback_skipped (info)
    • Events include provider, original/fallback model, reason, candidate list metadata, and selection method.
  • Config + reflection wiring

    • Adds AWF config support for:
      • apiProxy.modelFallback.enabled (default true)
      • apiProxy.modelFallback.strategy (middle_power)
    • Wires config into api-proxy via AWF_MODEL_FALLBACK.
    • Exposes effective fallback config in /reflect as model_fallback.
    • Updates schema/type/mapping surfaces accordingly.
  • Illustrative behavior

{
  "apiProxy": {
    "models": {
      "sonnet": { "patterns": ["copilot/*sonnet*"], "fallback": true }
    },
    "modelFallback": {
      "enabled": true,
      "strategy": "middle_power"
    }
  }
}

When sonnet cannot be resolved from current alias candidates, api-proxy now selects a provider-available median-tier model and emits model_fallback_activated with candidate/tier context.
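
The tier ordering and median selection described above can be sketched as follows. This is a minimal illustration, not the code merged in model-resolver.js or model-discovery.js: the names `TIERS`, `tierRank`, and `pickMiddlePower`, the regex patterns, the choice to rank only recognized models when any exist, and the use of the lower median for even-sized pools are all assumptions made for the example. The model IDs in the usage note are illustrative as well.

```javascript
// Hypothetical tier tables, ordered strongest first. The real ranking lives in
// containers/api-proxy/model-discovery.js and may use different patterns.
const GPT_TIERS = [/gpt-5/, /gpt-4/, /gpt-3\.5/];
const TIERS = {
  anthropic: [/opus/, /sonnet/, /haiku/],
  openai: GPT_TIERS,
  copilot: GPT_TIERS,
};

// Returns the tier index (0 = strongest) or -1 when the family is unknown.
function tierRank(provider, model) {
  const tiers = TIERS[provider] || [];
  return tiers.findIndex((re) => re.test(model));
}

// Plain code-unit comparison keeps the order independent of locale settings.
function byName(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

function pickMiddlePower(provider, available) {
  if (!Array.isArray(available) || available.length === 0) return null;
  const known = available.filter((m) => tierRank(provider, m) !== -1);
  // Assumption: if any model has a recognized tier, only those compete;
  // otherwise every model is ordered lexicographically.
  const pool = known.length > 0 ? known : [...available];
  pool.sort((a, b) => {
    const diff = Math.max(tierRank(provider, a), 0) - Math.max(tierRank(provider, b), 0);
    return diff !== 0 ? diff : byName(a, b);
  });
  // Assumption: lower median for even-sized pools.
  return pool[Math.floor((pool.length - 1) / 2)];
}
```

With these assumptions, `pickMiddlePower('anthropic', ['claude-haiku', 'claude-opus', 'claude-sonnet'])` yields `'claude-sonnet'`, and a provider with no tier table falls back to the lexicographic median of whatever is available.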

Copilot AI changed the title from "[WIP] Implement middle-power model fallback for api-proxy" to "feat(api-proxy): add middle-power model fallback with stale-cache recovery" May 22, 2026
Copilot AI requested a review from lpcox May 22, 2026 19:58
Copilot finished work on behalf of lpcox May 22, 2026 19:58
@lpcox lpcox marked this pull request as ready for review May 22, 2026 19:59
Copilot AI review requested due to automatic review settings May 22, 2026 19:59
@github-actions
Contributor

Documentation Preview

Documentation build failed for this PR. View logs.

Built from commit 7100c97

@github-actions
Contributor

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 95.98% | 96.05% | 📈 +0.07% |
| Statements | 95.81% | 95.87% | 📈 +0.06% |
| Functions | 98.02% | 98.02% | ➡️ +0.00% |
| Branches | 89.44% | 89.48% | 📈 +0.04% |

📁 Per-file Coverage Changes (1 files)
| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/config-writer.ts | 83.0% → 85.6% (+2.54%) | 83.0% → 85.6% (+2.54%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions
Contributor

Smoke Test Results

GitHub MCP: ✅ Retrieved PR #3602: "api-proxy: fallback unavailable gpt-5.x requests to highest available family model"

GitHub.com Connectivity: ❌ Pre-step data not provided (placeholder not substituted)

File Write/Read: ❌ File /tmp/smoke-test-file.txt not found

Overall Status: FAIL (2/3 tests could not be verified)

cc @Copilot @lpcox

📰 BREAKING: Report filed by Smoke Copilot

@github-actions
Contributor

Smoke Test Results

GitHub API: 2 PR entries verified
GitHub check: playwright_check PASS
File verify: smoke-test-claude-26309084989.txt exists

Status: PASS

💥 [THE END] — Illustrated by Smoke Claude

@github-actions
Contributor

📡 OTel Tracing Validation Results

All scenarios passed

Scenario Status Details
Module Loading otel.js loaded successfully with 11 exports including startRequestSpan, setTokenAttributes, endSpan, isEnabled
Test Suite All 32 tests passed (6 test suites covering initialization, span creation, token attributes, OTLP serialization, exporters)
Env Var Forwarding api-proxy-service.ts correctly forwards OTEL_EXPORTER_OTLP_ENDPOINT and GITHUB_AW_OTEL_TRACE_ID to api-proxy container
Token Tracker Integration onUsage callback hook point exists in token-tracker-http.js for OTEL span decoration
OTEL Diagnostics ℹ️ No live spans exported (expected - no actual API calls were made in this smoke test run)

Implementation Summary

  • Span creation: CLIENT spans with GenAI semantic conventions (gen_ai.provider.name, gen_ai.operation.name, gen_ai.request.stream)
  • Token attributes: gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, cache tokens as awf.cached_read/awf.cached_write
  • Parent context: Reads GITHUB_AW_OTEL_TRACE_ID and GITHUB_AW_OTEL_PARENT_SPAN_ID to link api-proxy spans to workflow traces
  • Export paths:
    • Network: OTLP/HTTP via Squid proxy when OTEL_EXPORTER_OTLP_ENDPOINT is set
    • Fallback: NDJSON file at /var/log/api-proxy/otel.jsonl
  • Proxy awareness: ProxyAwareOtlpExporter uses HttpsProxyAgent for Squid routing

✅ OTEL tracing integration is production-ready.

📡 OTel tracing validated by Smoke OTel Tracing

@github-actions
Contributor

Smoke Test Results

  • GitHub MCP Testing: ❌ (mcpscripts not found)
  • GitHub.com Connectivity: ❌ (Status 000)
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅

Overall status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini

@github-actions
Contributor

✅ Merged PRs: api-proxy: fallback unavailable gpt-5.x requests to highest available family model; Stabilize test-coverage-reporter by isolating main-action unit tests from DinD probing
❌ SafeInputs GH CLI: tool unavailable; fallback query returned PR title
✅ Playwright: GitHub title verified
❌ Tavily: MCP exposed no callable search tool
✅ File/Bash + discussion + build: passed
Overall status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

@github-actions
Contributor

Chroot Runtime Version Test Results

| Runtime | Host Version | Chroot Version | Match? |
| --- | --- | --- | --- |
| Python | 3.12.13 | 3.12.3 | ❌ NO |
| Node.js | v24.15.0 | v22.22.3 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Overall Result: ❌ Tests FAILED

The chroot environment is not using the same runtime versions as the host. Python and Node.js versions differ between the host runner and the chroot environment, which may indicate the selective bind mounts are not properly exposing the host binaries.

Tested by Smoke Chroot

@github-actions
Contributor

🔥 Smoke Test: Services Connectivity - FAIL

❌ Redis: connection timeout
❌ PostgreSQL (pg_isready): no response
❌ PostgreSQL (SELECT 1): connection timeout

Result: FAIL - All service connectivity checks failed. Services may not be running or host.docker.internal routing is not configured.

🔌 Service connectivity validated by Smoke Services

Contributor

Copilot AI left a comment

Pull request overview

Adds a configurable “middle-power” model fallback policy to the API proxy to avoid repeated upstream 4xx/retry loops when model alias resolution fails or cached availability is stale, and exposes this policy through config + /reflect.

Changes:

  • Introduces apiProxy.modelFallback config (schema + mapping) and passes it to the api-proxy sidecar via AWF_MODEL_FALLBACK.
  • Updates api-proxy model resolution to support a deterministic middle-tier fallback and performs an on-demand models refresh when resolution fails / fallback activates.
  • Makes body transforms async-capable (composition + proxy request path) and adds reflect/test coverage for the new behavior.
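
One way to compose synchronous and asynchronous body transforms is sketched below. It only illustrates the idea of the change to composeBodyTransforms in proxy-utils.js; the signature, the use of strings for bodies, and the convention that a falsy return means "no change" (suggested by the `if (transformed) body = transformed;` line quoted later in this review) are assumptions, and the merged implementation may differ.

```javascript
// Hypothetical sketch: chain transforms that may return a value or a Promise.
// A falsy result from a transform is treated as "leave the body unchanged".
function composeBodyTransforms(...transforms) {
  const active = transforms.filter((t) => typeof t === 'function');
  if (active.length === 0) return null;
  return async (body) => {
    let current = body;
    let changed = false;
    for (const transform of active) {
      // `await` accepts plain values as well as Promises, so sync and
      // async transforms can be mixed freely.
      const out = await transform(current);
      if (out) {
        current = out;
        changed = true;
      }
    }
    // Report null when nothing changed so the caller can keep the original.
    return changed ? current : null;
  };
}
```

Because the composed function is always async, every caller has to await it, which matches the proxy-request.js change listed in the file summary.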
Show a summary per file
| File | Description |
| --- | --- |
| src/types/api-proxy-options.ts | Adds modelFallback option type to wrapper config surface. |
| src/services/api-proxy-service.ts | Wires AWF_MODEL_FALLBACK env var into sidecar container env. |
| src/services/api-proxy-service-rate-limit.test.ts | Adds compose/env assertion test for AWF_MODEL_FALLBACK. |
| src/config-file.ts | Adds config typing + maps apiProxy.modelFallback into CLI options. |
| src/config-file.test.ts | Adds schema validation and mapping tests for modelFallback. |
| src/commands/build-config.ts | Plumbs options.modelFallback into constructed WrapperConfig. |
| src/awf-config-schema.json | Adds JSON schema for apiProxy.modelFallback. |
| docs/awf-config.schema.json | Mirrors schema change for published docs schema. |
| containers/api-proxy/server.network.test.js | Asserts /reflect includes model_fallback. |
| containers/api-proxy/server.models.test.js | Adds tests for stale-cache refresh, fallback logs, and async transform composition. |
| containers/api-proxy/server.js | Implements fallback config parsing, on-demand model refresh, and fallback observability logs. |
| containers/api-proxy/proxy-utils.js | Extends composeBodyTransforms to support async transforms. |
| containers/api-proxy/proxy-request.js | Awaits (possibly async) body transforms when proxying requests. |
| containers/api-proxy/model-resolver.test.js | Adds coverage for extended alias parsing + fallback selection behavior. |
| containers/api-proxy/model-resolver.js | Implements middle-power fallback selection + extended alias support. |
| containers/api-proxy/model-discovery.js | Adds provider-aware tier ranking and tier-sorted model helper. |
| containers/api-proxy/management.js | Exposes effective fallback config via reflect output. |

Copilot's findings

Comments suppressed due to low confidence (1)

containers/api-proxy/model-resolver.js:34

  • parseModelAliases now accepts extended alias entries ({ patterns: string[], fallback?: boolean }), but the JSDoc still claims it returns Record<string, string[]>, and the wrapper config surfaces still only allow apiProxy.models values to be string[] (see src/awf-config-schema.json/docs/awf-config.schema.json apiProxy.models schema and src/config-file.ts type). As a result, the new extended alias shape can’t be expressed via the AWF config file/TS types despite being advertised. Update the schema/types/docs to permit the extended shape (or adjust the implementation/docs to match what’s actually configurable).
/**
 * Parse model aliases configuration from a raw JSON string.
 *
 * @param {string|null|undefined} rawConfig - JSON string from AWF_MODEL_ALIASES env var
 * @returns {{ models: Record<string, string[]> } | null} Parsed config or null if invalid/absent
 */
  • Files reviewed: 17/17 changed files
  • Comments generated: 3
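
To make the two alias shapes from the suppressed comment concrete, the sketch below normalizes both the legacy array form and the extended object form into one structure. The helper names `normalizeAliasEntry` and `normalizeAliases` are hypothetical, and treating an omitted `fallback` as `true` is an assumption based on the PR describing fallback as the default; it is not taken from parseModelAliases itself.

```javascript
// Hypothetical helper: accept either
//   legacy:   ["provider/pattern", ...]
//   extended: { patterns: [...], fallback: false }
// and return { patterns: string[], fallback: boolean }, or null if invalid.
function normalizeAliasEntry(entry) {
  const onlyStrings = (list) => list.filter((p) => typeof p === 'string');
  if (Array.isArray(entry)) {
    return { patterns: onlyStrings(entry), fallback: true };
  }
  if (entry && typeof entry === 'object' && Array.isArray(entry.patterns)) {
    // Assumption: fallback stays enabled unless it is explicitly false.
    return { patterns: onlyStrings(entry.patterns), fallback: entry.fallback !== false };
  }
  return null;
}

// Normalize a whole alias map, silently dropping malformed entries.
function normalizeAliases(models) {
  const out = {};
  for (const [alias, entry] of Object.entries(models || {})) {
    const normalized = normalizeAliasEntry(entry);
    if (normalized) out[alias] = normalized;
  }
  return out;
}
```

A return type along the lines of `Record<string, { patterns: string[], fallback: boolean }>` would describe this normalized form, which is the kind of JSDoc and schema update the comment asks for.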

Comment on lines +313 to 314
const transformed = await bodyTransform(body);
if (transformed) body = transformed;
Comment on lines +94 to +103

function parseModelFallbackConfig(rawConfig) {
  if (!rawConfig) return { ...DEFAULT_MODEL_FALLBACK };
  try {
    const parsed = JSON.parse(rawConfig);
    if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) return { ...DEFAULT_MODEL_FALLBACK };
    const enabled = parsed.enabled === undefined ? true : Boolean(parsed.enabled);
    const strategy = typeof parsed.strategy === 'string' && parsed.strategy.trim()
      ? parsed.strategy.trim()
      : DEFAULT_MODEL_FALLBACK.strategy;
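
The excerpt above is cut off by the diff view. A self-contained version that can be run in isolation might look like the following. The value of `DEFAULT_MODEL_FALLBACK` is inferred from the defaults stated in the PR description (enabled true, strategy middle_power), and the final `return` and the `catch` branch are assumptions about the part of the function that is not shown.

```javascript
// Assumed defaults, taken from the PR description rather than from server.js.
const DEFAULT_MODEL_FALLBACK = Object.freeze({ enabled: true, strategy: 'middle_power' });

function parseModelFallbackConfig(rawConfig) {
  if (!rawConfig) return { ...DEFAULT_MODEL_FALLBACK };
  try {
    const parsed = JSON.parse(rawConfig);
    if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) {
      return { ...DEFAULT_MODEL_FALLBACK };
    }
    const enabled = parsed.enabled === undefined ? true : Boolean(parsed.enabled);
    const strategy = typeof parsed.strategy === 'string' && parsed.strategy.trim()
      ? parsed.strategy.trim()
      : DEFAULT_MODEL_FALLBACK.strategy;
    // Assumed ending: the excerpt stops before the function returns.
    return { enabled, strategy };
  } catch {
    // Assumed: malformed JSON falls back to the defaults.
    return { ...DEFAULT_MODEL_FALLBACK };
  }
}
```

One property of the `Boolean(parsed.enabled)` coercion is worth knowing when setting AWF_MODEL_FALLBACK by hand: any non-empty string is truthy in JavaScript, so `{"enabled": "false"}` parses as enabled, while the JSON boolean `{"enabled": false}` disables the fallback.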
Comment on lines +139 to +144
return async (body) => {
  let result = rewriteModelInBody(body, provider, MODEL_ALIASES.models, cachedModels, MODEL_FALLBACK);
  if (!result || (result.fallback && result.fallback.activated)) {
    await refreshProviderModelsForResolution(provider);
    result = rewriteModelInBody(body, provider, MODEL_ALIASES.models, cachedModels, MODEL_FALLBACK);
  }
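
The refresh-then-retry pattern in this excerpt can be exercised on its own with stubbed dependencies. In the sketch below, `makeRefreshingTransform`, the `rewrite` and `refresh` callbacks, and the `result.body` field are hypothetical stand-ins for rewriteModelInBody and refreshProviderModelsForResolution; the real return shape is not visible in the excerpt.

```javascript
// Hypothetical sketch of "resolve, refresh once on failure, resolve again".
// rewrite(body) returns { body, fallback? } or null when nothing resolves.
// refresh() reloads the provider's model list and may be async.
function makeRefreshingTransform({ rewrite, refresh }) {
  return async (body) => {
    let result = rewrite(body);
    if (!result || (result.fallback && result.fallback.activated)) {
      // The cache may be stale: refresh once, then retry the resolution.
      await refresh();
      result = rewrite(body);
    }
    return result ? result.body : null;
  };
}

// Usage with an in-memory cache standing in for the provider model list.
let cache = ['old-model'];
let refreshCount = 0;
const transform = makeRefreshingTransform({
  // Reads `cache` at call time so the retry sees the refreshed list.
  rewrite: (body) => (cache.includes(JSON.parse(body).model) ? { body } : null),
  refresh: async () => {
    refreshCount += 1;
    cache = ['old-model', 'new-model'];
  },
});
```

Calling `transform` with a body that names `new-model` misses the stale cache, triggers exactly one refresh, and then resolves. In this sketch the `rewrite` callback looks up the cache on every call; a closure that captured the old list by value would not see the refreshed entries on the second attempt.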
@github-actions
Contributor

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color 1/1 passed ✅ PASS
Go env 1/1 passed ✅ PASS
Go uuid 1/1 passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx 1/1 passed ✅ PASS
Node.js execa 1/1 passed ✅ PASS
Node.js p-limit 1/1 passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

All build and test operations completed successfully across all 8 ecosystems (Bun, C++, Deno, .NET, Go, Java, Node.js, Rust).

Generated by Build Test Suite for issue #3607 · ● 14.4M ·

@github-actions
Contributor

Smoke Test: Copilot BYOK Mode ✅

PR #3607: feat(api-proxy): add middle-power model fallback with stale-cache recovery
Author: @Copilot | Assignees: @lpcox, @Copilot

Results

  • ✅ GitHub MCP connectivity (PR fetch successful)
  • ✅ BYOK inference (agent → api-proxy → api.githubcopilot.com)
  • ⚠️ File test: template variables not expanded
  • ⚠️ HTTP test: template variables not expanded

Note: Running in BYOK offline mode (COPILOT_OFFLINE=true) via api-proxy sidecar

Overall: PASS (core BYOK functionality verified)

🔑 BYOK report filed by Smoke Copilot BYOK

@lpcox lpcox merged commit 1d234dd into main May 22, 2026
72 of 75 checks passed
@lpcox lpcox deleted the copilot/feat-api-proxy-middle-power-fallback branch May 22, 2026 21:21

Development

Successfully merging this pull request may close these issues.

feat(api-proxy): middle-power model fallback when selection criteria fail

3 participants