Commit bc71ff6

feat(native): expose bridge decisions and probes
1 parent ebcb16d commit bc71ff6

20 files changed

Lines changed: 1278 additions & 20 deletions

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@ sshpass2.sh
 deploy-us.py
 scripts/*
 !scripts/native-bridge-smoke.mjs
+!scripts/lsp-capacity-matrix.mjs
+!scripts/web-search-direct-probe.mjs
 src/get-token.js
 src/test-cascade.js
 src/runtime-config.json

docs/native-bridge-protocol-notes.md

Lines changed: 32 additions & 0 deletions
@@ -180,6 +180,38 @@ separate first-party API bridge until we find the LS-side web executor
 precondition. The confirmed protobuf fields are useful for tracing and future
 matrix work, but not sufficient for production native bridge rollout.
 
+## Direct Web Search API
+
+`GetWebSearchResults` is confirmed independently of the LS-native tool path:
+
+```text
+POST /exa.api_server_pb.ApiServerService/GetWebSearchResults
+```
+
+Request fields from the descriptor dump:
+
+- `metadata` = field `1`
+- `query` = field `2`
+- `limit` = field `3`
+- `domain` = field `4`
+- `third_party_config` = field `5`
+- `mode` = field `6`
+
+Response fields:
+
+- `results` = repeated field `1` (`KnowledgeBaseItem`)
+- `web_search_url` = field `2`
+- `summary` = field `3`
+
+`src/windsurf-api.js` exposes `getWebSearchResults()` and
+`npm run probe:web-search` exercises it against real accounts. This is the
+preferred WebSearch investigation path for now because it avoids the LS native
+web executor that currently returns `permission_denied`.
+
+There is not yet an equivalent confirmed direct WebFetch/read-url endpoint.
+Do not implement WebFetch direct bridging from guesswork; keep it on emulation
+or native lab traces until a descriptor-backed endpoint is found.
+
 ## Experiment Hooks
 
 `WINDSURFAPI_NATIVE_TOOL_BRIDGE_CONFIG_RAW` can inject exact protobuf bytes
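
Reviewer note: the request field numbers documented in this hunk can be illustrated with a minimal protobuf wire-format sketch. The wire types are assumptions (`query` as a length-delimited string, `limit` as a varint); the descriptor dump quoted in the notes only confirms field numbers. All helper names are hypothetical and this is not the repository's `getWebSearchResults()` implementation. `metadata` and any transport framing are left out because the notes do not describe them.

```javascript
// Hypothetical sketch: encode `query` (field 2) and `limit` (field 3) as protobuf bytes.
// ASSUMPTION: query is a string (wire type 2) and limit is a varint (wire type 0).

// Encode a non-negative 32-bit integer as a protobuf base-128 varint.
function encodeVarint(value) {
  const bytes = [];
  let v = value >>> 0;
  while (v > 0x7f) {
    bytes.push((v & 0x7f) | 0x80); // low 7 bits, continuation bit set
    v >>>= 7;
  }
  bytes.push(v);
  return bytes;
}

// Length-delimited field: tag, byte length, UTF-8 payload.
function encodeStringField(fieldNumber, text) {
  const payload = new TextEncoder().encode(text);
  return [...encodeVarint((fieldNumber << 3) | 2), ...encodeVarint(payload.length), ...payload];
}

// Varint field: tag, value.
function encodeVarintField(fieldNumber, value) {
  return [...encodeVarint((fieldNumber << 3) | 0), ...encodeVarint(value)];
}

// Only the two fields with assumed scalar types are encoded here.
function encodeWebSearchRequest({ query, limit }) {
  return Uint8Array.from([
    ...encodeStringField(2, query),
    ...encodeVarintField(3, limit),
  ]);
}

const bytes = encodeWebSearchRequest({ query: 'hi', limit: 5 });
const hex = Array.from(bytes, b => b.toString(16).padStart(2, '0')).join(' ');
console.log(hex); // 12 02 68 69 18 05
```

If the real descriptor types differ from these assumptions, only the wire-type constants change; the field numbers come from the notes above.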
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+## v2.0.127 - Native bridge observability and probe tooling
+
+- `/health?verbose=1` now exposes sanitized native bridge decision telemetry:
+  total enabled/disabled decisions, reason counters, last decision, and a small
+  recent-decision ring. It records why a request did or did not use the native
+  bridge without storing caller keys, account IDs, or upstream API keys.
+- The authenticated dashboard overview API now includes the same sanitized
+  native bridge telemetry, and the overview UI shows mode, gray gates, decision
+  totals, top disable reasons, and recent mapped/unmapped tool decisions. This
+  makes "why did this request stay on prompt emulation?" visible without
+  reading logs or exposing the server API key to the browser.
+- `npm run smoke:native-bridge` now includes native bridge decision deltas in
+  its JSON output, so Read/Grep/Glob canaries can prove both the routing path
+  and the emitted tool-call path.
+- Added `npm run smoke:lsp-matrix` for real deployment LSP capacity probes. It
+  runs configurable chat concurrency, snapshots `/health?verbose=1`, and reports
+  RSS, LS pool occupancy, memory-guard state, and admission-stat deltas.
+- Added a direct `GetWebSearchResults` helper and `npm run probe:web-search`.
+  The probe uses explicit upstream account keys or persisted `accounts.json`;
+  it intentionally does not treat the gateway `API_KEY` as a Windsurf account
+  key. This is the safe path for WebSearch investigation while LS-native
+  WebSearch/WebFetch remain outside the default native bridge allowlist.
+- Default production behavior is unchanged: the native bridge still requires
+  explicit env gates, and WebSearch/WebFetch are still lab-only until live
+  LS-native result payloads are confirmed.
+
+Verification:
+
+- `node --check src\cascade-native-bridge.js`
+- `node --check src\native-bridge-stats.js`
+- `node --check src\handlers\chat.js`
+- `node --check src\windsurf-api.js`
+- `node --check scripts\native-bridge-smoke.mjs`
+- `node --check scripts\lsp-capacity-matrix.mjs`
+- `node --check scripts\web-search-direct-probe.mjs`
+- `node --test --test-timeout=120000 --test-force-exit test\*.test.js`
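
Reviewer note: the release notes describe sanitized decision telemetry (totals, reason counters, last decision, and a small recent-decision ring). The sketch below shows one way such a recorder could work. It is not the contents of `src/native-bridge-stats.js`, which this page does not show. The counter names mirror the fields read by the smoke script in this commit; `createDecisionStats`, the allowlisted entry fields, and the ring size are assumptions.

```javascript
// Hypothetical sketch of a sanitized decision recorder with a bounded ring.
// ASSUMPTION: ring size and the exact set of copied fields are illustrative only.
function createDecisionStats(ringSize = 8) {
  const stats = {
    decisions: 0,
    enabledDecisions: 0,
    disabledDecisions: 0,
    decisionReasons: {},
    lastDecision: null,
    recentDecisions: [],
  };
  return {
    record(decision) {
      // Copy only allowlisted fields, so keys or account IDs on the input never get stored.
      const entry = {
        at: Date.now(),
        enabled: !!decision.enabled,
        reason: String(decision.reason || 'unknown'),
        mappedTools: [...(decision.mappedTools || [])],
        unmappedTools: [...(decision.unmappedTools || [])],
      };
      stats.decisions += 1;
      if (entry.enabled) stats.enabledDecisions += 1;
      else stats.disabledDecisions += 1;
      stats.decisionReasons[entry.reason] = (stats.decisionReasons[entry.reason] || 0) + 1;
      stats.lastDecision = entry;
      stats.recentDecisions.push(entry);
      // Drop the oldest entry once the ring is full.
      if (stats.recentDecisions.length > ringSize) stats.recentDecisions.shift();
    },
    // Return a deep copy so callers cannot mutate the live counters.
    snapshot() {
      return JSON.parse(JSON.stringify(stats));
    },
  };
}

const recorder = createDecisionStats(2);
recorder.record({ enabled: false, reason: 'env_gate_off', apiKey: 'secret-value' });
recorder.record({ enabled: true, reason: 'allowlisted', mappedTools: ['Read'] });
recorder.record({ enabled: false, reason: 'env_gate_off' });
console.log(JSON.stringify(recorder.snapshot(), null, 2));
```

Allowlisting fields on write, rather than deleting sensitive ones on read, keeps the snapshot safe even if new properties are later attached to the decision object.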

package.json

Lines changed: 4 additions & 2 deletions
@@ -1,14 +1,16 @@
 {
   "name": "windsurf-api",
-  "version": "2.0.126",
+  "version": "2.0.127",
   "description": "Windsurf to OpenAI + Anthropic compatible API proxy. Turns Windsurf's 107 AI models (Claude, GPT, Gemini, DeepSeek, Grok, Qwen, Kimi, GLM, SWE) into dual-protocol API endpoints. Zero npm deps.",
   "type": "module",
   "main": "src/index.js",
   "scripts": {
     "start": "node src/index.js",
     "dev": "node --watch src/index.js",
     "test": "node --test test/*.test.js",
-    "smoke:native-bridge": "node scripts/native-bridge-smoke.mjs"
+    "smoke:native-bridge": "node scripts/native-bridge-smoke.mjs",
+    "smoke:lsp-matrix": "node scripts/lsp-capacity-matrix.mjs",
+    "probe:web-search": "node scripts/web-search-direct-probe.mjs"
   },
   "engines": {
     "node": ">=20.0.0"

scripts/lsp-capacity-matrix.mjs

Lines changed: 229 additions & 0 deletions
@@ -0,0 +1,229 @@
+#!/usr/bin/env node
+
+const baseUrl = (process.env.BASE_URL || process.env.WINDSURFAPI_BASE_URL || 'http://127.0.0.1:3003').replace(/\/+$/, '');
+const apiKey = process.env.API_KEY || process.env.WINDSURFAPI_API_KEY || '';
+const model = process.env.MODEL || process.env.WINDSURFAPI_LSP_MATRIX_MODEL || 'claude-haiku-4.5';
+const concurrencyValues = String(process.env.LSP_MATRIX_CONCURRENCY || '1,2,4')
+  .split(',')
+  .map(s => Number(s.trim()))
+  .filter(n => Number.isFinite(n) && n > 0);
+const rounds = Math.max(1, Number(process.env.LSP_MATRIX_ROUNDS || 1));
+const timeoutMs = Math.max(5_000, Number(process.env.LSP_MATRIX_TIMEOUT_MS || 90_000));
+const stream = process.env.LSP_MATRIX_STREAM === '1';
+const includeHealth = process.env.LSP_MATRIX_HEALTH !== '0';
+const failFast = process.env.LSP_MATRIX_FAIL_FAST === '1';
+const marker = `LSP_MATRIX_${Date.now().toString(36)}`;
+
+if (!apiKey) {
+  console.error('API_KEY is required');
+  process.exit(2);
+}
+if (!concurrencyValues.length) {
+  console.error('LSP_MATRIX_CONCURRENCY must contain at least one positive integer');
+  process.exit(2);
+}
+
+function sleep(ms) {
+  return new Promise(resolve => setTimeout(resolve, ms));
+}
+
+function compactText(text, max = 800) {
+  const s = String(text || '').replace(/\s+/g, ' ').trim();
+  return s.length > max ? `${s.slice(0, max)}...<truncated ${s.length - max} chars>` : s;
+}
+
+function percentile(values, p) {
+  const nums = values.filter(n => Number.isFinite(n)).sort((a, b) => a - b);
+  if (!nums.length) return 0;
+  const idx = Math.min(nums.length - 1, Math.max(0, Math.ceil((p / 100) * nums.length) - 1));
+  return nums[idx];
+}
+
+function summarizePool(health) {
+  const pool = health?.lsPool?.pool || {};
+  const guard = pool.memoryGuard || health?.lsPool?.memoryGuard || {};
+  return {
+    running: !!health?.lsPool?.running,
+    maxInstances: health?.lsPool?.maxInstances ?? pool.maxInstances ?? null,
+    totalRssBytes: health?.lsPool?.totalRssBytes ?? null,
+    size: pool.size ?? null,
+    occupancy: pool.occupancy ?? null,
+    effectiveOccupancy: pool.effectiveOccupancy ?? null,
+    ready: pool.ready ?? null,
+    starting: pool.starting ?? null,
+    pending: pool.pending ?? null,
+    reservedPendingStarts: pool.reservedPendingStarts ?? null,
+    stopping: pool.stopping ?? null,
+    activeRequests: pool.activeRequests ?? null,
+    maintenanceRequests: pool.maintenanceRequests ?? null,
+    nonDefaultInstances: pool.nonDefaultInstances ?? null,
+    canStartNewNonDefault: pool.canStartNewNonDefault ?? null,
+    blockReason: pool.blockReason ?? null,
+    memoryGuard: {
+      enabled: guard.enabled ?? null,
+      availableBytes: guard.availableBytes ?? null,
+      minAvailableBytes: guard.minAvailableBytes ?? null,
+      estimatedRssBytesPerInstance: guard.estimatedRssBytesPerInstance ?? null,
+      okToSpawn: guard.okToSpawn ?? null,
+      source: guard.minAvailableBytesSource ?? null,
+    },
+    admissionStats: health?.lsPool?.admissionStats || null,
+  };
+}
+
+async function fetchHealth(label) {
+  if (!includeHealth) return null;
+  try {
+    const res = await fetch(`${baseUrl}/health?verbose=1`, {
+      headers: { authorization: `Bearer ${apiKey}` },
+    });
+    const text = await res.text();
+    let json;
+    try { json = JSON.parse(text); } catch {
+      return { ok: false, label, status: res.status, error: 'health returned non-JSON', rawPreview: compactText(text) };
+    }
+    return {
+      ok: res.ok,
+      label,
+      status: res.status,
+      version: json.version,
+      commit: json.commit,
+      accounts: json.accounts || null,
+      nativeBridgeConfig: json.nativeBridgeConfig || null,
+      pool: summarizePool(json),
+    };
+  } catch (error) {
+    return { ok: false, label, error: String(error?.message || error) };
+  }
+}
+
+function requestBody(user, index) {
+  return {
+    model,
+    stream,
+    user,
+    max_tokens: 16,
+    messages: [
+      { role: 'user', content: `Reply exactly OK. ${marker} request ${index}.` },
+    ],
+  };
+}
+
+async function runOne(user, index) {
+  const controller = new AbortController();
+  const timer = setTimeout(() => controller.abort(), timeoutMs);
+  const started = Date.now();
+  try {
+    const res = await fetch(`${baseUrl}/v1/chat/completions`, {
+      method: 'POST',
+      signal: controller.signal,
+      headers: {
+        authorization: `Bearer ${apiKey}`,
+        'content-type': 'application/json',
+      },
+      body: JSON.stringify(requestBody(user, index)),
+    });
+    const text = await res.text();
+    const latencyMs = Date.now() - started;
+    const ok = res.status >= 200 && res.status < 300;
+    return {
+      ok,
+      status: res.status,
+      latencyMs,
+      processingMs: Number(res.headers.get('openai-processing-ms') || 0) || null,
+      accountLikeHeaders: {
+        model: res.headers.get('openai-model') || null,
+        requestId: res.headers.get('x-request-id') || null,
+      },
+      preview: ok ? compactText(text, 240) : compactText(text, 800),
+    };
+  } catch (error) {
+    return {
+      ok: false,
+      status: 0,
+      latencyMs: Date.now() - started,
+      error: error?.name === 'AbortError' ? `timeout after ${timeoutMs}ms` : String(error?.message || error),
+    };
+  } finally {
+    clearTimeout(timer);
+  }
+}
+
+function admissionDelta(before, after) {
+  const b = before?.pool?.admissionStats || {};
+  const a = after?.pool?.admissionStats || {};
+  const keys = ['startAttempts', 'startSuccesses', 'startFailures', 'poolWaits', 'memoryWaits', 'poolExhausted', 'memoryGuardBlocks', 'evictions'];
+  const out = {};
+  for (const key of keys) out[key] = Number(a[key] || 0) - Number(b[key] || 0);
+  return out;
+}
+
+function summarizeBatch(concurrency, round, before, after, requests) {
+  const latencies = requests.map(r => r.latencyMs);
+  const ok = requests.filter(r => r.ok).length;
+  const statuses = {};
+  for (const r of requests) statuses[String(r.status)] = (statuses[String(r.status)] || 0) + 1;
+  const beforePool = before?.pool || {};
+  const afterPool = after?.pool || {};
+  return {
+    round,
+    concurrency,
+    ok: ok === requests.length,
+    success: ok,
+    failed: requests.length - ok,
+    statuses,
+    latencyMs: {
+      min: Math.min(...latencies),
+      p50: percentile(latencies, 50),
+      p95: percentile(latencies, 95),
+      max: Math.max(...latencies),
+    },
+    rssDeltaBytes: Number(afterPool.totalRssBytes || 0) - Number(beforePool.totalRssBytes || 0),
+    poolBefore: beforePool,
+    poolAfter: afterPool,
+    admissionDelta: admissionDelta(before, after),
+    failures: requests.filter(r => !r.ok).slice(0, 10),
+  };
+}
+
+const matrix = [];
+const failures = [];
+const overallBefore = await fetchHealth('overall-before');
+
+for (const concurrency of concurrencyValues) {
+  for (let round = 1; round <= rounds; round++) {
+    const before = await fetchHealth(`c${concurrency}-r${round}-before`);
+    const tasks = [];
+    for (let i = 0; i < concurrency; i++) {
+      const user = `${marker}-c${concurrency}-r${round}-u${i}`;
+      tasks.push(runOne(user, `${concurrency}.${round}.${i}`));
+    }
+    const requests = await Promise.all(tasks);
+    await sleep(Number(process.env.LSP_MATRIX_SETTLE_MS || 1000));
+    const after = await fetchHealth(`c${concurrency}-r${round}-after`);
+    const row = summarizeBatch(concurrency, round, before, after, requests);
+    matrix.push(row);
+    if (!row.ok) failures.push(`c=${concurrency} r=${round} failed=${row.failed}`);
+    if (failFast && failures.length) break;
+  }
+  if (failFast && failures.length) break;
+}
+
+const overallAfter = await fetchHealth('overall-after');
+
+console.log(JSON.stringify({
+  ok: failures.length === 0,
+  baseUrl,
+  model,
+  marker,
+  stream,
+  timeoutMs,
+  concurrencyValues,
+  rounds,
+  failures,
+  overallBefore,
+  overallAfter,
+  matrix,
+}, null, 2));
+
+if (failures.length) process.exit(1);
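
Reviewer note: the `percentile` helper in this script uses the nearest-rank method, so with only a handful of requests per batch the reported p95 equals the maximum latency. The worked example below copies the function from the diff; the sample latencies are made up for illustration.

```javascript
// Copied from scripts/lsp-capacity-matrix.mjs in this commit (nearest-rank percentile).
function percentile(values, p) {
  const nums = values.filter(n => Number.isFinite(n)).sort((a, b) => a - b);
  if (!nums.length) return 0;
  const idx = Math.min(nums.length - 1, Math.max(0, Math.ceil((p / 100) * nums.length) - 1));
  return nums[idx];
}

// Illustrative latencies in milliseconds for a concurrency-4 batch (invented sample data).
const latencies = [120, 80, 200, 95];

// Sorted: [80, 95, 120, 200]
// p50 -> ceil(0.50 * 4) - 1 = index 1
// p95 -> ceil(0.95 * 4) - 1 = index 3
console.log(percentile(latencies, 50)); // 95
console.log(percentile(latencies, 95)); // 200
console.log(percentile([], 95));        // 0
```

For the default `1,2,4` concurrency values, p50 and p95 are therefore picked from actual samples rather than interpolated, which is worth keeping in mind when comparing rows of the matrix.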

scripts/native-bridge-smoke.mjs

Lines changed: 34 additions & 0 deletions
@@ -584,6 +584,39 @@ function assertLsBudgetAvailable(health) {
   });
 }
 
+function counterDelta(before = {}, after = {}) {
+  const keys = new Set([...Object.keys(before || {}), ...Object.keys(after || {})]);
+  const out = {};
+  for (const key of keys) {
+    const delta = Number(after?.[key] || 0) - Number(before?.[key] || 0);
+    if (delta) out[key] = delta;
+  }
+  return out;
+}
+
+function nativeBridgeDecisionDelta(before, after) {
+  const b = before?.nativeBridge || {};
+  const a = after?.nativeBridge || {};
+  const recent = Array.isArray(a.recentDecisions) ? a.recentDecisions.slice(-8) : [];
+  return {
+    decisions: Number(a.decisions || 0) - Number(b.decisions || 0),
+    enabledDecisions: Number(a.enabledDecisions || 0) - Number(b.enabledDecisions || 0),
+    disabledDecisions: Number(a.disabledDecisions || 0) - Number(b.disabledDecisions || 0),
+    reasons: counterDelta(b.decisionReasons || {}, a.decisionReasons || {}),
+    lastDecision: a.lastDecision || null,
+    recentDecisions: recent.map(d => ({
+      at: d.at,
+      enabled: !!d.enabled,
+      reason: d.reason || '',
+      modelKey: d.modelKey || '',
+      route: d.route || '',
+      mappedTools: d.mappedTools || [],
+      unmappedTools: d.unmappedTools || [],
+      toolChoiceFiltered: !!d.toolChoiceFiltered,
+    })),
+  };
+}
+
 const selected = expandScenarios(requestedScenarios);
 if (!selected.length) {
   console.error(`No valid scenarios selected. Use one or more of: ${Object.keys(SCENARIOS).join(',')},all`);
@@ -641,6 +674,7 @@ console.log(JSON.stringify({
   scenarios: selected,
   results,
   failures,
+  nativeBridgeDecisionDelta: nativeBridgeDecisionDelta(healthBefore, healthAfter),
   healthBefore,
   healthAfter,
 }, null, 2));
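
Reviewer note: `counterDelta` only reports reasons whose count changed between the two health snapshots, which keeps the smoke output small. The example below copies the function from the diff; the reason names and counts are hypothetical sample data, not values observed from a real deployment.

```javascript
// Copied from scripts/native-bridge-smoke.mjs in this commit.
function counterDelta(before = {}, after = {}) {
  const keys = new Set([...Object.keys(before || {}), ...Object.keys(after || {})]);
  const out = {};
  for (const key of keys) {
    const delta = Number(after?.[key] || 0) - Number(before?.[key] || 0);
    if (delta) out[key] = delta; // unchanged counters are omitted
  }
  return out;
}

// Hypothetical reason counters before and after a smoke run.
const reasonsBefore = { env_gate_off: 4, allowlisted: 1 };
const reasonsAfter = { env_gate_off: 4, allowlisted: 3, unmapped_tool: 1 };

console.log(counterDelta(reasonsBefore, reasonsAfter));
// { allowlisted: 2, unmapped_tool: 1 }
```

A counter that exists only in the earlier snapshot would show up with a negative delta, which can indicate that the server restarted and reset its stats between the two health reads.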
