Commit c247c33

docs: record issue audit roadmap
1 parent 72e1b9c commit c247c33

4 files changed

Lines changed: 260 additions & 0 deletions

docs/MAINTAINER_NOTES.md

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
# Maintainer Notes

These notes capture project operating rules that should survive context
resets. They are not release notes.

## Evidence Rules

- Do not claim support from names, guesses, or encode/decode round trips. For
  protocol work, require descriptor evidence, LS binary field evidence, or a
  real redacted trace.
- Do not widen production defaults from a single lab success. First add gated
  smoke, logs, docs, and a rollback path.
- Keep unsupported boundaries explicit. If a tool, model, media input, or
  backend cannot be bridged safely, return a clear error instead of pretending
  it is OpenAI-compatible.
- When an issue is broad, keep it as a reproduction bucket and require logs.
  Do not close it because a related bug was fixed elsewhere.

## Native Bridge Rules

- Production default native bridge scope is the Bash family only:
  `Bash`, `shell_command`, and `run_command`.
- `Read`, `Grep`, `Glob`, `WebSearch`, and `WebFetch` are protocol-lab tools
  until real traces confirm argument shape, result shape, and execution
  boundary.
- `WINDSURFAPI_NATIVE_TOOL_BRIDGE=all_mapped` is not a generic fix for "tools
  not called". Use it only with explicit API key, account, model, and tool
  gates.
- Native bridge executes in the remote Windsurf workspace. Do not describe it
  as local IDE/MCP/client tool execution.
- Keep raw proto traces redacted by default. Raw string trace switches are for
  gated lab runs only.

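
The scope rules above can be read as a small allow/deny decision. The following is a minimal sketch of that decision, not the project's implementation: the `Gates` structure, the function name, and the mode strings `off` and `narrow` are assumptions made for illustration. Only `all_mapped` and the tool names come from these notes.

```python
from dataclasses import dataclass

# Tool sets taken from the rules above.
BASH_FAMILY = frozenset({"Bash", "shell_command", "run_command"})
LAB_TOOLS = frozenset({"Read", "Grep", "Glob", "WebSearch", "WebFetch"})


@dataclass(frozen=True)
class Gates:
    """Hypothetical allowlists; an empty set means the gate is not configured."""
    api_keys: frozenset = frozenset()
    accounts: frozenset = frozenset()
    models: frozenset = frozenset()
    tools: frozenset = frozenset()


def native_bridge_allowed(tool, mode, gates, api_key_id, account, model):
    """Return True only when `tool` may use the native bridge."""
    if mode == "off":  # assumed name for "bridge disabled"
        return False
    if tool in BASH_FAMILY:
        return True  # production default scope
    if mode != "all_mapped" or tool not in LAB_TOOLS:
        return False
    # Lab tools need every gate configured and matching; an empty gate denies.
    return (
        api_key_id in gates.api_keys
        and account in gates.accounts
        and model in gates.models
        and tool in gates.tools
    )
```

With this shape, setting `all_mapped` without configuring all four gates still denies every lab tool, which matches the intent that `all_mapped` is not a generic fix.
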
## SWE / Special-Agent Rules

- SWE-1.6 and SWE-1.6-fast are special-agent work unless a real official trace
  proves direct Cascade chat support.
- Do not mix SWE-1.6 with ordinary cloud catalog fixes.
- Devin/ACP backends must be default-off, bounded, and text-only first.
- Client-local tools and media must be rejected or explicitly bridged; never
  silently execute them in a different workspace.

## WebSearch / WebFetch Rules

- Direct `GetWebSearchResults` is confirmed for WebSearch investigation.
- No direct WebFetch/read-url API is confirmed. Do not implement one from a
  guessed method name.
- The observed WebFetch path is LS requested interaction plus
  `HandleCascadeUserInteraction`, then a later trajectory step.
- Do not bypass production VPS memory guards just to force a WebFetch canary.
  Use an isolated memory-safe lab environment.

## Release Rules

- For code releases, update `package.json`, add release notes, run the focused
  tests, run `npm run test:release`, run `npm run secret-scan`, and run full
  shards when the blast radius is not trivial.
- After tag push, verify GitHub CI, Release, Docker build, and deployed VPS
  smoke before calling the release done.
- VPS smoke should include `/health?verbose=1`, Docker image labels, `/v1/models`,
  and one basic chat completion.
- Verify the actual WindsurfAPI entrypoint before judging VPS health. In the
  current VPS deployment the compose nginx entry is on `:3003`; public port 80
  may be served by another stack and is not a WindsurfAPI health signal.
- `/health` build metadata matters. If commit is missing, fix build metadata
  injection instead of relying only on image labels.

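
As an illustration of the smoke checklist above, here is a minimal sketch that evaluates already-collected results. It assumes the `/health?verbose=1` response is a JSON object with `version` and `commit` keys; those key names and the function name are assumptions, not a confirmed response schema. Docker image labels are not covered here and still need a separate check.

```python
def evaluate_vps_smoke(health, models_status, chat_status, expected_version):
    """Return a list of failure reasons; an empty list means the smoke passed.

    `health` is the parsed JSON from /health?verbose=1, fetched through the
    real WindsurfAPI entrypoint (not public port 80). The `version` and
    `commit` keys are assumed field names.
    """
    failures = []
    if health.get("version") != expected_version:
        failures.append("version mismatch")
    if not health.get("commit"):
        # Per the rule above: fix metadata injection, do not rely on labels.
        failures.append("commit missing: fix build metadata injection")
    if models_status != 200:
        failures.append("/v1/models did not return 200")
    if chat_status != 200:
        failures.append("chat completion did not return 200")
    return failures
```

Returning reasons instead of a boolean keeps the release log explicit about which part of the smoke failed.
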
## Security And Privacy Rules

- Never write raw API keys, passwords, account credentials, session tokens, or
  customer email lists into docs, release notes, issue comments, or logs.
- Use hashes, counts, IDs, and redacted previews for diagnostics.
- Run secret scan before release and before pushing documentation that touched
  examples or operational notes.

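
One way to produce a "hash plus redacted preview" diagnostic is sketched below. The helper name, the preview length, and the output format are assumptions chosen for illustration. Note that a plain hash of a short or low-entropy secret can still be guessed, so treat this as a logging aid, not as protection.

```python
import hashlib


def redacted_preview(secret, visible=4):
    """Describe a secret for logs without writing the raw value."""
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()[:12]
    if len(secret) <= visible * 2:
        preview = "***"  # too short to show any prefix safely
    else:
        preview = secret[:visible] + "..."
    return f"{preview} (sha256:{digest}, len={len(secret)})"
```

The truncated digest lets two log lines be correlated to the same key without either line containing the key itself.
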
## Code And UI Rules

- Prefer existing local helpers and patterns. Avoid new dependencies unless the
  maintenance tradeoff is clearly worth it.
- Keep patches scoped. Do not mix protocol reverse engineering, dashboard UI,
  release workflow, and unrelated cleanup in one release unless there is a real
  dependency.
- Dashboard UI should stay operational and dense: pagination, summaries,
  compact tables, predictable controls, and no marketing-style layout.
- Dashboard interactions should use existing app confirmation/prompt patterns,
  not native browser alerts.
- Do not revert unrelated user or generated changes in the worktree.

docs/README.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
# WindsurfAPI Docs

High-signal operational documents:

- [Maintainer Notes](MAINTAINER_NOTES.md): persistent quality, release,
  security, native bridge, SWE, and WebFetch working rules.
- [Audit 2026-06-07](audits/AUDIT_2026-06-07.md): current open issue triage,
  priority order, SWE-1.6 plan, and WebSearch/WebFetch plan.
- [Audit 2026-06-06](audits/AUDIT_2026-06-06.md): prior hardening audit for
  release metadata, dashboard pagination, native bridge, and HTTP ingress.
- [Native Bridge Protocol Notes](native-bridge-protocol-notes.md): protobuf and
  runtime trace notes for native bridge protocol work.
- [Dashboard i18n](dashboard-i18n.md): dashboard localization notes.

Release-specific changes live under [releases](releases/).

docs/audits/AUDIT_2026-06-07.md

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
# Audit 2026-06-07

Scope: current GitHub issues, open PRs, recent release state, SWE-1.6,
WebSearch/WebFetch, native bridge boundaries, and project working rules.

## Baseline

- Local and remote HEAD: `v2.0.142` (`72e1b9c`,
  `fix: clean partial stream error tails`).
- Working tree at audit start: clean.
- Open PRs: none.
- VPS deployment: healthy on `v2.0.142` through the WindsurfAPI compose entry
  (`:3003` in the current VPS deployment). `/health?verbose=1` reports version
  `2.0.142` and commit `72e1b9cf079e`; authenticated `/v1/models` and a basic
  chat smoke returned HTTP 200 after deployment. Do not use the VPS public port
  80 Apache/PHP 404 page as a health signal for this service.
- Recent closed issue cluster: #191, #189, #176, and #180 were closed after
  rate-limit / provider-deadline / cooldown handling was documented and
  surfaced more clearly.
- Current open issues: #190, #186, #185, #183, #178, #177, and #169.

## Open Issue Triage

| Issue | Current problem | Keep open because | Next evidence needed |
| --- | --- | --- | --- |
| #177 | Broad "model degraded / tool failures" bucket. Causes can include model family, tool schema size, tool-choice translation, prompt emulation limits, and upstream section behavior. | v2.0.141 added `ToolRoute[...]` diagnostics, but there is no single confirmed repro left to close the bucket. | Client, route, model, tool names/count, `ToolRoute[...]`, and `Probe[...]` logs for a failing request. |
| #178 | "No tools get called" reports from Kilo Code, opencode, Codex-like clients. | The proxy can now distinguish stripped tools, forced missing tools, native gate misses, compacted preambles, and model narration without tool calls; reporters still need concrete logs. | Same as #177, plus whether native bridge was off, narrow, or `all_mapped`. |
| #183 | Claude Code / OpenWebUI web-search flow can lose or repeat user input after search. | WebSearch/WebFetch LS-native protocol is still lab-only; the direct WebSearch API works, but WebFetch direct endpoint is not confirmed. | A memory-safe gated WebFetch/WebSearch canary with proto trace and `webFetchTrace.state`, not a production VPS guard bypass. |
| #185 | Cursor truncation and stray JSON in streamed answers. | v2.0.142 fixed one concrete post-content error JSON tail, but upstream long-stream provider deadlines can still truncate content. | Reporter retest on `v2.0.142`; if JSON still appears, capture route, stream/non-stream, and debug request logs. |
| #186 | Gemini / DeepSeek model request plus SWE-1.6 mention. | Normal model additions depend on Windsurf upstream/cloud catalog; SWE-1.6 is split to #190 and must not be tracked here as normal catalog work. | Upstream catalog evidence for Gemini/DeepSeek, or a model name that is present upstream but missing locally. |
| #190 | SWE-1.6 / SWE-1.6-fast works in official tools but not in direct Cascade chat. | Direct Cascade reports unknown/missing model path behavior; this should be a special-agent / Devin / ACP POC, not a normal enum/UID fix. | Devin-capable text-only smoke, then ACP initialize/auth/session/prompt validation. |
| #169 | Dashboard account card display mode request. | Product requirement is underspecified and lower priority than protocol/tool correctness. | Exact desired modes, for example compact rows, grouped-by-status, or grouped-by-model. |

No currently open issue should be closed only from the information above. The
right pattern is to add a closing comment only when the specific acceptance
condition is met and a released version is available.

## Priority Order

1. Keep #177/#178/#185 as reproduction buckets and require the new diagnostics
   before making more tool-call claims. v2.0.141/v2.0.142 already improved
   observability and one streaming edge; the next work is evidence collection,
   not broad native-bridge enablement.
2. Implement the SWE-1.6 special-agent POC as a separate backend. Start with
   text-only Devin CLI print mode, default off, no local client tools, no media,
   and bounded process/output limits.
3. Continue WebSearch/WebFetch in a lab environment with enough LS memory
   budget. Do not bypass the production VPS memory guard to force a canary.
4. Keep Read/Grep/Glob/WebSearch/WebFetch out of the default production native
   allowlist until protocol fields and runtime semantics are confirmed by real
   traces.
5. Address #169 after the protocol/tool work has stable evidence. Dashboard
   pagination from #168 is already done; #169 is about additional view modes.
6. Treat #186 as upstream/catalog watch work. Do not invent Gemini/DeepSeek
   support before Windsurf exposes usable upstream catalog entries.

## SWE-1.6 Plan

SWE-1.6 is not a normal catalog patch. Do not "fix" it by adding or changing a
Cascade UID unless a real official trace proves that direct Cascade accepts it.

POC shape:

- `WINDSURFAPI_SPECIAL_AGENT_BACKEND=devin-cli`
- `swe-1.6` and `swe-1.6-fast` remain hidden from normal `/v1/models` unless
  the special backend is explicitly enabled.
- Initial mode is text-only. Requests with client-local tools or media should
  return a clear unsupported-boundary error, not silently execute in a different
  workspace.
- Process management must have explicit max processes, timeout, output byte
  limit, and account/session binding before production recommendation.
- ACP mode is second phase: initialize/auth/session/prompt first, then
  permission handling. Default permission answer should be deny/cancel until a
  safe mapping exists.

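
The first three bullets of the POC shape can be sketched as two small checks. This is an illustration under stated assumptions, not the planned backend: the function names and the `UnsupportedBoundaryError` type are hypothetical, and the request is assumed to follow the OpenAI chat completions layout (a `tools` array, and `messages` whose `content` is either a string or a list of typed parts).

```python
SPECIAL_MODELS = ("swe-1.6", "swe-1.6-fast")


class UnsupportedBoundaryError(Exception):
    """Hypothetical error type for requests the backend cannot bridge safely."""


def visible_models(base_models, env):
    """Expose the special models only when the backend is explicitly enabled."""
    enabled = env.get("WINDSURFAPI_SPECIAL_AGENT_BACKEND") == "devin-cli"
    if not enabled:
        return [m for m in base_models if m not in SPECIAL_MODELS]
    extra = [m for m in SPECIAL_MODELS if m not in base_models]
    return list(base_models) + extra


def check_text_only(request):
    """Raise instead of silently executing tools or media elsewhere."""
    if request.get("tools"):
        raise UnsupportedBoundaryError("client-local tools are not supported")
    for message in request.get("messages", []):
        content = message.get("content")
        if isinstance(content, list):
            for part in content:
                if isinstance(part, dict) and part.get("type") != "text":
                    raise UnsupportedBoundaryError("media input is not supported")
```

Process limits, timeouts, and account/session binding from the remaining bullets are deliberately left out of this sketch.
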
Acceptance before closing #190:

- A real `swe-1.6-fast` or `swe-1.6` smoke succeeds through the special-agent
  backend in a Devin-capable environment.
- `/health?verbose=1` exposes enough status to show that the backend is enabled
  and bounded.
- Negative smoke proves tools/media are rejected or handled explicitly.
- Docs make clear this is not the same execution model as ordinary Cascade chat.

## WebSearch/WebFetch Plan

Current facts:

- Direct `GetWebSearchResults` is confirmed and should remain the preferred
  WebSearch investigation route.
- No descriptor-backed direct WebFetch/read-url endpoint has been confirmed.
  `RecordReadUrlContent` is not a fetch endpoint.
- Official WebFetch flow appears to be LS `requested_interaction` plus
  `HandleCascadeUserInteraction`, followed by a later trajectory step.
- v2.0.141 added `webFetchTrace.state` summaries, but the VPS canary did not
  send the request because LS capacity preflight refused with
  `ls_capacity:memory_guard`.

Next valid run:

- Use an isolated or local environment with enough LS memory budget.
- Gate by one API key, one account, one model, and one tool.
- Enable proto trace and `WINDSURFAPI_NATIVE_TOOL_BRIDGE_WEBFETCH_AUTO_APPROVE`
  only for an allowlisted safe origin such as `https://example.com`.
- Success requires `completed_web_document` or equivalent verified document
  payload. `pending_permission`, `auto_run_decision_only`, natural-language
  narration, or a repeated prompt is not success.

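
The success criterion in the last bullet can be written down as a strict predicate. This is a sketch only: the function name and the idea of passing the document payload as a separate argument are assumptions. It is also stricter than the text, because it does not try to recognize an "equivalent verified document payload" under a different state name.

```python
# State names taken from the bullet above.
SUCCESS_STATES = frozenset({"completed_web_document"})
NOT_SUCCESS_STATES = frozenset({"pending_permission", "auto_run_decision_only"})


def is_webfetch_success(state, document_payload=None):
    """True only for a completed state that also carries a non-empty payload.

    Narration-only answers and repeated prompts have no completed state, so
    they fall through to False along with every unknown state.
    """
    if state in NOT_SUCCESS_STATES:
        return False
    return state in SUCCESS_STATES and bool(document_payload)
```

Defaulting unknown states to failure keeps a canary from being reported as successful because of a new or misspelled state value.
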
Do not implement a direct WebFetch endpoint by guessing a method name. Do not
add `WebSearch` or `WebFetch` to the production allowlist until the trace shows
a real completed payload and the execution boundary is documented.

## Native Bridge Boundary

The mature production canary remains the Bash family:

- `Bash`
- `shell_command`
- `run_command`

Everything else is mapped for protocol lab work, not default production use:

- `Read`
- `Grep`
- `Glob`
- `WebSearch`
- `WebFetch`

Native bridge means remote Windsurf workspace execution. It is not a generic
fix for local IDE tools, MCP tools, `apply_patch`, or arbitrary client-side
tools. If a client mixes native-mapped tools with custom tools, prefer prompt
emulation unless a narrow test proves the exact route.

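
The mixed-tools guidance above could be expressed as a simple set test. This is a hedged sketch: the function name and the route labels (`no_tools`, `native_candidate`, `prompt_emulation`) are invented for illustration and are not identifiers from the codebase. A `native_candidate` result still has to pass the production scope and gate rules before the native bridge is actually used.

```python
# Tools that have a native mapping, per the two lists above.
NATIVE_MAPPED = frozenset({
    "Bash", "shell_command", "run_command",
    "Read", "Grep", "Glob", "WebSearch", "WebFetch",
})


def choose_tool_route(tool_names):
    """Pick a coarse route from the tool names a client declared."""
    names = set(tool_names)
    if not names:
        return "no_tools"
    if names <= NATIVE_MAPPED:
        return "native_candidate"  # still subject to scope and gate checks
    # Any custom or client-side tool in the mix: prefer prompt emulation.
    return "prompt_emulation"
```
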
## Recent Release Context

- v2.0.137: dashboard pagination, release scan hardening, bounded release gate.
- v2.0.139: full shard stabilization and docs around native bridge boundaries.
- v2.0.140: upstream cooldowns surfaced as real upstream cooldowns.
- v2.0.141: `ToolRoute[...]` diagnostics and route-specific tool handling
  improvements.
- v2.0.142: partial stream error tails are cleaned so post-content error JSON
  is not appended to already-visible streamed assistant content.

This sequence improved observability and some concrete edge cases, but it did
not make Read/WebFetch/SWE-1.6 production-ready. Future notes should preserve
that distinction.

docs/native-bridge-protocol-notes.md

Lines changed: 12 additions & 0 deletions
@@ -286,6 +286,18 @@ valid canary must send `HandleCascadeUserInteraction` and then verify whether
the same trajectory advances to `read_url_content.web_document`, an error step,
or another requested interaction.

The v2.0.141/v2.0.142 state:

- `scripts/native-bridge-smoke.mjs` can summarize
  `semantic.steps[].webFetchTrace.state` so a canary does not require manual
  raw-trace reading for the first classification pass.
- A narrow VPS WebFetch canary was prepared with API-key gating, one model, one
  tool, and an allowlisted safe origin, but the request did not run because LS
  capacity preflight refused with `ls_capacity:memory_guard`.
- That memory-guard refusal is not WebFetch protocol evidence. The next valid
  run must use an isolated or local environment with enough LS memory budget.
  Do not bypass the production VPS guard just to force the canary.

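
For readers who have not opened the smoke script, the first classification pass amounts to counting states along the `semantic.steps[].webFetchTrace.state` path. The sketch below is a Python illustration of that idea only; it is not a port of `scripts/native-bridge-smoke.mjs`, and it assumes the trace is available as a parsed JSON object with that nesting.

```python
from collections import Counter


def summarize_webfetch_states(trace):
    """Count webFetchTrace.state values across semantic.steps[].

    Steps without a webFetchTrace, or without a state, are skipped.
    """
    steps = (trace.get("semantic") or {}).get("steps") or []
    states = []
    for step in steps:
        web_fetch_trace = step.get("webFetchTrace") or {}
        state = web_fetch_trace.get("state")
        if state:
            states.append(state)
    return dict(Counter(states))
```

An empty result means no step carried a WebFetch state at all, which is what a run refused by the memory guard would look like, since the request is never sent.
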
## Direct Web Search API

`GetWebSearchResults` is confirmed independently of the LS-native tool path:
