| 1 | +# Audit 2026-06-07 |
| 2 | + |
| 3 | +Scope: current GitHub issues, open PRs, recent release state, SWE-1.6, |
| 4 | +WebSearch/WebFetch, native bridge boundaries, and project working rules. |
| 5 | + |
| 6 | +## Baseline |
| 7 | + |
| 8 | +- Local and remote HEAD: `v2.0.142` (`72e1b9c`, |
| 9 | + `fix: clean partial stream error tails`). |
| 10 | +- Working tree at audit start: clean. |
| 11 | +- Open PRs: none. |
| 12 | +- VPS deployment: healthy on `v2.0.142` through the WindsurfAPI compose entry |
| 13 | + (`:3003` in the current VPS deployment). `/health?verbose=1` reports version |
| 14 | + `2.0.142` and commit `72e1b9cf079e`; authenticated `/v1/models` and a basic |
| 15 | + chat smoke returned HTTP 200 after deployment. Do not use the VPS public port |
| 16 | + 80 Apache/PHP 404 page as a health signal for this service. |
| 17 | +- Recent closed issue cluster: #191, #189, #176, and #180 were closed after |
| 18 | + rate-limit / provider-deadline / cooldown handling was documented and |
| 19 | + surfaced more clearly. |
| 20 | +- Current open issues: #190, #186, #185, #183, #178, #177, and #169. |
| 21 | + |
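A minimal sketch of the baseline health comparison, in Python. The audit only says that `/health?verbose=1` reports the version and commit; the top-level `version` and `commit` keys used here are assumptions about the payload shape, and the function takes an already-parsed payload so it does not assume any base URL or HTTP client.

```python
EXPECTED_VERSION = "2.0.142"
EXPECTED_COMMIT_PREFIX = "72e1b9c"


def health_matches(payload: dict,
                   version: str = EXPECTED_VERSION,
                   commit_prefix: str = EXPECTED_COMMIT_PREFIX) -> bool:
    """Return True when a parsed /health?verbose=1 payload matches the release.

    Assumption: the payload has top-level "version" and "commit" keys.
    """
    # Accept both "2.0.142" and "v2.0.142" spellings of the version.
    reported_version = str(payload.get("version", "")).lstrip("v")
    reported_commit = str(payload.get("commit", ""))
    return reported_version == version and reported_commit.startswith(commit_prefix)
```

A payload fetched from the port 80 Apache/PHP 404 page would not parse into this shape at all, which is one more reason to check the service port directly.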
| 22 | +## Open Issue Triage |
| 23 | + |
| 24 | +| Issue | Current problem | Keep open because | Next evidence needed | |
| 25 | +| --- | --- | --- | --- | |
| 26 | +| #177 | Broad "model degraded / tool failures" bucket. Causes can include model family, tool schema size, tool-choice translation, prompt emulation limits, and upstream section behavior. | v2.0.141 added `ToolRoute[...]` diagnostics, but there is no single confirmed repro left to close the bucket. | Client, route, model, tool names/count, `ToolRoute[...]`, and `Probe[...]` logs for a failing request. | |
| 27 | +| #178 | "No tools get called" reports from Kilo Code, opencode, Codex-like clients. | The proxy can now distinguish stripped tools, forced missing tools, native gate misses, compacted preambles, and model narration without tool calls; reporters still need concrete logs. | Same as #177, plus whether native bridge was off, narrow, or `all_mapped`. | |
| 28 | +| #183 | Claude Code / OpenWebUI web-search flow can lose or repeat user input after search. | WebSearch/WebFetch LS-native protocol is still lab-only; the direct WebSearch API works, but a direct WebFetch endpoint is not confirmed. | A memory-safe gated WebFetch/WebSearch canary with proto trace and `webFetchTrace.state`, not a production VPS guard bypass. |
| 29 | +| #185 | Cursor truncation and stray JSON in streamed answers. | v2.0.142 fixed one concrete post-content error JSON tail, but upstream long-stream provider deadlines can still truncate content. | Reporter retest on `v2.0.142`; if JSON still appears, capture route, stream/non-stream, and debug request logs. | |
| 30 | +| #186 | Gemini / DeepSeek model request plus SWE-1.6 mention. | Normal model additions depend on Windsurf upstream/cloud catalog; SWE-1.6 is split to #190 and must not be tracked here as normal catalog work. | Upstream catalog evidence for Gemini/DeepSeek, or a model name that is present upstream but missing locally. | |
| 31 | +| #190 | SWE-1.6 / SWE-1.6-fast works in official tools but not in direct Cascade chat. | Direct Cascade reports unknown/missing model path behavior; this should be a special-agent / Devin / ACP POC, not a normal enum/UID fix. | Devin-capable text-only smoke, then ACP initialize/auth/session/prompt validation. | |
| 32 | +| #169 | Dashboard account card display mode request. | Product requirement is underspecified and lower priority than protocol/tool correctness. | Exact desired modes, for example compact rows, grouped-by-status, or grouped-by-model. | |
| 33 | + |
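For reporters assembling the evidence requested in the table, a small Python sketch that pulls the diagnostic lines out of a captured log. It matches only the literal `ToolRoute[` and `Probe[` prefixes named above; the contents of the brackets are treated as opaque because their format is not described here, and the helper name `extract_diagnostics` is hypothetical.

```python
# Literal prefixes named in the triage table; bracket contents are not parsed.
DIAGNOSTIC_MARKERS = ("ToolRoute[", "Probe[")


def extract_diagnostics(log_text: str) -> list[str]:
    """Return the log lines that carry a ToolRoute[...] or Probe[...] entry."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in DIAGNOSTIC_MARKERS)
    ]
```

The output can be pasted into an issue alongside the client, route, model, and tool names/count.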
| 34 | +No currently open issue should be closed on the information above alone. |
| 35 | +Add a closing comment only when the issue's specific acceptance condition is |
| 36 | +met and a released version is available. |
| 37 | + |
| 38 | +## Priority Order |
| 39 | + |
| 40 | +1. Keep #177/#178/#185 as reproduction buckets and require the new diagnostics |
| 41 | + before making more tool-call claims. v2.0.141/v2.0.142 already improved |
| 42 | + observability and one streaming edge; the next work is evidence collection, |
| 43 | + not broad native-bridge enablement. |
| 44 | +2. Implement the SWE-1.6 special-agent POC as a separate backend. Start with |
| 45 | + text-only Devin CLI print mode, default off, no local client tools, no media, |
| 46 | + and bounded process/output limits. |
| 47 | +3. Continue WebSearch/WebFetch in a lab environment with enough LS memory |
| 48 | + budget. Do not bypass the production VPS memory guard to force a canary. |
| 49 | +4. Keep Read/Grep/Glob/WebSearch/WebFetch out of the default production native |
| 50 | + allowlist until protocol fields and runtime semantics are confirmed by real |
| 51 | + traces. |
| 52 | +5. Address #169 after the protocol/tool work has stable evidence. Dashboard |
| 53 | + pagination from #168 is already done; #169 is about additional view modes. |
| 54 | +6. Treat #186 as upstream/catalog watch work. Do not invent Gemini/DeepSeek |
| 55 | + support before Windsurf exposes usable upstream catalog entries. |
| 56 | + |
| 57 | +## SWE-1.6 Plan |
| 58 | + |
| 59 | +SWE-1.6 is not a normal catalog patch. Do not "fix" it by adding or changing a |
| 60 | +Cascade UID unless a real official trace proves that direct Cascade accepts it. |
| 61 | + |
| 62 | +POC shape: |
| 63 | + |
| 64 | +- `WINDSURFAPI_SPECIAL_AGENT_BACKEND=devin-cli` |
| 65 | +- `swe-1.6` and `swe-1.6-fast` remain hidden from normal `/v1/models` unless |
| 66 | + the special backend is explicitly enabled. |
| 67 | +- Initial mode is text-only. Requests with client-local tools or media should |
| 68 | + return a clear unsupported-boundary error, not silently execute in a different |
| 69 | + workspace. |
| 70 | +- Process management must have explicit max processes, timeout, output byte |
| 71 | +  limit, and account/session binding before it can be recommended for production. |
| 72 | +- ACP mode is the second phase: initialize/auth/session/prompt first, then |
| 73 | +  permission handling. The default permission answer should be deny/cancel until a |
| 74 | +  safe mapping exists. |
| 75 | + |
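The POC shape above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the request is assumed to be OpenAI-style (`tools`, `messages[].content`), the limit defaults are made up, and the Devin CLI command line is deliberately left to the caller because its flags are not documented here. Max-process and account/session binding are not covered.

```python
import subprocess
import sys
from dataclasses import dataclass

HIDDEN_MODELS = ("swe-1.6", "swe-1.6-fast")


class UnsupportedBoundary(Exception):
    """Raised when a request needs tools or media the text-only POC cannot serve."""


@dataclass(frozen=True)
class Limits:
    timeout_s: float = 30.0            # hypothetical default
    max_output_bytes: int = 64 * 1024  # hypothetical default


def backend_enabled(env: dict) -> bool:
    # Default off: only the explicit value from the plan enables the backend.
    return env.get("WINDSURFAPI_SPECIAL_AGENT_BACKEND") == "devin-cli"


def visible_models(catalog: list, env: dict) -> list:
    """Hide the SWE-1.6 names unless the special backend is enabled."""
    if backend_enabled(env):
        return list(catalog)
    return [name for name in catalog if name not in HIDDEN_MODELS]


def check_text_only(request: dict) -> None:
    """Reject tools and non-text content instead of running them elsewhere."""
    if request.get("tools"):
        raise UnsupportedBoundary("client-local tools are not supported")
    for message in request.get("messages", []):
        if not isinstance(message.get("content"), str):
            raise UnsupportedBoundary("non-text content is not supported")


def run_bounded(argv: list, prompt: str, limits: Limits) -> str:
    """Run one child process with a timeout and a cap on returned output."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    try:
        out, _ = proc.communicate(prompt.encode("utf-8"), timeout=limits.timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()  # reap the killed child before re-raising
        raise
    # The cap applies to what is returned; communicate() still buffers everything.
    return out[: limits.max_output_bytes].decode("utf-8", errors="replace")
```

A production version would need a streaming read to bound memory as well as returned bytes, since `communicate()` collects the full output before truncation.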
| 76 | +Acceptance before closing #190: |
| 77 | + |
| 78 | +- A real `swe-1.6-fast` or `swe-1.6` smoke succeeds through the special-agent |
| 79 | + backend in a Devin-capable environment. |
| 80 | +- `/health?verbose=1` exposes enough status to show that the backend is enabled |
| 81 | + and bounded. |
| 82 | +- Negative smoke proves tools/media are rejected or handled explicitly. |
| 83 | +- Docs make clear this is not the same execution model as ordinary Cascade chat. |
| 84 | + |
| 85 | +## WebSearch/WebFetch Plan |
| 86 | + |
| 87 | +Current facts: |
| 88 | + |
| 89 | +- Direct `GetWebSearchResults` is confirmed and should remain the preferred |
| 90 | + WebSearch investigation route. |
| 91 | +- No descriptor-backed direct WebFetch/read-url endpoint has been confirmed. |
| 92 | + `RecordReadUrlContent` is not a fetch endpoint. |
| 93 | +- Official WebFetch flow appears to be LS `requested_interaction` plus |
| 94 | + `HandleCascadeUserInteraction`, followed by a later trajectory step. |
| 95 | +- v2.0.141 added `webFetchTrace.state` summaries, but the VPS canary did not |
| 96 | + send the request because LS capacity preflight refused with |
| 97 | + `ls_capacity:memory_guard`. |
| 98 | + |
| 99 | +Next valid run: |
| 100 | + |
| 101 | +- Use an isolated or local environment with enough LS memory budget. |
| 102 | +- Gate by one API key, one account, one model, and one tool. |
| 103 | +- Enable proto trace and `WINDSURFAPI_NATIVE_TOOL_BRIDGE_WEBFETCH_AUTO_APPROVE` |
| 104 | + only for an allowlisted safe origin such as `https://example.com`. |
| 105 | +- Success requires `completed_web_document` or equivalent verified document |
| 106 | + payload. `pending_permission`, `auto_run_decision_only`, natural-language |
| 107 | + narration, or a repeated prompt is not success. |
| 108 | + |
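The success rule in the last bullet can be written down as a small Python check. The state names come from the list above; the `document_text` parameter and the function name are hypothetical, standing in for whatever verified document payload the trace actually carries.

```python
# States the plan names explicitly; everything else is treated as not successful.
SUCCESS_STATES = frozenset({"completed_web_document"})
NON_SUCCESS_STATES = frozenset({"pending_permission", "auto_run_decision_only"})


def canary_succeeded(state: str, document_text: str = "") -> bool:
    """Return True only for a completed state that carries a non-empty document.

    Narration or a repeated prompt has no success state, so it fails here.
    """
    return state in SUCCESS_STATES and bool(document_text.strip())
```

Requiring both the state and a non-empty payload keeps a bare state flip from being counted as a completed fetch.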
| 109 | +Do not implement a direct WebFetch endpoint by guessing names. Do not add |
| 110 | +`WebSearch` or `WebFetch` to the production allowlist until the trace shows a |
| 111 | +real completed payload and the execution boundary is documented. |
| 112 | + |
| 113 | +## Native Bridge Boundary |
| 114 | + |
| 115 | +The mature production canary remains the Bash family: |
| 116 | + |
| 117 | +- `Bash` |
| 118 | +- `shell_command` |
| 119 | +- `run_command` |
| 120 | + |
| 121 | +Everything else is mapped for protocol lab work, not default production use: |
| 122 | + |
| 123 | +- `Read` |
| 124 | +- `Grep` |
| 125 | +- `Glob` |
| 126 | +- `WebSearch` |
| 127 | +- `WebFetch` |
| 128 | + |
| 129 | +Native bridge means remote Windsurf workspace execution. It is not a generic |
| 130 | +fix for local IDE tools, MCP tools, `apply_patch`, or arbitrary client-side |
| 131 | +tools. If a client mixes native-mapped tools with custom tools, prefer prompt |
| 132 | +emulation unless a narrow test proves the exact route. |
| 133 | + |
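A minimal Python sketch of the boundary described above, assuming a request can be summarized by its tool names. The two sets mirror the lists in this section; the route labels `native` and `prompt_emulation` and the `lab_mode` flag are hypothetical names for illustration.

```python
PRODUCTION_NATIVE = frozenset({"Bash", "shell_command", "run_command"})
LAB_ONLY_NATIVE = frozenset({"Read", "Grep", "Glob", "WebSearch", "WebFetch"})


def choose_route(tool_names, lab_mode: bool = False) -> str:
    """Pick native execution only when every tool is on the active allowlist.

    Any custom, MCP, or otherwise unmapped tool sends the whole request to
    prompt emulation, as does a request with no tools at all.
    """
    allowed = (PRODUCTION_NATIVE | LAB_ONLY_NATIVE) if lab_mode else PRODUCTION_NATIVE
    names = set(tool_names)
    if names and names <= allowed:
        return "native"
    return "prompt_emulation"
```

Treating a mixed tool set as all-or-nothing is the conservative reading of the rule; a narrower per-tool split would need the route-specific test the section asks for.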
| 134 | +## Recent Release Context |
| 135 | + |
| 136 | +- v2.0.137: dashboard pagination, release scan hardening, bounded release gate. |
| 137 | +- v2.0.139: full shard stabilization and docs around native bridge boundaries. |
| 138 | +- v2.0.140: upstream cooldowns surfaced as real upstream cooldowns. |
| 139 | +- v2.0.141: `ToolRoute[...]` diagnostics and route-specific tool handling |
| 140 | + improvements. |
| 141 | +- v2.0.142: partial stream error tails are cleaned so post-content error JSON |
| 142 | + is not appended to already-visible streamed assistant content. |
| 143 | + |
| 144 | +This sequence improved observability and some concrete edge cases, but it did |
| 145 | +not make Read/WebFetch/SWE-1.6 production-ready. Future notes should preserve |
| 146 | +that distinction. |