|
| 1 | +# Protocol RE Completion Plan |
| 2 | + |
| 3 | +Status: planning artifact (2026-06-10). Goal: make WindsurfAPI's reverse-engineered |
| 4 | +Cascade protocol layer complete/correct ("protocol perfect"), at cliproxyapi-class |
| 5 | +quality. Execute the heavy steps via Codex when its backend recovers (it was returning |
| 6 | +`503` from `v-api.hk.yesfuture.ai` at planning time). Claude orchestrates + verifies on |
| 7 | +the lab box; do not burn Claude tokens on bulk reads/traces that Codex can do. |
| 8 | + |
| 9 | +## Lab box runbook (the RE asset) |
| 10 | + |
| 11 | +- Host `154.40.36.22` (Debian 12, 7.8G RAM). Password SSH via `scripts/vps-exec.py` |
| 12 | + (`WINDSURFAPI_VPS_HOST/USER/PASS` env). Credentials are held by the user, session-only — |
| 13 | + NEVER write them to disk/git/logs. (SSH key auth is rejected by this image's sshd.) |
| 14 | +- Service: systemd unit `windsurfapi`, repo `/root/WindsurfAPI`, running v2.0.144, |
| 15 | + 10 active accounts (HAIKU-ONLY — all sonnet variants return `model_not_available`). |
| 16 | +- Canary env drop-in: `/etc/systemd/system/windsurfapi.service.d/canary.conf` |
| 17 | + (native-bridge gate, auto-approve, proto trace). Edit + `systemctl daemon-reload && |
| 18 | + systemctl restart windsurfapi`. |
| 19 | +- Proto trace dir: `/root/WindsurfAPI/data/proto-trace/` (files |
| 20 | + `ls-proto-<pid>-<RPC>.jsonl`). `WINDSURFAPI_PROTO_TRACE_STRINGS=1` captures string |
| 21 | + bodies (incl. prompts) — **lab-only; turn it OFF as the first resume step.** |
| 22 | +- Remote native-tool workspace (where view_file/grep/find/run_command execute): |
| 23 | + `/home/user/projects/workspace-<hash>` (discovered via Bash native `pwd`). To test the |
| 24 | + file tools, drop known files there first. |
| 25 | + |
| 26 | +## Findings locked in so far (real haiku traces, 2026-06-10) |
| 27 | + |
| 28 | +- Native WebFetch works end-to-end and is FIXED in v2.0.144 (the LS fetches and returns a |
| 29 | + real `web_document`; the proxy used to drop it). See [docs/native-bridge-protocol-notes.md]. |
| 30 | +- Trajectory step `type` ↔ native oneof, OBSERVED co-occurrences (not a full confirmed |
| 31 | + schema; docs warn `type` is not a reliable body-field number): |
| 32 | + - `type=21` carries the `run_command` (Bash) native oneof. **NEW.** |
| 33 | + - `type=31` carries the `read_url_content` (WebFetch) oneof, alongside a |
| 34 | + `requested_interaction` echo and the `web_document`. |
| 35 | + - `type=14` carries `readWrapperField19` (the Read/view_file environment+prompt wrapper). |
| 36 | + - `type=34`, `type=15` appear at the head of every trajectory (preamble / planner-status). |
| 37 | + - `type=8`, `type=23` appear in the Read trajectory (post-tool / status) — unmapped. |
| 38 | + - Per docs, oneof FIELD numbers: `read_url_content`=40, `search_web`=42 (distinct from step `type`). |
| 39 | +- Read/Grep/Glob native execution targets the REMOTE stub workspace, which is useless for
| 40 | + clients that want their LOCAL files. Returning a tool-call PROPOSAL (client executes locally)
| 41 | + is therefore the correct default for these, unlike WebFetch (URLs are location-independent).
| 42 | + |
| 43 | +## Work items (priority order; the user picked "deep protocol RE completion") |
| 44 | + |
| 45 | +### 1. Confirmed trajectory step-type map (highest RE value) |
| 46 | +Currently the parser treats `type` as unreliable. Build a confirmed map by tracing each |
| 47 | +native tool on the lab box and tabulating `{type, nativeOneofs, messageFields}` per step. |
| 48 | +- Tools to trace: Read(view_file), Grep(grep_search_v2), Glob(find), list_dir(list_directory), |
| 49 | + Bash(run_command, already observed at type=21), WebFetch(read_url_content, already observed at type=31), WebSearch(search_web).
| 50 | +- Deliverable: a table in native-bridge-protocol-notes.md mapping step type → meaning → |
| 51 | + oneof field, with the haiku-trace evidence. Update the parser to key off confirmed fields. |
| 52 | + |
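The tabulation in item 1 could be sketched roughly as below. This is a minimal sketch, not the project's parser: it assumes each JSONL line holds a record with a `semantic.steps[]` array whose entries expose a numeric `type` and a `nativeOneofs` string array. That record shape and the `tabulateStepTypes` name are assumptions for illustration and should be adjusted to the real trace layout.

```javascript
// Hypothetical helper: count which native oneofs co-occur with each step `type`.
// ASSUMED record shape: { semantic: { steps: [{ type: number, nativeOneofs: string[] }] } }
function tabulateStepTypes(jsonlText) {
  const table = new Map(); // step type -> Map(oneof name -> occurrence count)
  for (const line of jsonlText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let record;
    try {
      record = JSON.parse(trimmed);
    } catch {
      continue; // skip truncated or non-JSON lines instead of aborting the run
    }
    const steps = record?.semantic?.steps ?? [];
    for (const step of steps) {
      if (typeof step.type !== "number") continue;
      const row = table.get(step.type) ?? new Map();
      // Steps without a native oneof are still counted, under a placeholder key.
      const oneofs =
        Array.isArray(step.nativeOneofs) && step.nativeOneofs.length > 0
          ? step.nativeOneofs
          : ["(none)"];
      for (const name of oneofs) {
        row.set(name, (row.get(name) ?? 0) + 1);
      }
      table.set(step.type, row);
    }
  }
  return table;
}
```

Feeding it the text of one `ls-proto-<pid>-<RPC>.jsonl` file (for example via `fs.readFileSync(path, "utf8")`) yields a per-type count that can be copied into the notes table as evidence. A type that maps to more than one oneof across traces is a sign that `type` alone is still not a safe key.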
| 53 | +### 2. Per-tool round-trip confirmation (place test files in the remote workspace first) |
| 54 | +For each file tool, confirm the LS executes and returns a result oneof, and that the parser |
| 55 | +extracts it. Watch for the same class of bug as the WebFetch one (result present but dropped |
| 56 | +because a requested_interaction/pending echo is checked first). |
| 57 | +- view_file result step type + field; grep_search_v2 result shape; find result shape; |
| 58 | + list_directory result shape. |
| 59 | + |
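The bug class named in item 2 comes down to check ordering. The sketch below illustrates the safe order under assumed field names: `result` and `requestedInteraction` are hypothetical stand-ins for whatever the decoded step actually exposes, and `classifyStep` is not an existing function in the repo.

```javascript
// Hypothetical classifier: a step that carries BOTH a pending echo and a result
// must be treated as a result, otherwise the result is silently dropped.
function classifyStep(step) {
  // Check the result first: its presence means the LS already executed the tool.
  if (step.result != null) {
    return { kind: "result", value: step.result };
  }
  // Only when no result is present does the echo mean "still pending".
  if (step.requestedInteraction != null) {
    return { kind: "pending" };
  }
  return { kind: "other" };
}
```

A per-tool regression test can then assert that a step containing both fields classifies as `result`, which is the case the WebFetch fix addressed.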
| 60 | +### 3. Confirm the still-unconfirmed subconfig fields |
| 61 | +- `GrepV2ToolConfig` exact field number for `allow_access_gitignore` (docs: needs descriptor |
| 62 | + dump or CONFIG_RAW matrix). |
| 63 | +- `ListDirToolConfig` non-empty fields. |
| 64 | +- Use `WINDSURFAPI_NATIVE_TOOL_BRIDGE_CONFIG_RAW` matrix on the lab box to bisect field numbers. |
| 65 | + |
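Bisecting a field number, as suggested in item 3, could follow the loop sketched here. It assumes exactly one candidate field number produces the observable effect, and that `probe` is a hypothetical callback which sets the given field numbers through the CONFIG_RAW matrix, runs one lab-box request, and reports whether the behavior changed. How the matrix itself is encoded is deliberately not assumed.

```javascript
// Hypothetical bisection over candidate protobuf field numbers.
// probe(fieldNumbers) -> true if setting those fields changes the observed behavior.
function bisectFieldNumber(candidates, probe) {
  let pool = [...candidates];
  // If setting every candidate has no effect, the field is not in this range.
  if (pool.length === 0 || !probe(pool)) return null;
  while (pool.length > 1) {
    const firstHalf = pool.slice(0, Math.ceil(pool.length / 2));
    // Keep whichever half still reproduces the effect.
    pool = probe(firstHalf) ? firstHalf : pool.slice(firstHalf.length);
  }
  return pool[0];
}
```

For n candidates this needs about log2(n) + 1 probes, which matters when each probe is a full restart and request on the lab box. If two fields interact, the single-field assumption breaks and the result should be re-checked with a one-field probe.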
| 66 | +### 4. Resolve the Read wrapper `type=14 field=19` schema |
| 67 | +Use `semantic.steps[].readWrapperField19.candidateSummary` across traces to decide path-vs-prompt |
| 68 | +field handling; replace the current stop-loss guard with a confirmed rule. |
| 69 | + |
| 70 | +### 5. Endpoint breadth (secondary — protocol surface parity with cliproxyapi) |
| 71 | +Server currently exposes: `/v1/chat/completions`, `/v1/responses` (+`/v1/response`), |
| 72 | +`/v1/messages` (Anthropic), `/v1/models`, `/auth/*`, `/dashboard/*`, `/health`. |
| 73 | +Missing vs cliproxyapi-class: |
| 74 | +- Anthropic `/v1/messages/count_tokens`: Claude Code calls it, and the missing route currently returns 404. HIGH client-compat impact.
| 75 | +- Gemini format `/v1beta/models/{model}:generateContent` + `:streamGenerateContent`. |
| 76 | +- OpenAI `/v1/embeddings` (only if Cascade exposes embeddings — verify first). |
| 77 | +- OpenAI legacy `/v1/completions` (minor). |
| 78 | + |
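For the `count_tokens` gap, a first cut only needs to return a JSON body with an `input_tokens` number, which is the shape of Anthropic's count-tokens response. The sketch below builds that body from a Messages-style request. The four-characters-per-token ratio is a placeholder assumption, not a real tokenizer, and `collectText` / `countTokensResponse` are hypothetical names; wiring into the server's router is left out because its structure is not described here.

```javascript
// Hypothetical helper: flatten a Messages-style `content` or `system` value to text.
// Accepts a plain string or an array of { type: "text", text } blocks.
function collectText(content) {
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return "";
  return content
    .map((block) =>
      block && block.type === "text" && typeof block.text === "string" ? block.text : ""
    )
    .join("");
}

// Hypothetical response builder for POST /v1/messages/count_tokens.
// ASSUMPTION: ~4 characters per token; replace with a real tokenizer if one is available.
function countTokensResponse(body) {
  const parts = [collectText(body.system)];
  for (const message of body.messages ?? []) {
    parts.push(collectText(message.content));
  }
  const chars = parts.join("").length;
  return { input_tokens: Math.ceil(chars / 4) };
}
```

An estimate is enough to stop the 404, but clients may use the number for context budgeting, so the approximation should be documented wherever the endpoint is advertised.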
| 79 | +## Codex delegation queue (run when 503 clears) |
| 80 | +- A: external benchmark — cliproxyapi (router-for-me/CLIProxyAPI) + kiro.rs feature/protocol |
| 81 | + matrix (retry the task that 503'd). |
| 82 | +- B: internal protocol-RE inventory (retry the task that 503'd) — confirmed vs stub map. |
| 83 | +- C: implement endpoint(s) from item 5 (start with count_tokens) with tests. |
| 84 | +- D: given lab-box traces Claude supplies, update windsurf.js/proto-trace.js parsers + tests |
| 85 | + per items 1–4. |
| 86 | +Pattern: Codex does bulk reads/writes/tests; Claude supplies trace evidence from the lab box |
| 87 | +and verifies results. Keep credentials out of Codex prompts/logs. |
| 88 | + |
| 89 | +## Verification + release rules (unchanged) |
| 90 | +Per [docs/MAINTAINER_NOTES.md]: focused tests + `npm run test:release` + `npm run secret-scan`, |
| 91 | +full shards on non-trivial blast radius, then commit/tag/push, verify CI/Release, deploy + |
| 92 | +smoke. Production VPS (43.153.139.136) still runs v2.0.142; deploying needs prod creds. |
| 93 | +Never widen native-bridge production defaults from a single lab success. |