Commit c3ba2fa

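feat(adapters): complete whole-turn collapse for gemini_native / anthropic_messages by deciding the terminal phase on the finalization side (MOC-295) (#579)

* feat(adapters): complete whole-turn collapse for gemini_native / anthropic_messages by deciding the terminal phase on the finalization side (MOC-295)

Followup of MOC-293 (PR #577). The chat converter already assigns the message phase authoritatively, because it closes the message only at stream end. The gemini_native and anthropic_messages paths were rolled back: given the order of blocks within the stream, the terminal phase cannot be reliably determined at message close.

This PR fills the gap with option A: the message output_item.done is deferred and emitted on the finalization side, carrying the authoritative phase.

- anthropic_messages: close_text stashes the item in pending_message_done; when emit_terminal / emit_failure flushes it, the phase is decided by final_stop_reason (tool_use → commentary, anything else → final_answer)
- gemini_native: close_message stashes the item in pending_message_done; when emit_completed flushes it, the phase is decided by has_seen_tool_calls (true → commentary, false → final_answer)
- On open (item.added), both paths always send a provisional commentary, consistent with the chat converter
- New test: text_then_tool_use_message_phase_is_commentary verifies that for a text+tool sequence the message done phase = commentary
- Existing tests gain a phase assertion (text_stream → final_answer)
- README (Chinese and English): the collapse description now covers all three paths

* docs(agents): streamline the wrap-up flow with sibling worktree reuse + e2/e3 fixes

e2: drop --delete-branch (the sibling-worktree setup locks main, which makes the merge abort).
e3: the worktree is not deleted; it is reused for the next task (detach → delete branch → open a new branch).
4a: when the new task is unrelated, reuse the current worktree and open a new branch instead of creating a new worktree.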
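A minimal sketch of the deferral described in the commit message, written in Python for illustration only; the adapter source is not shown in this view, so this is not the repository's implementation. The names `pending_message_done`, `close_text`, `emit_terminal`, `final_stop_reason`, and `has_seen_tool_calls` come from the commit message. The class `DeferredMessageConverter`, the helpers `open_text` and `_flush`, the two `phase_from_*` functions, and the simplified event dictionaries are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

COMMENTARY = "commentary"
FINAL_ANSWER = "final_answer"


def phase_from_stop_reason(final_stop_reason: Optional[str]) -> str:
    """anthropic_messages rule: tool_use -> commentary, anything else -> final_answer."""
    return COMMENTARY if final_stop_reason == "tool_use" else FINAL_ANSWER


def phase_from_tool_calls(has_seen_tool_calls: bool) -> str:
    """gemini_native rule: true -> commentary, false -> final_answer."""
    return COMMENTARY if has_seen_tool_calls else FINAL_ANSWER


@dataclass
class DeferredMessageConverter:
    """Hypothetical converter that holds back the message output_item.done."""

    events: list = field(default_factory=list)
    pending_message_done: Optional[dict] = None

    def open_text(self, item_id: str) -> None:
        # output_item.added always carries a provisional commentary phase.
        self.events.append({
            "type": "output_item.added",
            "item": {"id": item_id, "type": "message", "phase": COMMENTARY},
        })

    def close_text(self, item_id: str, text: str) -> None:
        # Stash instead of emitting: the terminal phase is not known yet.
        self.pending_message_done = {"id": item_id, "type": "message", "text": text}

    def emit_terminal(self, final_stop_reason: Optional[str]) -> None:
        # The stop reason is known here, so the phase is authoritative.
        self._flush(phase_from_stop_reason(final_stop_reason))
        self.events.append({"type": "completed"})

    def _flush(self, phase: str) -> None:
        if self.pending_message_done is None:
            return
        item = dict(self.pending_message_done, phase=phase)
        self.events.append({"type": "output_item.done", "item": item})
        self.pending_message_done = None


# Usage: text followed by a tool call ends the turn with stop_reason "tool_use".
conv = DeferredMessageConverter()
conv.open_text("msg_1")
conv.close_text("msg_1", "Let me read the file first.")
conv.emit_terminal("tool_use")
print(conv.events[-2]["item"]["phase"])  # commentary
```

Because the done event is emitted exactly once, after the terminal signal is known, a consumer never sees a phase that later has to be corrected.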
1 parent 21d81f6 commit c3ba2fa

6 files changed

Lines changed: 191 additions & 37 deletions

AGENTS.md

Lines changed: 8 additions & 3 deletions
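@@ -12,7 +12,7 @@
**a. Receive a new task → assess the working environment**
- The main repo `~/alysechen/github/codex-app-transfer/` always keeps `main` checked out and is never used for development; all feature work happens in a sibling worktree (`codex-app-transfer-worktrees/<branch>/`).
- If you are already in a worktree and the task is related to it, keep working in that worktree.
-- If you are already in a worktree but the new task is unrelated to it, create a new worktree for the new task. First check the state of the old worktree (branch, uncommitted changes, status of the associated PR, etc.), and when the new task ends, report the old worktree's details to the user so they can decide what to do with it
+- If you are already in a worktree but the new task is unrelated to it, **reuse the current worktree and open a new branch** (do not create a new worktree): first confirm the old task is wrapped up (branch merged, PR closed, working tree clean), then run `git fetch origin main` to sync, and finally run `git checkout -b <new-branch> origin/main` to open the new branch directly in the current worktree. This avoids the cost of rebuilding a worktree and fits Codex Desktop's single-worktree model better

**b. Task complete → commit + create PR + review**
- After development is done, push to the remote branch, create a PR, and review it. Monitor the PR status in the background (CI checks, review threads, merge state) and handle failures or blockers proactively.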
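@@ -33,8 +33,13 @@
**e. User explicitly declares merge → rebase + full cleanup**
- **e0. (stacked PR only) Decouple the child PR base**: before merging, if an open child PR uses this PR's head branch as its base, you must first run `gh pr edit <child> --base main`. Otherwise, when `gh pr merge --squash --delete-branch` deletes the head branch, GitHub **automatically closes** the child PR (it does not retarget the base; the PR becomes CLOSED and its base ref no longer exists), and recovery takes a 4-step API mutation: recreate the ref → reopen → change the base → delete the ref.
- **e1. Rebase**: first `rebase` onto the target branch. If there are no conflicts, or only a few simple ones, the AI resolves them itself and continues; if there are many conflicts or they involve complex logic / important decisions, it must present a resolution to the user and get confirmation before proceeding. **Never decide the direction of a change on your own**. **Special case**: when the child PR's base has already been squashed into main, the rebase will very likely hit spurious conflicts (a squash merge is not patch-identical to the original commits) → `git rebase --abort` → `git reset --hard origin/main` → cherry-pick the commits unique to that PR; **do not force the rebase through**
-- **e2. Merge + remote silent-delete verify**: after `gh pr merge <PR#> --squash --delete-branch`, you must verify that the remote ref is really gone. **Do not rely on the exit code of a plain `git ls-remote`** (it is 0 whenever the connection succeeds, regardless of whether the ref exists); use `git ls-remote --heads --exit-code origin <branch>` (exit 2 when the ref does not exist) **or** `[ "$(git ls-remote --heads origin <branch> | wc -l)" -eq 0 ]` (0 lines on stdout = ref does not exist). If the ref remains, delete it manually with `git push origin --delete <branch>` (when a worktree locks the local branch, gh also skips the remote delete, a silent failure).
-- **e3. Local cleanup**: `git worktree remove <path>` → `git branch -D <branch>` → `git worktree prune` → clean up build leftovers such as `src-tauri/target/release/bundle/macos/`.
+- **e2. Merge + remote silent-delete verify**: run `gh pr merge <PR#> --squash` (**do not pass `--delete-branch`**: the sibling-worktree setup locks main, so gh's internal `git checkout main` fails → the merge itself errors out and aborts), then verify that the remote ref is really gone. **Do not rely on the exit code of a plain `git ls-remote`** (it is 0 whenever the connection succeeds, regardless of whether the ref exists); use `git ls-remote --heads --exit-code origin <branch>` (exit 2 when the ref does not exist) **or** `[ "$(git ls-remote --heads origin <branch> | wc -l)" -eq 0 ]` (0 lines on stdout = ref does not exist). If the ref remains, delete it manually with `git push origin --delete <branch>`
+- **e3. Local cleanup (sibling-worktree-specific order)**: a worktree managed by Codex Desktop cannot be removed with `git worktree remove` from within the conversation (the directory's lifecycle belongs to Codex Desktop), and a sibling worktree cannot check out `main` (the main repo always locks main). After wrap-up, **reuse the current worktree for the next task** and do not delete the worktree directory. Proceed in this order:
+  1. **Detach inside the worktree to release the branch lock**: `git checkout --detach origin/main` (run in the current worktree directory; after the fetch, origin/main is already the post-merge tip)
+  2. **Delete the local branch from the main repo**: `cd ~/alysechen/github/codex-app-transfer && git branch -D <branch>` (no worktree locks the branch any more, so the delete succeeds)
+  3. **Keep the worktree directory and reuse it for the next task**: the worktree is now on a detached HEAD at `origin/main` with a clean working tree; for the next task, run `git checkout -b <new-branch> origin/main` to open a new branch in the current worktree, with no need to create a new worktree. Do not run `git worktree remove` manually (it conflicts with Codex Desktop's management)
+  4. **Optional prune**: `cd ~/alysechen/github/codex-app-transfer && git worktree prune` (cleans stale metadata of worktrees removed in the past; does not affect the worktree in use)
+  5. **Clean up build leftovers**: `src-tauri/target/release/bundle/macos/` and similar
- **e4. Return to main + sync**: `git checkout main` → `git pull --ff-only origin main`
- **e6. Linked issue + Linear followup update**: run `gh issue view <ISSUE#> --json state,closedByPullRequestsReferences` to verify whether the issue was closed automatically by the PR's `Closes #N`; otherwise close it manually with `gh issue close <N>`. **Linear followups (workspace Mochance / team Mochance / label Improvement) and GitHub issues are two independent systems**: for a Linear issue (MOC-N) implemented by this PR, use `mcp__linear__save_issue` to set `state=Done`, and append the link of the resolving PR at the end of the issue body. (The legacy `docs/followup-tracker.md` scheme was retired as of 2026-05-24; the new workflow no longer writes a local .md. The whole `docs/` directory is gitignored.)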
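The e2 check above ("0 lines on stdout = ref does not exist") can be sketched as a small Python helper, shown here only as an illustration under stated assumptions. The function names `remote_branch_absent` and `check_remote_branch_absent` are hypothetical. The helper compares the ref column exactly against `refs/heads/<branch>`, which is slightly stricter than counting lines, since the pattern passed to `git ls-remote` can also match refs that merely end in the same path components.

```python
import subprocess


def remote_branch_absent(ls_remote_stdout: str, branch: str) -> bool:
    """Return True when no output line names refs/heads/<branch> exactly.

    Each line of `git ls-remote` output has the form "<sha>\t<ref>".
    """
    wanted = f"refs/heads/{branch}"
    for line in ls_remote_stdout.splitlines():
        parts = line.split("\t")
        if len(parts) == 2 and parts[1].strip() == wanted:
            return False
    return True


def check_remote_branch_absent(branch: str, remote: str = "origin") -> bool:
    """Query the remote and decide from stdout, not from the exit code.

    Without --exit-code, a zero exit status only means the remote was reached,
    so check=True raises on failures to query instead of reporting "absent".
    """
    result = subprocess.run(
        ["git", "ls-remote", "--heads", remote, branch],
        capture_output=True,
        text=True,
        check=True,
    )
    return remote_branch_absent(result.stdout, branch)
```

If `check_remote_branch_absent("<branch>")` returns False after the merge, the remote branch is still there and the manual `git push origin --delete <branch>` from step e2 applies.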

README.en.md

Lines changed: 1 addition & 1 deletion
@@ -95,7 +95,7 @@ Stash/restore buttons next to the composer + a "Stash" list below the Usage pane
- Translate Codex App's Responses API streaming / non-streaming requests into upstream protocols: Chat Completions, Gemini Native (`:streamGenerateContent`), Gemini CLI OAuth (Cloud Code Assist), Anthropic Messages (`/v1/messages`), Grok Web (`/rest/app-chat/conversations/new`), Responses passthrough, etc.
- Multi-turn tool conversation context + `previous_response_id` history replay + autocompact expansion + thinking / reasoning_content injection — all aligned with the OpenAI Responses API protocol; remote compact supported on both protocol generations: the legacy `/responses/compact` endpoint plus remote compaction v2 (a regular streaming `/responses` request carrying a `compaction_trigger` marker, answered with an SSE stream containing a single compaction item) — newer Codex builds previously failed autocompact with `expected exactly one compaction output item`, now fixed (MOC-198)
- Reasoning (thinking) blocks display correctly in current Codex Desktop: reasoning streams on the **summary channel** (`reasoning_summary_text.delta` — verified against official gpt-5.5 wire @ v26.623, which is summary-only); MOC-203 originally dual-emitted a content channel for compatibility, but dual emission made Codex update the same thinking block twice per token and flicker during streaming, so it has been consolidated back to the single channel (MOC-293). Chat path also fixes interleaving of reasoning and tool_call stream events (reasoning is closed before opening a new tool item); gemini path fixes tool-call grouping (`functionCall`-following empty text parts no longer produce a blank message item, so same-turn tools fold correctly) (MOC-203)
-- **Automatic turn collapse after task completion (matches official GPT behavior, MOC-293)**: for providers that go through the Chat Completions conversion (e.g. GLM / Kimi / DeepSeek / WorkBuddy), the whole working process of a turn (thinking + tool calls + preamble messages) collapses into a "Worked for Ns" divider once the final answer arrives, leaving only the final reply expanded, same as with an official ChatGPT account. Implementation: the chat conversion path adds the official wire's top-level `phase` field to assistant message items (tool-round preambles → `commentary`, grouped into the collapsed process area; the final answer → `final_answer`, expanded); on this path the message closes only at stream end, so at done all tool calls are already visible and the phase is naturally authoritative; streaming `output_item.added` carries a provisional `commentary`, and the collapse fires exactly once at the true final answer with no mid-task jitter. Note: the final answer streams in the process area and is promoted when it completes (third-party models cannot pre-declare a message channel the way GPT does). **Not yet covered**: the gemini_native (Antigravity) and anthropic_messages (anyrouter) conversion paths close the message before the tool call appears in the stream, so the terminal phase cannot be reliably determined at close time (a wrong guess would collapse-then-expand on every tool round); `phase` is left off there for now (followup); grok_web and Responses passthrough do not go through this conversion and are unaffected
+- **Automatic turn collapse after task completion (matches official GPT behavior, MOC-293 / MOC-295)**: for providers that go through the Chat Completions conversion (e.g. GLM / Kimi / DeepSeek / WorkBuddy), the whole working process of a turn (thinking + tool calls + preamble messages) collapses into a "Worked for Ns" divider once the final answer arrives, leaving only the final reply expanded, same as with an official ChatGPT account. Implementation: all three conversion paths (chat / gemini_native / anthropic_messages) add the official wire's top-level `phase` field to assistant message items (tool-round preambles → `commentary`, grouped into the collapsed process area; the final answer → `final_answer`, expanded); streaming `output_item.added` always carries a provisional `commentary`, and the message `output_item.done` is deferred to the terminal side (once stop_reason / has_seen_tool_calls is known), where it is emitted with the authoritative phase: on the chat path the message closes only at stream end (naturally authoritative), while the anthropic and gemini paths determine the phase at emit_terminal and emit_completed respectively, using the terminal signal (anthropic `final_stop_reason` / gemini `has_seen_tool_calls`), ensuring the collapse fires exactly once at the true final answer with no mid-task jitter. Note: the final answer streams in the process area and is promoted when it completes (third-party models cannot pre-declare a message channel the way GPT does). grok_web and Responses passthrough do not go through this conversion and are unaffected
- Codex App's freeform `apply_patch` tool (edit-file +/- diff UI) works on chat-completions providers: the adapter bridges Responses `custom_tool_call` ↔ chat `function_call` wire forms, the model emits V4A-format patches, Codex App renders the diff (issue #235); Gemini-family providers (gemini_native + Cloud Code Assist / Antigravity, using generateContent) now have the same bridge via MOC-75: on the request side, freeform `custom` tools are downgraded to a function with an `input` string parameter (V4A description reuses the chat constants); on the response side, Gemini's `functionCall` is repacked into a `custom_tool_call` wire
- **apply_patch middle layer (format recovery)**: third-party chat models lack GPT's lark-grammar-constrained generation, so they often emit malformed V4A (double-sided `@@`, missing `+` on Add File lines, byte-mismatched context, missing `*** Begin/End Patch` envelope, dropped blank lines, missing line prefixes, **multiple discontiguous hunks dropping the `@@` separator**, etc.). The middle layer recovers each known error to valid format before sending to Codex — reading the file from disk to align `@@` anchors / context to real bytes, restoring dropped blank lines, converting empty-file / rename-only into `Delete+Add`, **auto-segmenting a multi-region edit that omits `@@` by the hunks' real file positions and inserting bare `@@`** (MOC-263 P0: only when uniquely segmentable; ambiguous floating `+` placement passes through), etc.; **non-destructive** (never loses content or overwrites) and **passes unknown cases through untouched** (let Codex error so the model self-corrects, never guesses). The disk-read cwd is resolved from **the most recent few cwd candidates** rather than a single global value, so the fallback no longer breaks under concurrent multi-session traffic where another project's cwd would clobber it (MOC-263 P1). 
For the common case where the model mis-prefixes the terminator (`+*** End Patch`, a slip when `+`-prefixing every Add File line) and leaves a stray `*** End Patch` line in the created file: the root fix is in the injected guidance (teach the model not to prefix the terminator); the middle layer, on encountering such a prefixed terminator, disambiguates by file type — for code / structured-config files (where a bare `*** End Patch` line can never be valid source) it strips the prefix to a bare terminator; for doc / text / unknown types, where that line could legitimately be real last-line content, it does NOT guess (neither strips, which could delete content, nor appends, which would leave residue) and leaves the patch incomplete so the model re-issues per the guidance. The chat-path apply_patch guidance is optimized alongside (context lines must already exist / don't re-delete already-removed lines in sequential edits / dedicated guidance for memory files) (MOC-268). It mirrors the V4A lark grammar Codex constrains GPT with, enforced post-hoc on the chat path (credit to [openai/codex](https://github.com/openai/codex)'s apply_patch lark grammar) (MOC-194 / MOC-263 / MOC-268)
- **Native image generation on Antigravity (MOC-210)**: Codex's built-in `image_gen` tool now actually generates images on the Antigravity provider (native, not the CLI fallback). When the model calls `image_gen` mid-conversation, the proxy intercepts the call in the Gemini response stream, issues an image sub-request (defaults to `gemini-3.1-flash-image`, overridable via the `gpt-image-1` model slot in the provider config), and inlines the returned image as an `image_generation_call` back to Codex for rendering; text/reasoning still stream live, and the image turn is recorded in history to avoid duplicate generations. The tool is exposed only to Antigravity (other providers have no image backend).
