README.en.md: 1 addition & 1 deletion
@@ -95,7 +95,7 @@ Stash/restore buttons next to the composer + a "Stash" list below the Usage pane
- Translate Codex App's Responses API streaming / non-streaming requests into upstream protocols: Chat Completions, Gemini Native (`:streamGenerateContent`), Gemini CLI OAuth (Cloud Code Assist), Anthropic Messages (`/v1/messages`), Grok Web (`/rest/app-chat/conversations/new`), Responses passthrough, etc.
- Multi-turn tool conversation context + `previous_response_id` history replay + autocompact expansion + thinking / reasoning_content injection — all aligned with the OpenAI Responses API protocol; remote compact supported on both protocol generations: the legacy `/responses/compact` endpoint plus remote compaction v2 (a regular streaming `/responses` request carrying a `compaction_trigger` marker, answered with an SSE stream containing a single compaction item) — newer Codex builds previously failed autocompact with `expected exactly one compaction output item`, now fixed (MOC-198)
- Reasoning (thinking) blocks display correctly in current Codex Desktop: reasoning streams on the **summary channel** (`reasoning_summary_text.delta` — verified against official gpt-5.5 wire @ v26.623, which is summary-only); MOC-203 originally dual-emitted a content channel for compatibility, but dual emission made Codex update the same thinking block twice per token and flicker during streaming, so it has been consolidated back to the single channel (MOC-293). Chat path also fixes interleaving of reasoning and tool_call stream events (reasoning is closed before opening a new tool item); gemini path fixes tool-call grouping (`functionCall`-following empty text parts no longer produce a blank message item, so same-turn tools fold correctly) (MOC-203)
-
- **Automatic turn collapse after task completion (matches official GPT behavior, MOC-293)**: for providers that go through the Chat Completions conversion (e.g. GLM / Kimi / DeepSeek / WorkBuddy), the whole working process of a turn (thinking + tool calls + preamble messages) collapses into a "Worked for Ns" divider once the final answer arrives, leaving only the final reply expanded — same as with an official ChatGPT account. Implementation: the chat conversion path adds the official wire's top-level `phase` field to assistant message items (tool-round preambles → `commentary`, grouped into the collapsed process area; the final answer → `final_answer`, expanded); on this path the message closes only at stream end, so at done all tool calls are already visible and the phase is naturally authoritative — streaming `output_item.added` carries a provisional `commentary`, and the collapse fires exactly once at the true final answer with no mid-task jitter. Note: the final answer streams in the process area and is promoted when it completes (third-party models cannot pre-declare a message channel the way GPT does). **Not yet covered**: the gemini_native (Antigravity) and anthropic_messages (anyrouter) conversion paths close the message before the tool call appears in the stream, so the terminal phase cannot be reliably determined at close time (a wrong guess would collapse-then-expand on every tool round); `phase` is left off there for now (followup); grok_web and Responses passthrough do not go through this conversion and are unaffected
+
- **Automatic turn collapse after task completion (matches official GPT behavior, MOC-293 / MOC-295)**: for providers that go through the Chat Completions conversion (e.g. GLM / Kimi / DeepSeek / WorkBuddy), the whole working process of a turn (thinking + tool calls + preamble messages) collapses into a "Worked for Ns" divider once the final answer arrives, leaving only the final reply expanded — same as with an official ChatGPT account. Implementation: all three conversion paths (chat / gemini_native / anthropic_messages) add the official wire's top-level `phase` field to assistant message items (tool-round preambles → `commentary`, grouped into the collapsed process area; the final answer → `final_answer`, expanded); streaming `output_item.added` always carries a provisional `commentary`, and the message `output_item.done` is deferred to the terminal side (once stop_reason / has_seen_tool_calls is known) where it emits with the authoritative phase — on the chat path the message closes only at stream end (naturally authoritative), while gemini / anthropic paths determine phase at emit_terminal / emit_completed using the terminal signal (anthropic `final_stop_reason` / gemini `has_seen_tool_calls`), ensuring the collapse fires exactly once at the true final answer with no mid-task jitter. Note: the final answer streams in the process area and is promoted when it completes (third-party models cannot pre-declare a message channel the way GPT does). grok_web and Responses passthrough do not go through this conversion and are unaffected
- Codex App's freeform `apply_patch` tool (edit-file +/- diff UI) works on chat-completions providers: the adapter bridges Responses `custom_tool_call` ↔ chat `function_call` wire forms, the model emits V4A-format patches, Codex App renders the diff (issue #235); Gemini-family providers (gemini_native + Cloud Code Assist / Antigravity, using generateContent) now have the same bridge via MOC-75: on the request side, freeform `custom` tools are downgraded to a function with an `input` string parameter (V4A description reuses the chat constants); on the response side, Gemini's `functionCall` is repacked into a `custom_tool_call` wire
- **apply_patch middle layer (format recovery)**: third-party chat models lack GPT's lark-grammar-constrained generation, so they often emit malformed V4A (double-sided `@@`, missing `+` on Add File lines, byte-mismatched context, missing `*** Begin/End Patch` envelope, dropped blank lines, missing line prefixes, **multiple discontiguous hunks dropping the `@@` separator**, etc.). The middle layer recovers each known error to valid format before sending to Codex — reading the file from disk to align `@@` anchors / context to real bytes, restoring dropped blank lines, converting empty-file / rename-only into `Delete+Add`, **auto-segmenting a multi-region edit that omits `@@` by the hunks' real file positions and inserting bare `@@`** (MOC-263 P0: only when uniquely segmentable; ambiguous floating `+` placement passes through), etc.; **non-destructive** (never loses content or overwrites) and **passes unknown cases through untouched** (let Codex error so the model self-corrects, never guesses). The disk-read cwd is resolved from **the most recent few cwd candidates** rather than a single global value, so the fallback no longer breaks under concurrent multi-session traffic where another project's cwd would clobber it (MOC-263 P1). 
For the common case where the model mis-prefixes the terminator (`+*** End Patch`, a slip when `+`-prefixing every Add File line) and leaves a stray `*** End Patch` line in the created file: the root fix is in the injected guidance (teach the model not to prefix the terminator); the middle layer, on encountering such a prefixed terminator, disambiguates by file type — for code / structured-config files (where a bare `*** End Patch` line can never be valid source) it strips the prefix to a bare terminator; for doc / text / unknown types, where that line could legitimately be real last-line content, it does NOT guess (neither strips, which could delete content, nor appends, which would leave residue) and leaves the patch incomplete so the model re-issues per the guidance. The chat-path apply_patch guidance is optimized alongside (context lines must already exist / don't re-delete already-removed lines in sequential edits / dedicated guidance for memory files) (MOC-268). It mirrors the V4A lark grammar Codex constrains GPT with, enforced post-hoc on the chat path (credit to [openai/codex](https://github.com/openai/codex)'s apply_patch lark grammar) (MOC-194 / MOC-263 / MOC-268)
- **Native image generation on Antigravity (MOC-210)**: Codex's built-in `image_gen` tool now actually generates images on the Antigravity provider (native, not the CLI fallback). When the model calls `image_gen` mid-conversation, the proxy intercepts the call in the Gemini response stream, issues an image sub-request (defaults to `gemini-3.1-flash-image`, overridable via the `gpt-image-1` model slot in the provider config), and inlines the returned image as an `image_generation_call` back to Codex for rendering; text/reasoning still stream live, and the image turn is recorded in history to avoid duplicate generations. The tool is exposed only to Antigravity (other providers have no image backend).
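
The deferred phase decision described in the turn-collapse entry (MOC-293 / MOC-295) can be sketched roughly as follows. This is a minimal Python illustration, not the project's actual code: the class `MessageItemStream`, the helper `resolve_phase`, and the simplified event dictionaries are hypothetical. The sketch also assumes that a stop reason of `"tool_use"` or a previously seen tool call means the message was a tool-round preamble.

```python
from dataclasses import dataclass, field
from typing import Optional

COMMENTARY = "commentary"      # grouped into the collapsed process area
FINAL_ANSWER = "final_answer"  # stays expanded after the turn collapses


def resolve_phase(has_seen_tool_calls: bool, stop_reason: Optional[str] = None) -> str:
    """Pick the authoritative phase once the terminal signal is known.

    Assumption: a message followed by tool calls is a preamble; anything
    else is treated as the final answer.
    """
    if has_seen_tool_calls or stop_reason == "tool_use":
        return COMMENTARY
    return FINAL_ANSWER


@dataclass
class MessageItemStream:
    """Hypothetical holder for one assistant message item in a stream."""
    item_id: str
    text: str = ""
    has_seen_tool_calls: bool = False
    events: list = field(default_factory=list)

    def open(self) -> None:
        # The added event always carries a provisional `commentary` phase.
        self.events.append({
            "type": "response.output_item.added",
            "item": {"id": self.item_id, "type": "message", "phase": COMMENTARY},
        })

    def on_text(self, delta: str) -> None:
        self.text += delta

    def on_tool_call(self) -> None:
        self.has_seen_tool_calls = True

    def emit_terminal(self, stop_reason: Optional[str] = None) -> str:
        # The done event is deferred to here so it is emitted exactly once,
        # with the phase derived from the terminal signal.
        phase = resolve_phase(self.has_seen_tool_calls, stop_reason)
        self.events.append({
            "type": "response.output_item.done",
            "item": {"id": self.item_id, "type": "message",
                     "phase": phase, "text": self.text},
        })
        return phase


preamble = MessageItemStream("msg_1")
preamble.open()
preamble.on_text("Let me check the file first.")
preamble.on_tool_call()
preamble_phase = preamble.emit_terminal("tool_use")

answer = MessageItemStream("msg_2")
answer.open()
answer.on_text("All tests pass now.")
answer_phase = answer.emit_terminal("end_turn")
```

Because the done event is emitted only after the terminal signal arrives, a wrong early guess cannot cause a collapse-then-expand on every tool round; the cost is that the final answer carries the provisional `commentary` phase until it completes.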