Skip to content

feat(acp): Setup acp with gemini agent to replace API model#56

Closed
CSRessel wants to merge 40 commits intofork/upstream-mainfrom
fork/acp-with-gemini
Closed

feat(acp): Setup acp with gemini agent to replace API model#56
CSRessel wants to merge 40 commits intofork/upstream-mainfrom
fork/acp-with-gemini

Conversation

@CSRessel
Copy link
Copy Markdown
Collaborator

0th in stack

CSRessel and others added 30 commits November 18, 2025 15:30
Add two blackbox snapshot tests to codex-tui that verify UI rendering:
- blackbox_typing_snapshot: Tests that text appears in composer when set
- blackbox_model_picker_snapshot: Tests that /model popup renders correctly

These tests treat the ChatWidget as a black box, testing the rendering
pipeline by setting state through public interfaces and verifying the
visual output via insta snapshots.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add AcpModelClient that implements end-to-end streaming from ACP agents:
- Spawn agent subprocess and initialize with capabilities
- Create session and send prompts
- Stream AgentMessageChunk notifications as TextDelta events
- Map prompt response to Completed event

This thin slice proves the architecture works and provides foundation
for subsequent phases (bidirectional transport, client handlers,
session management, complete event mapping, error handling).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Adds the gemini-acp provider to the built-in model providers list to fix
the "Model provider `gemini-acp` not found" error on TUI startup. This
provider entry is needed for the ACP thin slice implementation to work
with the Gemini CLI agent.
Expose Gemini ACP model in the TUI model selection list and implement logic to
switch to the correct provider when a Gemini model is selected.
Add comprehensive TUI integration testing infrastructure:
- New tui-integration-tests crate with PTY session management
- Test coverage for startup, prompt flow, input handling, and cancellation
- Enhanced mock-acp-agent with configurable responses and delays
- Support for simulating realistic streaming behavior via env vars

The testing framework uses portable-pty and vt100 parser to drive
the TUI programmatically and verify screen contents, enabling
reliable end-to-end testing of terminal interactions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Crossterm queries cursor position during initialization via ESC[6n control
sequence, but the PTY emulator doesn't automatically respond. This caused
tests to timeout waiting for the response.

Added intercept_control_sequences() to detect cursor position queries in
the PTY output stream and write back ESC[1;1R responses, enabling the TUI
to initialize properly in the test environment.

Updated test expectations to match the initial welcome screen that now
appears, and added snapshot for regression detection.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…imeout hangs

Previously, tests using wait_for_text() would hang indefinitely when the
expected text never appeared, because the PTY reader was in blocking mode.
This caused read() to block forever waiting for data, preventing the timeout
mechanism from executing.

Changes:
- Set PTY master to non-blocking mode using fcntl(O_NONBLOCK) on Unix
- Add helper function set_nonblocking() using libc for low-level fcntl ops
- read() now returns WouldBlock immediately when no data is available
- Timeout checks in wait_for() can now execute, failing tests cleanly after 5s
- Add test_poll_does_not_block_when_no_data to verify non-blocking behavior
- Convert debug logging to conditional (DEBUG_TUI_PTY env var)
- Uncomment and fix previously disabled tests in startup.rs
- Update documentation with PTY non-blocking details and debugging info

Dependencies added (Unix only):
- nix = "0.27" (features: fs)
- libc = "0.2"

The fix ensures test timeouts work correctly, with tests failing cleanly
rather than hanging indefinitely when conditions are not met.
🤖 Generated with [Nori](https://nori.ai)

Co-Authored-By: Nori <noreply@tilework.tech>
…ession testing

Adds snapshot assertions to 4 startup tests to capture final screen states:
- startup_shows_welcome: Welcome screen with logo and /tmp/ path
- startup_welcome_dimensions_40x120: Welcome screen at 40x120 dimensions
- runs_in_temp_directory: Verification of /tmp/ directory usage
- trust_screen_skipped: Main prompt without trust screen

Excludes test_poll_does_not_block_when_no_data as it tests timing behavior, not UI.
🤖 Generated with [Nori](https://nori.ai)

Co-Authored-By: Nori <noreply@tilework.tech>
…overage

Move normalize_for_snapshot() and TIMEOUT constant from test modules to library exports
for consistent reuse across all tests. Consolidate streaming/cancellation tests into a
unified streaming.rs module.

Add Drop trait implementation to TuiSession that prints screen state on panic, improving
debugging of PTY timing issues. Expand snapshot testing coverage to input handling
scenarios (ctrl-c, typing, arrow navigation).

Add 100ms sleep delays after PTY input operations in tests to prevent race conditions
between sending input and TUI processing. Update documentation to reflect new exported
utilities, debugging aids, and PTY timing patterns.
…communication

Adds full support for Agent Client Protocol (ACP) as a wire API option alongside
existing Chat and Responses APIs. ACP enables subprocess-based agent communication
over JSON-RPC 2.0 via stdin/stdout instead of HTTP requests.

Core changes:
- Add ACP agent registry (acp/src/registry.rs) mapping provider names to binary commands
- Implement ModelClient::stream() for WireApi::Acp with event mapping to ResponseEvent
- Add "mock-acp" and "gemini-acp" model families for proper model resolution
- Configure mock-acp and gemini-acp providers with WireApi::Acp
- Add comprehensive unit tests (8/8 passing) covering providers, registry, and model families

Implementation follows existing Chat API patterns with subprocess spawning instead of HTTP.
Event mapping: AcpEvent::TextDelta → ResponseEvent::OutputTextDelta, etc.

Test coverage:
- Unit tests: Provider configuration, model families, registry integration
- Integration test: ModelClient streaming with mock-acp-agent binary
- TUI test: Updated to use "mock-acp" model name (model resolution pending)

Note: TUI integration test still needs model-name-to-provider-ID resolution work,
tracked separately as it's a config loading concern rather than core ACP implementation.
🤖 Generated with [Nori](https://nori.ai)

Co-Authored-By: Nori <noreply@tilework.tech>
Add init_file_tracing() function to enable structured file logging for the ACP package.

- Accepts custom log file path parameter
- Filters at DEBUG level and above (TRACE excluded)
- Uses non-blocking file appender for async-safe writes
- Returns error on re-initialization (global subscriber constraint)
- Includes comprehensive integration test
- Updates package documentation
🤖 Generated with [Nori](https://nori.ai)

Co-Authored-By: Nori <noreply@tilework.tech>
Add automatic initialization of ACP file tracing when the CLI starts.
Logs are written to .codex-acp.log in the current working directory.

- Add codex-acp dependency to cli/Cargo.toml
- Call init_file_tracing() in cli_main() after CLI argument parsing
- Log warning (non-fatal) if tracing initialization fails
- Update acp/docs.md to document automatic CLI initialization

This ensures ACP tracing is always available when using the codex CLI,
without requiring manual initialization by library users.
🤖 Generated with [Nori](https://nori.ai)

Co-Authored-By: Nori <noreply@tilework.tech>
…rary

Replaces custom protocol.rs and transport.rs implementations with the
official agent-client-protocol v0.7 library, eliminating ~250 lines of
duplicative JSON-RPC code.

Key changes:
- Implement Client trait in new client_handler.rs for permission/session callbacks
- Refactor AgentProcess to use ClientSideConnection instead of StdioTransport
- Add tokio-util compat layer for AsyncRead/Write trait compatibility
- Use LocalSet pattern throughout for handling !Send futures from the library
- Replace custom JsonRpcRequest/Response with typed library requests
- Update all tests to wrap in LocalSet contexts
- Fix clippy warnings (unused variables, unwrap usage, error formatting)

Benefits:
- Type-safe protocol communication with compile-time validation
- Spec-compliant implementation maintained by library authors
- Reduced maintenance burden and code complexity
- Better error handling with library-defined error types

Breaking changes:
- AgentProcess API updated with typed new_session() and prompt() methods
- ClientEvent enum introduced for permission requests and session updates
- Public API now exports library types instead of custom types
🤖 Generated with [Nori](https://nori.ai)

Co-Authored-By: Nori <noreply@tilework.tech>
Add comprehensive JSON request/response logging for ACP protocol debugging:
- Log initialize and new_session requests/responses with pretty-printed JSON
- Change protocol version from V1 to V0 for Gemini compatibility
- Fix client capabilities to use camelCase format with meta fields
- Add retry configuration (2 retries) for ACP model providers
- Increase log tail from 50 to 150 lines for better error diagnostics

Switch integration tests to use mock-acp instead of gemini-acp for faster
iteration during development.

Note: Includes temporary debugging code (hardcoded mock-acp provider in
client.rs) that should be reverted before merging.
## Summary
- Adds PTY-based integration test framework for full binary TUI testing
- Implements visual regression testing using insta snapshots
- Provides test coverage for startup, input handling, prompt flow, and
cancellation flows
- Includes comprehensive documentation and Claude skills for PTY testing
patterns

## Key Features
- **PTY Test Harness**: Custom test framework that spawns the actual
binary in a pseudo-terminal for realistic testing
- **Visual Regression Tests**: Insta snapshot tests capture terminal
output for detecting UI regressions
- **Test Coverage**: Startup flows, input handling, prompt interactions,
and cancellation scenarios
- **Developer Tools**: PTY testing skill documentation to guide future
test development

## Technical Details
- Intercepts cursor position queries to prevent test hangs
- Non-blocking PTY reader implementation for reliable test execution
- Normalizes screen snapshots to handle variable content
- Supports multiple terminal dimensions for responsive UI testing

## Test Plan
- [x] All existing tests pass
- [x] New PTY integration tests pass with snapshots
- [x] Tests run reliably without timeouts or hangs
- [x] Documentation accurately describes the test framework
🤖 Generated with [Nori](https://nori.ai)\n\nCo-Authored-By: Nori <noreply@tilework.tech>

Change the ACP registry to use model names for lookups instead of provider names. This separates the model identifier (what the user specifies) from the provider identifier (which agent subprocess to use).

The provider field in AcpAgentConfig now uniquely identifies the agent subprocess, enabling future optimization to determine when an existing subprocess can be reused vs when a new one must be spawned when switching models.

Changes:
- Add provider field to AcpAgentConfig struct
- Change get_agent_config() to accept model_name instead of provider_name
- Update registry to map exact model names (e.g., "mock-model", "gemini-flash-2.0") to agent configurations
- Fix client.rs WireApi::Acp to pass self.config.model to registry
- Update all tests to use model names and verify provider field
- Update documentation to reflect model-centric architecture

The registry uses exact matching with case-insensitive normalization. Each model must be explicitly registered.
@CSRessel CSRessel marked this pull request as ready for review November 24, 2025 18:32
@CSRessel
Copy link
Copy Markdown
Collaborator Author

Got the first ACP call to Gemini working this morning, going to close the PR, rename the work to dev branch (as default+protected branch), and then carve out this messily coded ACP implementation for a cleaner replacement

image

@CSRessel
Copy link
Copy Markdown
Collaborator Author

Branched to new default branch, dev

@CSRessel CSRessel closed this Nov 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants