Skip to content

Commit 5dc1e97

Browse files
CSResselnori-agent
andcommitted
fix(acp): Prevent race condition during agent switching
When switching agents, events from the OLD agent could leak into the NEW widget before SessionConfigured was received, causing the UI to show responses from the wrong agent. Fix: - Add expected_model field to ChatWidgetInit to specify expected model - Add session_configured_received flag to ChatWidget - Filter events in handle_codex_event() until SessionConfigured arrives with matching model name Also: - Add E2E tests for agent switch message flow - Remove unused ClearPendingAgent event and has_pending_agent method - Fix clippy warnings for dead code 🤖 Generated with [Nori](https://nori.ai) Co-Authored-By: Nori <contact@tilework.tech>
1 parent 9646bf0 commit 5dc1e97

9 files changed

Lines changed: 344 additions & 40 deletions

File tree

codex-rs/tui-pty-e2e/docs.md

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -145,7 +145,7 @@ This delay allows the PTY subprocess time to process input and update the displa
145145
| `@/codex-rs/tui-pty-e2e/tests/input_handling.rs` | Text editing, backspace, Ctrl-C clearing, arrow key navigation with snapshot testing |
146146
| `@/codex-rs/tui-pty-e2e/tests/streaming.rs` | Prompt submission with timing delays, agent response streaming |
147147
| `@/codex-rs/tui-pty-e2e/tests/acp_mode.rs` | ACP mode startup, response flow, and approval bridging - validates TUI works with ACP wire API and mock agent; includes test for permission request display |
148-
| `@/codex-rs/tui-pty-e2e/tests/agent_switching.rs` | ACP agent subprocess lifecycle - verifies subprocess spawning, cleanup on session switch, and different agents use different processes (Linux only) |
148+
| `@/codex-rs/tui-pty-e2e/tests/agent_switching.rs` | ACP agent subprocess lifecycle and event isolation - verifies subprocess spawning, cleanup on session switch, different agents use different processes, and event filtering prevents cross-agent contamination (Linux only) |
149149
| `@/codex-rs/tui-pty-e2e/tests/live_acp.rs` | Live authenticated ACP tests for Gemini and Claude with real API connections (opt-in, marked `#[ignore]`) |
150150

151151
**Snapshot Files:**
@@ -237,7 +237,9 @@ See `@/codex-rs/mock-acp-agent/docs.md` for full list of env vars.
237237

238238
**Agent Subprocess Lifecycle Testing (`agent_switching.rs`):**
239239

240-
Linux-only tests that verify ACP subprocess lifecycle management by parsing the `.codex-acp.log` file for PID entries:
240+
Linux-only tests that verify ACP subprocess lifecycle management and event isolation:
241+
242+
*Subprocess Management Tests:*
241243
- `acp_log_path()` method on `TuiSession` returns the path to the ACP tracing log file
242244
- Tests extract PIDs from log lines matching `"ACP agent spawned (pid: Some(...))"`
243245
- Uses `/proc/{pid}` filesystem to verify process existence and zombie state
@@ -248,6 +250,11 @@ Linux-only tests that verify ACP subprocess lifecycle management by parsing the
248250
- Cleanup happens when session switches, not when individual prompt turns end
249251
- Different models (`mock-model` vs `mock-model-alt`) spawn different subprocesses
250252

253+
*Event Isolation Tests:*
254+
- `extract_agent_messages_from_log()` helper parses `Mock agent:` log entries from ACP log file
255+
- `test_agent_switch_message_flow_mock_to_mock_alt` verifies that after switching agents, the NEW agent receives and responds to prompts (catches race conditions where OLD agent events could leak)
256+
- `test_agent_switch_logs_correct_sequence` verifies the expected log sequence during agent switch: agent receives prompt, logs receipt, sends response
257+
251258
**Binary Discovery:**
252259

253260
`codex_binary_path()` locates the compiled binary:

codex-rs/tui-pty-e2e/tests/agent_switching.rs

Lines changed: 241 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -707,3 +707,244 @@ fn test_agent_cleanup_after_switch_on_prompt() {
707707
initial_pid
708708
);
709709
}
710+
711+
// ============================================================================
712+
// Test: Agent Switch Message Flow - Verifies NEW agent receives and responds
713+
// ============================================================================
714+
715+
/// Helper to extract agent messages from log file
716+
/// Each mock agent logs to stderr which is captured in the ACP log
717+
fn extract_agent_messages_from_log(log_path: &std::path::Path) -> Vec<String> {
718+
std::fs::read_to_string(log_path)
719+
.unwrap_or_default()
720+
.lines()
721+
.filter(|line| {
722+
line.contains("Mock agent:")
723+
|| line.contains("cancel")
724+
|| line.contains("shutdown")
725+
|| line.contains("prompt")
726+
})
727+
.map(|s| s.to_string())
728+
.collect()
729+
}
730+
731+
/// Test that when switching agents via /agent command, the NEW agent
732+
/// correctly receives and responds to the submitted prompt.
733+
///
734+
/// This test explicitly verifies the message flow:
735+
/// 1. OLD agent should receive a cancel/shutdown signal
736+
/// 2. NEW agent should receive a new_session request
737+
/// 3. NEW agent should receive the prompt and respond
738+
/// 4. Response from NEW agent appears on screen
739+
///
740+
/// This catches the race condition bug where events from the OLD agent
741+
/// could leak into the NEW widget, causing the prompt to be lost.
742+
#[test]
743+
#[cfg(target_os = "linux")]
744+
fn test_agent_switch_message_flow_mock_to_mock_alt() {
745+
// Use default response (Test message 1/2) - both agents will use this
746+
let config = SessionConfig::new().with_model("mock-model".to_string());
747+
748+
let mut session = TuiSession::spawn_with_config(24, 80, config).expect("Failed to spawn TUI");
749+
750+
session
751+
.wait_for_text("›", TIMEOUT)
752+
.expect("TUI should start");
753+
std::thread::sleep(TIMEOUT_INPUT);
754+
755+
let log_path = session.acp_log_path().expect("Should have log path");
756+
757+
// First, verify initial agent works - send a prompt
758+
session.send_str("test initial").unwrap();
759+
std::thread::sleep(TIMEOUT_INPUT);
760+
session.send_key(Key::Enter).unwrap();
761+
762+
// Wait for initial agent response (default response)
763+
session
764+
.wait_for_text("Test message", Duration::from_secs(5))
765+
.expect("Initial agent should respond");
766+
767+
// Log messages before switch
768+
let msgs_before_switch = extract_agent_messages_from_log(&log_path);
769+
eprintln!("Messages before switch: {:?}", msgs_before_switch);
770+
771+
// Open agent picker with /agent command
772+
session.send_str("/agent").unwrap();
773+
std::thread::sleep(TIMEOUT_INPUT);
774+
session.send_key(Key::Enter).unwrap();
775+
std::thread::sleep(TIMEOUT_INPUT);
776+
777+
// Wait for agent picker to appear
778+
session
779+
.wait_for(
780+
|screen| screen.contains("Select Agent") || screen.contains("mock-model"),
781+
Duration::from_secs(3),
782+
)
783+
.expect("Agent picker should appear");
784+
785+
// Select mock-model-alt (different agent)
786+
session.send_key(Key::Down).unwrap();
787+
std::thread::sleep(TIMEOUT_INPUT);
788+
session.send_key(Key::Enter).unwrap();
789+
std::thread::sleep(Duration::from_millis(500));
790+
791+
// Messages after selection (but before prompt submission)
792+
let msgs_after_selection = extract_agent_messages_from_log(&log_path);
793+
eprintln!("Messages after selection: {:?}", msgs_after_selection);
794+
795+
// Now submit a prompt - this should trigger the actual agent switch
796+
// The NEW agent (mock-model-alt) should receive this prompt and respond
797+
session.send_str("test after switch").unwrap();
798+
std::thread::sleep(TIMEOUT_INPUT);
799+
session.send_key(Key::Enter).unwrap();
800+
801+
// Wait for the NEW agent's response to appear.
802+
// The key verification: we should see TWO instances of "Test message" -
803+
// one from the first prompt, and one from the second prompt after switch.
804+
// If the switch fails, the second response won't appear.
805+
std::thread::sleep(Duration::from_secs(3)); // Give time for response
806+
807+
// Log messages after prompt submission
808+
let msgs_after_prompt = extract_agent_messages_from_log(&log_path);
809+
eprintln!("Messages after prompt submission: {:?}", msgs_after_prompt);
810+
811+
// Verify we got two prompt calls (one before switch, one after)
812+
let prompt_count = msgs_after_prompt
813+
.iter()
814+
.filter(|m| m.contains("Mock agent: prompt"))
815+
.count();
816+
817+
if prompt_count < 2 {
818+
let screen = session.screen_contents();
819+
panic!(
820+
"Expected 2 prompt calls (before and after switch), got {}.\n\
821+
Screen contents: {}\n\
822+
Agent messages in log: {:?}",
823+
prompt_count, screen, msgs_after_prompt
824+
);
825+
}
826+
827+
// Verify message flow in logs:
828+
// 1. Should see "Mock agent: new_session" for the NEW agent
829+
// 2. Should see "Mock agent: prompt" for the NEW agent
830+
let has_new_session = msgs_after_prompt
831+
.iter()
832+
.filter(|m| m.contains("new_session"))
833+
.count()
834+
>= 2; // Initial + after switch
835+
836+
assert!(
837+
has_new_session,
838+
"Should have new_session calls for both agents, messages: {:?}",
839+
msgs_after_prompt
840+
);
841+
assert!(
842+
prompt_count >= 2,
843+
"Should have prompt calls for both agents, messages: {:?}",
844+
msgs_after_prompt
845+
);
846+
847+
// Final verification: the screen should show response content
848+
let screen = session.screen_contents();
849+
assert!(
850+
screen.contains("Test message"),
851+
"Screen should contain response text. Screen:\n{}",
852+
screen
853+
);
854+
}
855+
856+
/// Test that verifies the expected sequence of operations when switching agents
857+
/// This is a more focused test that checks specific message ordering
858+
#[test]
859+
#[cfg(target_os = "linux")]
860+
fn test_agent_switch_logs_correct_sequence() {
861+
let config = SessionConfig::new().with_model("mock-model".to_string());
862+
863+
let mut session = TuiSession::spawn_with_config(24, 80, config).expect("Failed to spawn TUI");
864+
865+
session
866+
.wait_for_text("›", TIMEOUT)
867+
.expect("TUI should start");
868+
std::thread::sleep(TIMEOUT_INPUT);
869+
870+
let log_path = session.acp_log_path().expect("Should have log path");
871+
let initial_pids = extract_mock_agent_pids_from_log(&log_path);
872+
assert!(!initial_pids.is_empty(), "Should have initial PID");
873+
874+
// Select new agent via /agent
875+
session.send_str("/agent").unwrap();
876+
std::thread::sleep(TIMEOUT_INPUT);
877+
session.send_key(Key::Enter).unwrap();
878+
std::thread::sleep(TIMEOUT_INPUT);
879+
880+
session
881+
.wait_for(
882+
|screen| screen.contains("Select Agent"),
883+
Duration::from_secs(3),
884+
)
885+
.expect("Agent picker should appear");
886+
887+
session.send_key(Key::Down).unwrap();
888+
std::thread::sleep(TIMEOUT_INPUT);
889+
session.send_key(Key::Enter).unwrap();
890+
std::thread::sleep(Duration::from_millis(300));
891+
892+
// Submit prompt to trigger switch
893+
session.send_str("trigger").unwrap();
894+
std::thread::sleep(TIMEOUT_INPUT);
895+
session.send_key(Key::Enter).unwrap();
896+
897+
// Wait for response
898+
session
899+
.wait_for_text("Test message", Duration::from_secs(10))
900+
.expect("Should see response from new agent");
901+
902+
// Parse the log to verify sequence
903+
let log_content = std::fs::read_to_string(&log_path).unwrap_or_default();
904+
905+
// Count agent spawns - should be 2 (initial + after switch)
906+
let spawn_count = log_content
907+
.lines()
908+
.filter(|l| l.contains("ACP agent spawned"))
909+
.count();
910+
911+
assert!(
912+
spawn_count >= 2,
913+
"Should spawn at least 2 agents (initial + after switch), got: {}. Log:\n{}",
914+
spawn_count,
915+
log_content
916+
);
917+
918+
// Verify new_session and prompt sequence
919+
let agent_messages: Vec<&str> = log_content
920+
.lines()
921+
.filter(|l| l.contains("Mock agent:"))
922+
.collect();
923+
924+
eprintln!("Agent message sequence:");
925+
for (i, msg) in agent_messages.iter().enumerate() {
926+
eprintln!(" {}: {}", i, msg);
927+
}
928+
929+
// Should have: initialize, new_session, prompt (first agent)
930+
// Then: initialize, new_session, prompt (second agent)
931+
let new_session_count = agent_messages
932+
.iter()
933+
.filter(|m| m.contains("new_session"))
934+
.count();
935+
let prompt_count = agent_messages
936+
.iter()
937+
.filter(|m| m.contains("prompt"))
938+
.count();
939+
940+
assert!(
941+
new_session_count >= 2,
942+
"Should have at least 2 new_session calls, got: {}",
943+
new_session_count
944+
);
945+
assert!(
946+
prompt_count >= 1,
947+
"Should have at least 1 prompt call, got: {}",
948+
prompt_count
949+
);
950+
}

codex-rs/tui/docs.md

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -122,6 +122,19 @@ Most event types (exec begin/end, MCP calls, elicitation) are queued during acti
122122
- The `InterruptManager` still contains `ExecApproval` and `ApplyPatchApproval` variants for completeness, but these methods are marked `#[allow(dead_code)]`
123123
- `on_task_complete()` calls `flush_interrupt_queue()` for any remaining queued items
124124

125+
**Agent Switch Event Filtering:**
126+
127+
When switching between ACP agents (e.g., via `/agent` command), `ChatWidget` uses an event filtering mechanism to prevent race conditions:
128+
129+
- `expected_model: Option<String>` in `ChatWidgetInit` specifies which model the widget expects
130+
- `session_configured_received: bool` tracks whether `SessionConfigured` has arrived from the expected model
131+
- When `expected_model` is set, `handle_codex_event()` filters events:
132+
- All events are ignored until `SessionConfigured` arrives
133+
- `SessionConfigured` is only accepted if `event.model` matches `expected_model` (case-insensitive)
134+
- Once matching `SessionConfigured` arrives, `session_configured_received` is set to `true` and normal event processing resumes
135+
- This prevents the OLD agent's final events (completion, shutdown) from being processed by the NEW agent's widget
136+
- Fresh sessions, resumed sessions, and `/new` command use `expected_model: None` (no filtering)
137+
125138
**Color System:**
126139

127140
The `color.rs` and `terminal_palette.rs` modules handle terminal color detection and theming. The app queries terminal colors at startup for theme adaptation.

codex-rs/tui/src/app.rs

Lines changed: 9 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ use crate::history_cell::HistoryCell;
1010
use crate::model_migration::ModelMigrationOutcome;
1111
use crate::model_migration::migration_copy_for_config;
1212
use crate::model_migration::run_model_migration_prompt;
13+
use crate::nori::agent_picker::PendingAgentSelection;
1314
use crate::pager_overlay::Overlay;
1415
use crate::render::highlight::highlight_bash_to_lines;
1516
use crate::render::renderable::Renderable;
@@ -234,15 +235,6 @@ pub(crate) struct App {
234235
pending_agent: Option<PendingAgentSelection>,
235236
}
236237

237-
/// Information about a pending agent switch.
238-
#[derive(Debug, Clone)]
239-
pub(crate) struct PendingAgentSelection {
240-
/// The model name of the selected agent (e.g., "mock-model", "gemini-2.5-flash")
241-
pub model_name: String,
242-
/// The display name for the status indicator
243-
pub display_name: String,
244-
}
245-
246238
impl App {
247239
async fn shutdown_current_conversation(&mut self) {
248240
if let Some(conversation_id) = self.chat_widget.conversation_id() {
@@ -307,6 +299,7 @@ impl App {
307299
enhanced_keys_supported,
308300
auth_manager: auth_manager.clone(),
309301
feedback: feedback.clone(),
302+
expected_model: None, // No filtering for fresh sessions
310303
};
311304
ChatWidget::new(init, conversation_manager.clone())
312305
}
@@ -330,6 +323,7 @@ impl App {
330323
enhanced_keys_supported,
331324
auth_manager: auth_manager.clone(),
332325
feedback: feedback.clone(),
326+
expected_model: None, // No filtering for resumed sessions
333327
};
334328
ChatWidget::new_from_existing(
335329
init,
@@ -484,6 +478,7 @@ impl App {
484478
enhanced_keys_supported: self.enhanced_keys_supported,
485479
auth_manager: self.auth_manager.clone(),
486480
feedback: self.feedback.clone(),
481+
expected_model: None, // No filtering for /new command
487482
};
488483
self.chat_widget = ChatWidget::new(init, self.server.clone());
489484
if let Some(summary) = summary {
@@ -916,19 +911,11 @@ impl App {
916911
);
917912
self.chat_widget.add_info_message(
918913
format!(
919-
"Agent '{}' selected. Will switch on next prompt submission.",
920-
display_name
914+
"Agent '{display_name}' selected. Will switch on next prompt submission."
921915
),
922916
None,
923917
);
924918
}
925-
AppEvent::ClearPendingAgent => {
926-
if self.pending_agent.is_some() {
927-
tracing::info!("Pending agent selection cleared");
928-
self.pending_agent = None;
929-
self.chat_widget.clear_pending_agent();
930-
}
931-
}
932919
AppEvent::SubmitWithAgentSwitch {
933920
model_name,
934921
display_name,
@@ -951,6 +938,7 @@ impl App {
951938
self.shutdown_current_conversation().await;
952939

953940
// Create the new chat widget with the new config and the message as initial prompt
941+
// Set expected_model to filter events from the OLD agent until SessionConfigured
954942
let init = crate::chatwidget::ChatWidgetInit {
955943
config: self.config.clone(),
956944
frame_requester: tui.frame_requester(),
@@ -960,13 +948,12 @@ impl App {
960948
enhanced_keys_supported: self.enhanced_keys_supported,
961949
auth_manager: self.auth_manager.clone(),
962950
feedback: self.feedback.clone(),
951+
expected_model: Some(model_name.clone()),
963952
};
964953
self.chat_widget = ChatWidget::new(init, self.server.clone());
965954

966-
self.chat_widget.add_info_message(
967-
format!("Switched to agent: {}", display_name),
968-
None,
969-
);
955+
self.chat_widget
956+
.add_info_message(format!("Switched to agent: {display_name}"), None);
970957
}
971958
}
972959
Ok(true)

codex-rs/tui/src/app_backtrack.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -347,6 +347,7 @@ impl App {
347347
enhanced_keys_supported: self.enhanced_keys_supported,
348348
auth_manager: self.auth_manager.clone(),
349349
feedback: self.feedback.clone(),
350+
expected_model: None, // No filtering for backtracked conversations
350351
};
351352
self.chat_widget =
352353
crate::chatwidget::ChatWidget::new_from_existing(init, conv, session_configured);

0 commit comments

Comments
 (0)