merge from t3code #2626
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b999587b5b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| window.sessionStorage.setItem("t3code:voice-autostart", "1"); | ||
| void handleNewThread(defaultProjectRef, { envMode: defaultThreadEnvMode }); |
There was a problem hiding this comment.
Cancel voice autostart after an early release
When push-to-talk is started with no active thread, this flag is set before handleNewThread finishes mounting the composer. If the user releases Ctrl+Option while the thread is still being created, the route-level listener ignores the stop event (it returns unless event.type === "start"), but the new composer still consumes t3code:voice-autostart and starts recording with no remaining stop event, leaving voice capture active until the user manually stops it.
Useful? React with 👍 / 👎.
| } | ||
|
|
||
| try { | ||
| const stream = await navigator.mediaDevices.getUserMedia({ audio: true }); |
There was a problem hiding this comment.
Handle releases during microphone startup
For an existing composer, releasing the push-to-talk keys before getUserMedia resolves is lost because stopVoiceListening() only stops voiceRecorderRef.current, which is not assigned until after this await. On a permission prompt or slow device, the recorder can start after the keys are already up and then never receives a stop event, so the mic remains recording until another manual toggle.
Useful? React with 👍 / 👎.
| ? "audio/webm" | ||
| : ""; | ||
| const chunks: Blob[] = []; | ||
| const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined); |
There was a problem hiding this comment.
🟠 High chat/ChatComposer.tsx:1162
If new MediaRecorder(stream, ...) throws an exception (e.g., NotSupportedError), the stream obtained from getUserMedia leaks because the ref hasn't been set yet and the catch block only cleans up voiceRecorderRef.current?.stream. The stream tracks are never stopped in this error path.
🤖 Copy this AI Prompt to have your agent fix this:
In file apps/web/src/components/chat/ChatComposer.tsx around line 1162:
If `new MediaRecorder(stream, ...)` throws an exception (e.g., `NotSupportedError`), the `stream` obtained from `getUserMedia` leaks because the ref hasn't been set yet and the catch block only cleans up `voiceRecorderRef.current?.stream`. The stream tracks are never stopped in this error path.
Evidence trail:
apps/web/src/components/chat/ChatComposer.tsx lines 1156 (getUserMedia), 1163 (new MediaRecorder — potential throw site), 1165-1170 (voiceRecorderRef.current assignment — not yet executed), 1275 (catch block cleanup uses voiceRecorderRef.current?.stream which is null). Reviewed at REVIEWED_COMMIT.
| function registerPushToTalkShortcuts(): void { | ||
| if (!app.isReady()) { | ||
| return; | ||
| } | ||
| globalShortcut.unregister(PUSH_TO_TALK_GLOBAL_SHORTCUT); | ||
| const registered = globalShortcut.register(PUSH_TO_TALK_GLOBAL_SHORTCUT, () => { | ||
| if (pushToTalkActive) { | ||
| stopPushToTalkFromShortcut(); | ||
| } else { | ||
| startPushToTalkFromShortcut(); | ||
| } | ||
| }); | ||
| if (!registered) { | ||
| writeDesktopLogHeader( | ||
| `push-to-talk failed to register shortcut accelerator=${PUSH_TO_TALK_GLOBAL_SHORTCUT}`, | ||
| ); | ||
| } | ||
| showIdlePushToTalkOverlay(); | ||
| } |
There was a problem hiding this comment.
🟡 Medium src/main.ts:2657
showIdlePushToTalkOverlay() is called unconditionally at line 2674, even when globalShortcut.register returns false at line 2662. This displays "Hold Ctrl+Option" to users when the shortcut failed to register, so the on-screen instructions don't actually work. Consider only showing the overlay when registration succeeds.
function registerPushToTalkShortcuts(): void {
if (!app.isReady()) {
return;
}
globalShortcut.unregister(PUSH_TO_TALK_GLOBAL_SHORTCUT);
const registered = globalShortcut.register(PUSH_TO_TALK_GLOBAL_SHORTCUT, () => {
if (pushToTalkActive) {
stopPushToTalkFromShortcut();
} else {
startPushToTalkFromShortcut();
}
});
if (!registered) {
writeDesktopLogHeader(
`push-to-talk failed to register shortcut accelerator=${PUSH_TO_TALK_GLOBAL_SHORTCUT}`,
);
+ return;
}
showIdlePushToTalkOverlay();
}🤖 Copy this AI Prompt to have your agent fix this:
In file apps/desktop/src/main.ts around lines 2657-2675:
`showIdlePushToTalkOverlay()` is called unconditionally at line 2674, even when `globalShortcut.register` returns `false` at line 2662. This displays "Hold Ctrl+Option" to users when the shortcut failed to register, so the on-screen instructions don't actually work. Consider only showing the overlay when registration succeeds.
Evidence trail:
apps/desktop/src/main.ts lines 2657-2675 (at REVIEWED_COMMIT): `registerPushToTalkShortcuts()` function. `globalShortcut.register()` result stored in `registered` (line 2662), failure logged in `if (!registered)` block (lines 2669-2672), then `showIdlePushToTalkOverlay()` called unconditionally at line 2674.
ApprovabilityVerdict: Needs human review 2 blocking correctness issues found. This PR introduces a complete push-to-talk voice input feature including native macOS keyboard event handling, audio recording, local transcription, and MCP server configuration. The significant new capability combined with multiple unresolved bugs (stream leaks, race conditions, timing issues) identified in review comments warrants human review. You can customize Macroscope's approvability policy. Learn more. |
There was a problem hiding this comment.
Pull request overview
This PR expands Codex configuration support by adding MCP server definitions to server settings (and injecting them into Codex session config), and introduces a desktop push-to-talk voice flow (global shortcut + overlay + local transcription) integrated into the web UI.
Changes:
- Add
providers.codex.mcpServerssettings schema + patch support, plus docs and server-side session config injection. - Add desktop push-to-talk IPC surface, Electron main/preload implementation (overlay, native helper, local STT), and web UI hooks to start voice capture/new threads from push-to-talk events.
- Adjust Electron routing history behavior to use browser history in HTTP (dev) shells.
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/contracts/src/settings.ts | Adds Codex MCP server settings + patch schema. |
| packages/contracts/src/ipc.ts | Extends DesktopBridge contract with push-to-talk + transcription types/APIs. |
| docs/codex-mcp-integrations.md | Documents MCP server settings shape and placeholders. |
| apps/web/src/routes/_chat.tsx | Subscribes to push-to-talk events to auto-start a new thread for voice. |
| apps/web/src/main.tsx | Refines history strategy for Electron dev vs packaged shells. |
| apps/web/src/localApi.test.ts | Updates test desktop bridge stub for new methods. |
| apps/web/src/components/settings/SettingsPanels.browser.tsx | Updates browser stub for new desktop bridge methods. |
| apps/web/src/components/KeybindingsToast.browser.tsx | Updates base server config fixture to include mcpServers. |
| apps/web/src/components/chat/ChatComposer.tsx | Implements voice recording, transcription, overlay updates, and UI controls. |
| apps/server/src/serverSettings.test.ts | Adds coverage for normalizing MCP server updates. |
| apps/server/src/provider/Layers/ProviderInstanceRegistryLive.test.ts | Updates Codex config fixtures to include mcpServers. |
| apps/server/src/provider/Layers/CodexSessionRuntime.ts | Builds MCP session config and injects into Codex thread start params. |
| apps/server/src/provider/Layers/CodexSessionRuntime.test.ts | Adds tests for MCP session config rendering/interpolation. |
| apps/server/src/provider/Layers/CodexAdapter.ts | Passes MCP server settings into Codex session runtime options. |
| apps/server/src/provider/Layers/CodexAdapter.test.ts | Updates adapter validation test expectations for mcpServers. |
| apps/desktop/src/preload.ts | Exposes new push-to-talk/transcription IPC methods to the renderer. |
| apps/desktop/src/main.ts | Implements global shortcut, overlay window, native helper integration, and local STT transcription. |
| apps/desktop/scripts/build-native-helpers.mjs | Adds a build step to compile the macOS native helper. |
| apps/desktop/package.json | Wires native helper build into dev/build scripts. |
| apps/desktop/native/macos-push-to-talk-helper.swift | Adds macOS accessibility-based modifier detection helper. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const targetWindow = mainWindow ?? BrowserWindow.getAllWindows()[0] ?? null; | ||
| targetWindow?.webContents.send(PUSH_TO_TALK_EVENT_CHANNEL, { type }); |
| <TooltipPopup side="top"> | ||
| {isVoiceSupported | ||
| ? "Hold Ctrl+Option to speak. Ctrl+Option+Space works globally." | ||
| : "Voice input is unavailable in this runtime."} | ||
| </TooltipPopup> |
| if (voiceRecorderRef.current || isVoiceListening) { | ||
| if (voiceRecorderRef.current) { | ||
| voiceRecorderRef.current.autoSubmit = | ||
| voiceRecorderRef.current.autoSubmit || options.autoSubmit; | ||
| } |
| updatePushToTalkOverlayPosition(); | ||
| pushToTalkOverlayFollowTimer = setInterval(updatePushToTalkOverlayPosition, 33); | ||
| pushToTalkOverlayFollowTimer.unref(); |
| const temporaryPath = `${modelPath}.${Crypto.randomUUID()}.tmp`; | ||
| const response = await fetch(LOCAL_STT_MODEL_URL); | ||
| if (!response.ok) { | ||
| throw new Error(`Failed to download local STT model: HTTP ${String(response.status)}.`); | ||
| } |
What Changed
Why
UI Changes
Checklist
Note
Add push-to-talk voice input and Codex MCP server configuration to the desktop app
start/stopevents, compiled automatically via a new build step.whisper-cliandffmpeg, with on-demand model download to the app state directory if not bundled or configured via environment variables.CodexSettingsandCodexSessionRuntimeto accept and forward MCP server definitions (with placeholder interpolation) into Codex thread start parameters.📊 Macroscope summarized b999587. 14 files reviewed, 4 issues evaluated, 0 issues filtered, 2 comments posted
🗂️ Filtered Issues