Sync upstream rust-v0.63.0#67
Closed
CSRessel wants to merge 104 commits into
Closed
Conversation
- This PR is to make it on path for truncating by tokens. This path will be initially used by unified exec and context manager (responsible for MCP calls mainly). - We are exposing new config `calls_output_max_tokens` - Use `tokens` as the main budget unit but truncate based on the model family by Introducing `TruncationPolicy`. - Introduce `truncate_text` as a router for truncation based on the mode. In next PRs: - remove truncate_with_line_bytes_budget - Add the ability to the model to override the token budget.
- stop prompting users to install WSL - prompt users to turn on Windows sandbox when auto mode requested. <img width="1660" height="195" alt="Screenshot 2025-11-17 110612" src="https://github.com/user-attachments/assets/c67fc239-a227-417e-94bb-599a8ed8f11e" /> <img width="1684" height="168" alt="Screenshot 2025-11-17 110637" src="https://github.com/user-attachments/assets/d18c3370-830d-4971-8746-04757ae2f709" /> <img width="1655" height="293" alt="Screenshot 2025-11-17 110719" src="https://github.com/user-attachments/assets/d21f6ce9-c23e-4842-baf6-8938b77c16db" />
## Summary Enables shell_command as default for `gpt-5*` and `codex-*` models. ## Testing - [x] Updated unit tests
fix for openai/codex#6153 supports mTLS configuration and includes TLS features in the library build to enable secure HTTPS connections with custom root certificates. grpc: https://docs.rs/tonic/0.13.1/src/tonic/transport/channel/endpoint.rs.html#63 https: https://docs.rs/reqwest/0.12.23/src/reqwest/async_impl/client.rs.html#516
# External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.
…st (#6851) The `generated_ts_has_no_optional_nullable_fields` test was occasionally failing on slow CI nodes because of a timeout. This change reduces the work done by the test. It adds some "options" for the `generate_ts` function so it can skip work that's not needed for the test.
…e tree (#6853) This PR fixes the `release_event_does_not_change_selection` test so it doesn't cause an extra `config.toml` to be emitted in the sources when running the tests locally. Prior to this fix, I needed to delete this file every time I ran the tests to prevent it from showing up as an uncommitted source file.
The Windows sandbox did not previously support multiple workspace roots via config. Now it does
By default, show only sessions that shared a cwd with the current cwd. `--all` shows all sessions in all cwds. Also, show the branch name from the rollout metadata. <img width="1091" height="638" alt="Screenshot 2025-11-04 at 3 30 47 PM" src="https://github.com/user-attachments/assets/aae90308-6115-455f-aff7-22da5f1d9681" />
similar to logic in
`codex/codex-rs/exec/src/event_processor_with_jsonl_output.rs`.
translation of v1 -> v2 events:
`codex/event/task_complete` -> `turn/completed`
`codex/event/turn_aborted` -> `turn/completed` with `interrupted` status
`codex/event/error` -> `turn/completed` with `error` status
this PR also makes `items` field in `Turn` optional. For now, we only
populate it when we resume a thread, and leave it as None for all other
places until we properly rewrite core to keep track of items.
tested using the codex app server client. example new event:
```
< {
< "method": "turn/completed",
< "params": {
< "turn": {
< "id": "0",
< "items": [],
< "status": "interrupted"
< }
< }
< }
```
…(#6847)
This adds the following fields to `ThreadStartResponse` and
`ThreadResumeResponse`:
```rust
pub model: String,
pub model_provider: String,
pub cwd: PathBuf,
pub approval_policy: AskForApproval,
pub sandbox: SandboxPolicy,
pub reasoning_effort: Option<ReasoningEffort>,
```
This is important because these fields are optional in
`ThreadStartParams` and `ThreadResumeParams`, so the caller needs to be
able to determine what values were ultimately used to start/resume the
conversation. (Though note that any of these could be changed later
between turns in the conversation.)
Though to get this information reliably, it must be read from the
internal `SessionConfiguredEvent` that is created in response to the
start of a conversation. Because `SessionConfiguredEvent` (as defined in
`codex-rs/protocol/src/protocol.rs`) did not have all of these fields, a
number of them had to be added as part of this PR.
Because `SessionConfiguredEvent` is referenced in many tests, test
instances of `SessionConfiguredEvent` had to be updated, as well, which
is why this PR touches so many files.
New strings:
1. Approval mode picker just says "Select Approval Mode"
1. Updated "Auto" to "Agent"
1. When you select "Agent", you get "Agent mode on Windows uses an
experimental sandbox to limit network and filesystem access. [Learn
more]"
1. Updated world-writable warning to "The Windows sandbox cannot protect
writes to folders that are writable by Everyone. Consider removing write
access for Everyone from the following folders: {folders}"
---------
Co-authored-by: iceweasel-oai <iceweasel@openai.com>
- Testing: None
# External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.
## Summary - avoid surfacing ghost snapshot warnings in the TUI when snapshot creation fails, logging the conditions instead - continue to capture successful ghost snapshots without changing existing behavior ## Testing - `cargo test -p codex-core` *(fails: default_client::tests::test_create_client_sets_default_headers, default_client::tests::test_get_codex_user_agent, exec::tests::kill_child_process_group_kills_grandchildren_on_timeout)* ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_691c02238db08322927c47b8c2d72c4c)
Instead of returning structured out and then re-formatting it into freeform, return the freeform output from shell_command tool. Keep `shell` as the default tool for GPT-5.
# External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.
Move shell to use the configurable `truncate_text` --------- Co-authored-by: pakrym-oai <pakrym@openai.com>
adding execpolicycheck tool onto codex cli this is useful for validating policies (can be multiple) against commands. it will also surface errors in policy syntax: <img width="1150" height="281" alt="Screenshot 2025-11-19 at 12 46 21 PM" src="https://github.com/user-attachments/assets/8f99b403-564c-4172-acc9-6574a8d13dc3" /> this PR also changes output format when there's no match in the CLI. instead of returning the raw string `noMatch`, we return `{"noMatch":{}}` this PR is a rewrite of: openai/codex#6932 (due to the numerous merge conflicts present in the original PR) --------- Co-authored-by: Michael Bolin <mbolin@openai.com>
looks like it sometimes flake around 30. let's give it more time.
…(#7029) second attempt to fix this test after openai/codex#6884. I think this flakiness is happening because yield_time is too small for a 10,000 step loop in python.
Our Restricted Token contains 3 SIDs (Logon, Everyone, {WorkspaceWrite
Capability || ReadOnly Capability})
because it must include Everyone, that left us vulnerable to directories
that allow writes to Everyone. Even though those directories do not have
ACEs that enable our capability SIDs to write to them, they could still
be written to even in ReadOnly mode, or even in WorkspaceWrite mode if
they are outside of a writable root.
A solution to this is to explicitly add *Deny* ACEs to these
directories, always for the ReadOnly Capability SID, and for the
WorkspaceWrite SID if the directory is outside of a workspace root.
Under a restricted token, Windows always checks Deny ACEs before Allow
ACEs so even though our restricted token would allow a write to these
directories due to the Everyone SID, it fails first because of the Deny
ACE on the capability SID
…error events (#6938)
This PR does two things:
1. populate a new `codex_error_code` protocol in error events sent from
core to client;
2. old v1 core events `codex/event/stream_error` and `codex/event/error`
will now both become `error`. We also show codex error code for
turncompleted -> error status.
new events in app server test:
```
< {
< "method": "codex/event/stream_error",
< "params": {
< "conversationId": "019aa34c-0c14-70e0-9706-98520a760d67",
< "id": "0",
< "msg": {
< "codex_error_code": {
< "response_stream_disconnected": {
< "http_status_code": 401
< }
< },
< "message": "Reconnecting... 2/5",
< "type": "stream_error"
< }
< }
< }
{
< "method": "error",
< "params": {
< "error": {
< "codexErrorCode": {
< "responseStreamDisconnected": {
< "httpStatusCode": 401
< }
< },
< "message": "Reconnecting... 2/5"
< }
< }
< }
< {
< "method": "turn/completed",
< "params": {
< "turn": {
< "error": {
< "codexErrorCode": {
< "responseTooManyFailedAttempts": {
< "httpStatusCode": 401
< }
< },
< "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a1b495a1a97ed3e-SJC"
< },
< "id": "0",
< "items": [],
< "status": "failed"
< }
< }
< }
```
- Use /bin/sh instead of /bin/bash on FreeBSD/OpenBSD in the process group timeout test to avoid command-not-found failures. - Accept /usr/local/bin/bash as a valid SHELL path to match common FreeBSD installations. - Switch the shell serialization duration test to /bin/sh for improved portability across Unix platforms. With this change, `cargo test -p codex-core --lib` runs and passes on FreeBSD.
Maybe it solved flakiness
…6972) This updates `ExecParams` so that instead of taking `timeout_ms: Option<u64>`, it now takes a more general cancellation mechanism, `ExecExpiration`, which is an enum that includes a `Cancellation(tokio_util::sync::CancellationToken)` variant. If the cancellation token is fired, then `process_exec_tool_call()` returns in the same way as if a timeout was exceeded. This is necessary so that in #6973, we can manage the timeout logic external to the `process_exec_tool_call()` because we want to "suspend" the timeout when an elicitation from a human user is pending. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6972). * #7005 * #6973 * __->__ #6972
I am suspecting this is flaky because of the wall time can become 0, 0.1, or 1.
…l timeout (#6973) Previously, we were running into an issue where we would run the `shell` tool call with a timeout of 10s, but it fired an elicitation asking for user approval, the time the user took to respond to the elicitation was counted agains the 10s timeout, so the `shell` tool call would fail with a timeout error unless the user is very fast! This PR addresses this issue by introducing a "stopwatch" abstraction that is used to manage the timeout. The idea is: - `Stopwatch::new()` is called with the _real_ timeout of the `shell` tool call. - `process_exec_tool_call()` is called with the `Cancellation` variant of `ExecExpiration` because it should not manage its own timeout in this case - the `Stopwatch` expiration is wired up to the `cancel_rx` passed to `process_exec_tool_call()` - when an elicitation for the `shell` tool call is received, the `Stopwatch` pauses - because it is possible for multiple elicitations to arrive concurrently, it keeps track of the number of "active pauses" and does not resume until that counter goes down to zero I verified that I can test the MCP server using `@modelcontextprotocol/inspector` and specify `git status` as the `command` with a timeout of 500ms and that the elicitation pops up and I have all the time in the world to respond whereas previous to this PR, that would not have been possible. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6973). * #7005 * __->__ #6973 * #6972
Document new codex error info. Also fixed the name from `codex_error_code` to `codex_error_info`.
## Summary - allow selection popups to request an initial highlighted row - begin the /models reasoning selector focused on the default effort ## Testing - just fmt - just fix -p codex-tui - cargo test -p codex-tui https://github.com/user-attachments/assets/b322aeb1-e8f3-4578-92f7-5c2fa5ee4c98 ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_691f75e8fc188322a910fbe2138666ef)
### Summary After #7022, we no longer need this warning. We should also clean up the schema for the notification, but this is a quick fix to just stop the behavior in the VSCE ## Testing - [x] Ran locally
New shell implementation always uses full paths.
Fixes the
```
{
"error": {
"message": "Invalid value: 'other'. Supported values are: 'search', 'open_page', and 'find_in_page'.",
"type": "invalid_request_error",
"param": "input[150].action.type",
"code": "invalid_value"
}
```
error.
The actual-actual fix here is supporting absent `query` parameter.
…it rewrite rules (#7068) This fixes openai/codex#7044
Request param `max_output_tokens` is documented in `https://github.com/openai/codex/blob/main/docs/config.md`, but nowhere uses the item in config, this commit read it from config for GPT responses API. see openai/codex#4138 for issue report. Signed-off-by: Yorling <shallowcloud@yeah.net>
It stopped working (found zero duplicates) starting three days ago when the model was switched from `gpt-5` to `gpt-5.1`. I'm not sure why it stopped working. This is an attempt to get it working again by using the default model for the codex action (which is presumably `gpt-5.1-codex-max`).
This adds a GitHub workflow for building a new npm module we are experimenting with that contains an MCP server for running Bash commands. The new workflow, `shell-tool-mcp`, is a dependency of the general `release` workflow so that we continue to use one version number for all artifacts across the project in one GitHub release. `.github/workflows/shell-tool-mcp.yml` is the primary workflow introduced by this PR, which does the following: - builds the `codex-exec-mcp-server` and `codex-execve-wrapper` executables for both arm64 and x64 versions of Mac and Linux (preferring the MUSL version for Linux) - builds Bash (dynamically linked) for a [comically] large number of platforms (both x64 and arm64 for most) with a small patch specified by `shell-tool-mcp/patches/bash-exec-wrapper.patch`: - `debian-11` - `debian-12` - `ubuntu-20.04` - `ubuntu-22.04` - `ubuntu-24.04` - `centos-9` - `macos-13` (x64 only) - `macos-14` (arm64 only) - `macos-15` (arm64 only) - builds the TypeScript for the [new] Node module declared in the `shell-tool-mcp/` folder, which creates `bin/mcp-server.js` - adds all of the native binaries to `shell-tool-mcp/vendor/` folder; `bin/mcp-server.js` does a runtime check to determine which ones to execute - uses `npm pack` to create the `.tgz` for the module - if `publish: true` is set, invokes the `npm publish` call with the `.tgz` The justification for building Bash for so many different operating systems is because, since it is dynamically linked, we want to increase our confidence that the version we build is compatible with the glibc whatever OS we end up running on. (Note this is less of a concern with `codex-exec-mcp-server` and `codex-execve-wrapper` on Linux, as they are statically linked.) This PR also introduces the code for the npm module in `shell-tool-mcp/` (the proposed module name is `@openai/codex-shell-tool-mcp`). Initially, I intended the module to be a single file of vanilla JavaScript (like [`codex-cli/bin/codex.js`](https://github.com/openai/codex/blob/ab5972d447da78d3e4dd8461cf7d43a22e5d2acb/codex-cli/bin/codex.js)), but some of the logic seemed a bit tricky, so I decided to port it to TypeScript and add unit tests. `shell-tool-mcp/src/index.ts` defines the `main()` function for the module, which performs runtime checks to determine the clang triple to find the path to the Rust executables within the `vendor/` folder (`resolveTargetTriple()`). It uses a combination of `readOsRelease()` and `resolveBashPath()` to determine the correct Bash executable to run in the environment. Ultimately, it spawns a command like the following: ``` codex-exec-mcp-server \ --execve codex-execve-wrapper \ --bash custom-bash "$@" ``` Note `.github/workflows/shell-tool-mcp-ci.yml` defines a fairly standard CI job for the module (`format`/`build`/`test`). To test this PR, I pushed this branch to my personal fork of Codex and ran the CI job there: https://github.com/bolinfest/codex/actions/runs/19564311320 Admittedly, the graph looks a bit wild now: <img width="5115" height="2969" alt="Screenshot 2025-11-20 at 11 44 58 PM" src="https://github.com/user-attachments/assets/cc5ef306-efc1-4ed7-a137-5347e394f393" /> But when it finished, I was able to download `codex-shell-tool-mcp-npm` from the **Artifacts** for the workflow in an empty temp directory, unzip the `.zip` and then the `.tgz` inside it, followed by `xattr -rc .` to remove the quarantine bits. Then I ran: ```shell npx @modelcontextprotocol/inspector node /private/tmp/foobar4/package/bin/mcp-server.js ``` which launched the MCP Inspector and I was able to use it as expected! This bodes well that this should work once the package is published to npm: ```shell npx @modelcontextprotocol/inspector npx @openai/codex-shell-tool-mcp ``` Also, to verify the package contains what I expect: ```shell /tmp/foobar4/package$ tree . ├── bin │ └── mcp-server.js ├── package.json ├── README.md └── vendor ├── aarch64-apple-darwin │ ├── bash │ │ ├── macos-14 │ │ │ └── bash │ │ └── macos-15 │ │ └── bash │ ├── codex-exec-mcp-server │ └── codex-execve-wrapper ├── aarch64-unknown-linux-musl │ ├── bash │ │ ├── centos-9 │ │ │ └── bash │ │ ├── debian-11 │ │ │ └── bash │ │ ├── debian-12 │ │ │ └── bash │ │ ├── ubuntu-20.04 │ │ │ └── bash │ │ ├── ubuntu-22.04 │ │ │ └── bash │ │ └── ubuntu-24.04 │ │ └── bash │ ├── codex-exec-mcp-server │ └── codex-execve-wrapper ├── x86_64-apple-darwin │ ├── bash │ │ └── macos-13 │ │ └── bash │ ├── codex-exec-mcp-server │ └── codex-execve-wrapper └── x86_64-unknown-linux-musl ├── bash │ ├── centos-9 │ │ └── bash │ ├── debian-11 │ │ └── bash │ ├── debian-12 │ │ └── bash │ ├── ubuntu-20.04 │ │ └── bash │ ├── ubuntu-22.04 │ │ └── bash │ └── ubuntu-24.04 │ └── bash ├── codex-exec-mcp-server └── codex-execve-wrapper 26 directories, 26 files ```
Add a `Declined` status for when we request an approval from the user and the user declines. This allows us to distinguish from commands that actually ran, but failed. This behaves similarly to apply_patch / FileChange, which does the same thing.
…103) openai/codex#7005 introduced a new part of the release process that added multiple files named `bash` in the `dist/` folder used as the basis of the GitHub Release. I believe that all file names in a GitHub Release have to be unique, which is why the recent release build failed: https://github.com/openai/codex/actions/runs/19577669780/job/56070183504 Based on the output of the **List** step, I believe these are the appropriate artifacts to delete as a quick fix.
Collaborator
Author
|
Synced with #66 v0.64.0 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Upstream Sync
This PR syncs changes from upstream release
rust-v0.63.0.Summary
rust-v0.63.0Workflow Sanitization
The following upstream workflows had their triggers replaced with `workflow_dispatch`:
ci.ymlcla.ymlclose-stale-contributor-prs.ymlcodespell.ymlissue-deduplicator.ymlissue-labeler.ymlrust-release.ymlsdk.ymlshell-tool-mcp-ci.ymlshell-tool-mcp.ymlMerge Instructions
git checkout dev git merge sync/upstream-v0.63.0 --no-ff # Resolve conflicts if anycd codex-rs && cargo testcargo insta reviewAfter Merge
sync/upstream-v0.63.0branch