Commit 1f3b0c7

feat: add managed remote websocket app-server connections
- Added Codex.AppServer.connect_remote/2 to support managed remote websocket connections. This includes transport policy checks for safe websocket usage, auth-token transport validation, and support for auth_token/auth_token_env.
- Realtime: Session now defers follow-up response.create requests until the current response finishes (response.done), preventing overlapping turns from causing premature creation.
- Realtime: Diagnostics probe uses a schema-compatible turn and treats unknown_parameter-style schema drift as a protocol-incompatible skip reason rather than a hard failure.
- OAuth: Canonicalize ChatGPT plan strings (e.g., "hc" -> "enterprise", "education" -> "edu") before SDK processing.
- CLI: Interactive, resume, and fork helpers now support --remote and --remote-auth-token-env. Added support for websocket auth flags in Codex.CLI.app_server/1.
- AppServer: Added experimentalFeature/enablement/set support.
1 parent b4bddb2 commit 1f3b0c7

22 files changed

Lines changed: 1858 additions & 49 deletions

CHANGELOG.md

Lines changed: 19 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -7,12 +7,31 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
77

88
## [Unreleased]
99

10+
### Added
11+
12+
- `Codex.AppServer.connect_remote/2` for managed remote websocket app-server
13+
connections, including auth-token transport policy checks and pid-compatible
14+
reuse of the existing request/helper surface.
15+
1016
### Changed
1117

1218
- Release notes and package-facing docs now call out the final Phase 4
1319
ownership boundary explicitly: `cli_subprocess_core` owns every
1420
subprocess-backed Codex lifecycle, while `codex_sdk` keeps Codex-native
1521
semantics such as app-server, MCP, realtime, and voice.
22+
- App-server and CLI parity now cover `experimentalFeature/enablement/set`,
23+
current websocket auth flags for `codex app-server`, root `--remote` /
24+
`--remote-auth-token-env` passthrough on interactive session wrappers, and
25+
`resume --include-non-interactive`.
26+
- Realtime diagnostics now use a schema-compatible probe and Realtime session
27+
sequencing now defers follow-up `response.create` calls until the active
28+
response completes.
29+
30+
### Fixed
31+
32+
- ChatGPT plan claims now normalize to the SDK's canonical lowercase strings,
33+
including `hc -> "enterprise"` and `education -> "edu"`, before auth/status
34+
structs and app-server external-auth payloads are built.
1635

1736
## [0.15.0] - 2026-03-19
1837

README.md

Lines changed: 62 additions & 2 deletions
@@ -25,8 +25,8 @@ An idiomatic Elixir SDK for embedding OpenAI's Codex agent in your workflows and
2525
## Features
2626

2727
- **End-to-End Codex Lifecycle**: Spawn, resume, and manage full Codex threads with rich turn instrumentation.
28-
- **Multi-Transport Support**: Default `:exec` compatibility selector for the core-backed exec JSONL lane (`codex exec --json`) plus stateful app-server JSON-RPC over stdio (`codex app-server`) with multi-modal `UserInput` blocks.
29-
- **CLI Passthrough and PTY Sessions**: `Codex.CLI` can launch root `codex`, `cloud`, `completion`, `features`, `mcp`, `sandbox`, `resume`, `fork`, `app-server`, and other command-surface workflows directly.
28+
- **Multi-Transport Support**: Default `:exec` compatibility selector for the core-backed exec JSONL lane (`codex exec --json`) plus stateful app-server JSON-RPC via managed local stdio children or managed remote websockets.
29+
- **CLI Passthrough and PTY Sessions**: `Codex.CLI` can launch root `codex`, `cloud`, `completion`, `features`, `mcp`, `sandbox`, `resume`, `fork`, `app-server`, and other command-surface workflows directly, including remote-root and websocket-auth app-server flags.
3030
- **Native OAuth**: `Codex.OAuth` provides SDK-managed browser/device login, refresh, status, and logout with upstream-compatible `auth.json` persistence or memory-only sessions.
3131
- **Upstream Compatibility**: Mirrors Codex CLI flags (profile/OSS/full-auto/color/search/config overrides/review/resume) and handles app-server protocol drift (e.g. MCP list method rename fallbacks).
3232
- **Streaming & Structured Output**: Real-time events, per-thread output schemas, reasoning summary/content preservation, and typed app-server deltas.
@@ -107,6 +107,11 @@ Environment-aware OAuth behavior matches current native-app guidance:
107107
- SSH/headless/container environments prefer device code
108108
- CI and other non-interactive environments never auto-start login; existing credentials are used or the call fails clearly
109109

110+
ChatGPT plan types are normalized before they surface through SDK auth/status
111+
structs or app-server external-auth forwarding. In particular, `hc` and
112+
`enterprise` normalize to `"enterprise"`, while `education` and `edu`
113+
normalize to `"edu"`.
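The normalization described above can be sketched as a small pure function. This is an illustration only, not the SDK's implementation: the module name `PlanSketch` and the function `normalize/1` are hypothetical, and trimming plus lowercasing of other values is an assumption based on the changelog's mention of canonical lowercase strings. Only the `hc` and `education` mappings come from the text.

```elixir
defmodule PlanSketch do
  @moduledoc false
  # Hypothetical helper for illustration; not part of codex_sdk.

  # Maps a raw ChatGPT plan claim to a canonical lowercase string.
  # The "hc" and "education" aliases are taken from the docs above;
  # passing every other value through lowercased is an assumption.
  def normalize(plan) when is_binary(plan) do
    case plan |> String.trim() |> String.downcase() do
      "hc" -> "enterprise"
      "education" -> "edu"
      other -> other
    end
  end
end
```

With this sketch, `PlanSketch.normalize("hc")` and `PlanSketch.normalize("enterprise")` both yield `"enterprise"`, so downstream code only has to match on one spelling per plan.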
114+
110115
If `cli_auth_credentials_store = "keyring"` is set in config and keyring support is unavailable,
111116
the SDK logs a warning and skips file-based tokens (remote model fetch falls back to bundled models).
112117
When `cli_auth_credentials_store = "auto"` and keyring is unavailable, the SDK falls back to file-based auth.
@@ -254,6 +259,28 @@ tmp_home = Path.join(System.tmp_dir!(), "codex-sdk-app-server-home")
254259
`cwd` and `process_env` apply to the app-server child process. Per-thread working
255260
directories still belong on `working_directory` / `cwd` thread params.
256261

262+
For a managed remote app-server websocket instead of a local `codex app-server`
263+
child, use `Codex.AppServer.connect_remote/2`:
264+
265+
```elixir
266+
{:ok, conn} =
267+
Codex.AppServer.connect_remote(
268+
"wss://app-server.example/ws",
269+
auth_token_env: "CODEX_REMOTE_AUTH_TOKEN",
270+
client_name: "my_app",
271+
experimental_api: true
272+
)
273+
```
274+
275+
`connect_remote/2` keeps the same pid contract as `connect/2`, so the existing
276+
`Codex.AppServer.*` request helpers, `disconnect/1`, `alive?/1`, `subscribe/2`,
277+
`unsubscribe/1`, and `respond/3` work unchanged. Bearer auth headers are only
278+
attached for `wss://` or loopback `ws://` endpoints; plain non-loopback
279+
`ws://` plus `auth_token` or `auth_token_env` is rejected. Remote OAuth only
280+
supports `oauth: [storage: :memory]`; persistent child-login preflight
281+
(`:file` / `:auto`) is not available because remote mode does not spawn a local
282+
child or child `CODEX_HOME`.
283+
257284
`connect/2` also supports OAuth-aware child auth bootstrapping:
258285

259286
```elixir
@@ -293,6 +320,7 @@ App-server-only APIs include:
293320

294321
- `Codex.AppServer.thread_list/2`, `thread_archive/2`, `thread_read/3`, `thread_fork/3`, `thread_rollback/3`, `thread_loaded_list/2`
295322
- `Codex.AppServer.model_list/2`, `config_read/2`, `config_write/4`, `config_batch_write/3`, `config_requirements/1`
323+
- `Codex.AppServer.experimental_feature_list/2`, `experimental_feature_enablement_set/2`
296324
- `Codex.AppServer.fs_read_file/2`, `fs_write_file/3`, `fs_create_directory/3`, `fs_get_metadata/2`, `fs_read_directory/2`, `fs_remove/3`, `fs_copy/4`
297325
- `Codex.AppServer.plugin_list/2`, `plugin_read/3`, `plugin_install/4`, `plugin_uninstall/3`
298326
- `Codex.AppServer.skills_config_write/3`, `collaboration_mode_list/1`, `apps_list/2`
@@ -321,6 +349,21 @@ inside this repository.
321349
App-server v2 input blocks support `text`, `image`, `localImage`, `skill`, and `mention`.
322350
Legacy app-server v1 conversation flows are available via `Codex.AppServer.V1`.
323351

352+
Experimental feature enablement is forwarded without a stale local allowlist:
353+
354+
```elixir
355+
{:ok, %{"data" => features}} = Codex.AppServer.experimental_feature_list(conn)
356+
357+
{:ok, _} =
358+
Codex.AppServer.experimental_feature_enablement_set(conn,
359+
apps: true,
360+
plugins: false
361+
)
362+
```
363+
364+
The SDK forwards the `enablement` map as given and lets the server validate the
365+
current supported keys.
366+
324367
### Raw CLI Passthrough and Interactive Sessions
325368

326369
Use `Codex.CLI.run/2` when you want literal command-surface parity with the upstream terminal client, and `Codex.CLI.interactive/2` or `Codex.CLI.start/2` when you need a long-running or PTY-backed session.
@@ -380,6 +423,16 @@ IO.puts(session_result.stdout)
380423

381424
This layer is also the simplest way to reach CLI-only workflows such as `codex completion`, `codex cloud`, `codex execpolicy`, `codex features`, `codex mcp-server`, and the root interactive client without dropping down to `System.cmd/3` yourself.
382425

426+
Current upstream parity helpers also include:
427+
428+
- `Codex.CLI.interactive/2`, `resume/2`, and `fork/2` accept `remote:` and `remote_auth_token_env:`
429+
- `Codex.CLI.resume/2` accepts `include_non_interactive: true`
430+
- `Codex.CLI.app_server/1` forwards websocket auth flags: `ws_auth`, `ws_token_file`, `ws_shared_secret_file`, `ws_issuer`, `ws_audience`, and `ws_max_clock_skew_seconds`
431+
432+
`ws_auth` atoms normalize to upstream CLI values such as
433+
`:capability_token -> capability-token` and
434+
`:signed_bearer_token -> signed-bearer-token`.
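A minimal sketch of that atom-to-flag normalization, assuming it amounts to replacing underscores with hyphens. The module name `WsAuthFlagSketch` and the function `to_cli_value/1` are hypothetical and not part of the SDK; only the two example mappings are taken from the text.

```elixir
defmodule WsAuthFlagSketch do
  @moduledoc false
  # Hypothetical helper for illustration; not part of codex_sdk.

  # Converts an Elixir-style atom into the hyphenated spelling used on
  # the command line, e.g. :capability_token becomes "capability-token".
  def to_cli_value(mode) when is_atom(mode) do
    mode
    |> Atom.to_string()
    |> String.replace("_", "-")
  end
end
```

Keeping the option as an atom on the Elixir side and converting only at the CLI boundary lets callers use idiomatic snake_case while the spawned command still receives the upstream spelling.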
435+
383436
### Streaming Responses
384437

385438
For real-time processing of events as they occur:
@@ -511,6 +564,13 @@ For bidirectional voice interactions using the OpenAI Realtime API:
511564
- Auth precedence for realtime/voice API keys is:
512565
`CODEX_API_KEY` -> `auth.json` `OPENAI_API_KEY` -> `OPENAI_API_KEY`.
513566

567+
`Codex.Realtime.Diagnostics.probe_text_turn/1` now uses a minimal
568+
schema-compatible probe and treats `unknown_parameter`-style schema drift as a
569+
protocol-incompatible skip reason instead of a hard failure. `Codex.Realtime.Session`
570+
also defers follow-up `response.create` calls until the active response reaches
571+
`response.done`, so overlapping user input and tool output no longer trigger
572+
premature create requests.
573+
514574
```elixir
515575
alias Codex.Realtime
516576

guides/03-api-guide.md

Lines changed: 30 additions & 1 deletion
@@ -104,6 +104,13 @@ Key entry points:
104104
- `interactive/2` — launches the root `codex` client (prompt mode or full interactive mode)
105105
- wrappers for `app`, `app_server`, `apply`, `cloud`, `cloud_list`, `cloud_exec`, `completion`, `debug_app_server_send_message_v2`, `execpolicy_check`, `features_*`, `login`, `logout`, `mcp_*`, `mcp_server`, `resume`, `fork`, and `sandbox`
106106

107+
Current parity notes:
108+
109+
- `interactive/2`, `resume/2`, and `fork/2` accept `remote:` plus `remote_auth_token_env:`
110+
- `resume/2` accepts `include_non_interactive: true`
111+
- `app_server/1` accepts websocket auth options: `ws_auth`, `ws_token_file`, `ws_shared_secret_file`, `ws_issuer`, `ws_audience`, and `ws_max_clock_skew_seconds`
112+
- `ws_auth` atoms normalize to the upstream CLI spellings such as `:capability_token -> capability-token`
113+
107114
`run/2` and the synchronous wrappers execute through the shared
108115
`CliSubprocessCore.Command` lane.
109116

@@ -163,7 +170,7 @@ such as `codex completion`, `codex cloud`, `codex execpolicy`, and
163170
The SDK supports both upstream external transports:
164171

165172
- **Exec JSONL (default `:exec` compatibility selector)**: `codex exec --json`
166-
- **App-server JSON-RPC (optional)**: `codex app-server` (newline-delimited JSON over stdio)
173+
- **App-server JSON-RPC (optional)**: managed local `codex app-server` over stdio or managed remote app-server over websocket
167174

168175
Select transport per-thread via `Codex.Thread.Options.transport`:
169176

@@ -180,6 +187,23 @@ temporary `CODEX_HOME`, pass `cwd:` and `process_env:` to `connect/2`. Those
180187
launch options affect the child process; per-thread cwd still belongs on thread
181188
params.
182189

190+
For a managed remote app-server websocket, use `connect_remote/2` instead of
191+
`connect/2`:
192+
193+
```elixir
194+
{:ok, conn} =
195+
Codex.AppServer.connect_remote(
196+
"wss://app-server.example/ws",
197+
auth_token_env: "CODEX_REMOTE_AUTH_TOKEN",
198+
experimental_api: true
199+
)
200+
```
201+
202+
Remote connections keep the same pid-compatible request surface as local ones.
203+
Bearer headers are only attached for `wss://` or loopback `ws://` endpoints.
204+
Remote OAuth only supports `oauth: [storage: :memory]`; `:file` / `:auto`
205+
storage is rejected because remote mode does not prepare a child `CODEX_HOME`.
206+
183207
`connect/2` also accepts `oauth:` for child-auth-aware startup:
184208

185209
```elixir
@@ -200,6 +224,7 @@ conversation APIs.
200224
The direct `Codex.AppServer` surface also includes upstream v2 filesystem,
201225
plugin, and thread-shell helpers:
202226

227+
- `experimental_feature_list/2` and `experimental_feature_enablement_set/2`
203228
- `fs_read_file/2`, `fs_write_file/3`, `fs_create_directory/3`, `fs_get_metadata/2`,
204229
`fs_read_directory/2`, `fs_remove/3`, `fs_copy/4`
205230
- `plugin_list/2`, `plugin_read/3`, `plugin_install/4`, `plugin_uninstall/3`
@@ -211,6 +236,10 @@ plus per-turn `service_tier` on `Codex.Thread.run/3` / `run_streamed/3`.
211236
`plugin_install/4` and `plugin_uninstall/3` accept `force_remote_sync: true`,
212237
and raw plugin maps preserve newer upstream fields such as `needsAuth`.
213238

239+
`experimental_feature_enablement_set/2` forwards the supplied `enablement` map
240+
without a local allowlist and lets the connected server validate the active
241+
feature keys.
242+
214243
## Codex.Subagents
215244

216245
`Codex.Subagents` wraps the deterministic pieces of a subagent workflow that

guides/05-app-server-transport.md

Lines changed: 53 additions & 2 deletions
@@ -1,11 +1,11 @@
1-
# App-server Transport (JSON-RPC over stdio)
1+
# App-server Transport (JSON-RPC over stdio or websocket)
22

33
This guide covers using the **stateful** `codex app-server` transport from Elixir via `Codex.AppServer`.
44

55
The SDK supports two external Codex transports:
66

77
- **Exec JSONL (default `:exec` compatibility selector)**: `codex exec --json`
8-
- **App-server JSON-RPC (optional)**: `codex app-server` (newline-delimited JSON messages over stdio)
8+
- **App-server JSON-RPC (optional)**: managed local `codex app-server` over stdio or managed remote app-server over websocket
99

1010
Use app-server when you need upstream v2 APIs that are not exposed via exec JSONL (threads list/archive, skills/models/config APIs, server-driven approvals, etc.).
1111

@@ -30,6 +30,10 @@ If you need the literal command surface instead of the managed JSON-RPC connecti
3030
`Codex.CLI.app_server/1` launches a raw `codex app-server` subprocess session and
3131
`Codex.CLI.run/2` can be used for one-shot passthrough commands.
3232

33+
`Codex.CLI.app_server/1` also forwards the current websocket auth flags:
34+
`ws_auth`, `ws_token_file`, `ws_shared_secret_file`, `ws_issuer`,
35+
`ws_audience`, and `ws_max_clock_skew_seconds`.
36+
3337
## Connect / Disconnect
3438

3539
`Codex.AppServer.connect/2` starts a supervised `codex app-server` subprocess and performs the required `initialize` -> `initialized` handshake automatically.
@@ -48,6 +52,25 @@ Pass `experimental_api: true` when you need upstream experimental fields such as
4852
:ok = Codex.AppServer.disconnect(conn)
4953
```
5054

55+
For a managed remote websocket connection, use `Codex.AppServer.connect_remote/2`:
56+
57+
```elixir
58+
{:ok, conn} =
59+
Codex.AppServer.connect_remote(
60+
"wss://app-server.example/ws",
61+
auth_token_env: "CODEX_REMOTE_AUTH_TOKEN",
62+
client_name: "my_app",
63+
experimental_api: true
64+
)
65+
66+
:ok = Codex.AppServer.disconnect(conn)
67+
```
68+
69+
Remote mode does not take `Codex.Options` because it does not spawn a local
70+
`codex` child. The returned pid stays compatible with the rest of the
71+
`Codex.AppServer` API surface, including `subscribe/2`, `unsubscribe/1`,
72+
`respond/3`, request helpers, and `disconnect/1`.
73+
5174
### Child cwd and environment isolation
5275

5376
`Codex.AppServer.connect/2` can also isolate the managed child process itself:
@@ -74,6 +97,10 @@ These launch options apply to the app-server child process. Per-thread working
7497
directories still belong on `thread/start`, `thread/resume`, or
7598
`Codex.Thread.Options`.
7699

100+
For `connect_remote/2`, `cwd` and `process_env` are only used to resolve auth
101+
context for SDK-managed OAuth helpers. They do not launch or mutate a child
102+
process because remote mode has no local child.
103+
77104
### OAuth-aware connect
78105

79106
Persistent child auth:
@@ -105,6 +132,22 @@ Notes:
105132
`chatgptAuthTokens`, and starts a connection-owned refresh responder
106133
- set `auto_refresh: false` when you want to handle
107134
`account/chatgptAuthTokens/refresh` yourself via `subscribe/2`
135+
- remote websocket mode only supports `storage: :memory`; persistent
136+
`:file` / `:auto` child-login preflight is rejected because there is no child
137+
`CODEX_HOME` to prepare
138+
139+
### Remote auth-token transport policy
140+
141+
Remote websocket auth supports both `auth_token:` and `auth_token_env:`.
142+
Bearer headers are only attached when the websocket URL is:
143+
144+
- `wss://...`
145+
- loopback `ws://127.0.0.1/...`
146+
- loopback `ws://localhost/...`
147+
148+
If you configure an auth token for a non-loopback plain `ws://` URL,
149+
`connect_remote/2` returns `{:error, {:invalid_remote_auth_transport, url}}`
150+
instead of sending credentials over an unsafe transport.
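The policy above can be sketched as a pure check on the URL. This is a hedged illustration rather than the SDK's code: the module `RemoteAuthPolicySketch` and the function `check/2` are hypothetical, the loopback host list contains only the two hosts named in the list above, and allowing a token-less plain `ws://` connection is an assumption. The error tuple shape is the one quoted in the text.

```elixir
defmodule RemoteAuthPolicySketch do
  @moduledoc false
  # Hypothetical helper for illustration; not part of codex_sdk.

  # Only the loopback hosts named in the guide. Whether the SDK also
  # accepts other loopback forms (such as IPv6) is not stated there.
  @loopback_hosts ["127.0.0.1", "localhost"]

  # Returns :ok when a bearer token may be sent to `url`, or the error
  # tuple described in the guide when the transport is unsafe.
  def check(url, token) when is_binary(url) do
    uri = URI.parse(url)

    cond do
      # Assumption: with no token configured there is nothing to protect.
      is_nil(token) -> :ok
      uri.scheme == "wss" -> :ok
      uri.scheme == "ws" and uri.host in @loopback_hosts -> :ok
      true -> {:error, {:invalid_remote_auth_transport, url}}
    end
  end
end
```

Running the check before opening the socket means a misconfigured URL fails fast with a descriptive error, and the token never leaves the process over an unencrypted non-loopback connection.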
108151

109152
### Client identity
110153

@@ -168,12 +211,15 @@ App-server enables additional APIs that are not available via exec JSONL. Exampl
168211
{:ok, %{"data" => skills}} =
169212
Codex.AppServer.skills_list(conn, cwds: ["/path/to/project"], force_reload: true)
170213
{:ok, %{"data" => models}} = Codex.AppServer.model_list(conn, limit: 25)
214+
```
171215

172216
When you need feature-flag gating or to load the underlying `SKILL.md` contents,
173217
use `Codex.Skills.list/2` and `Codex.Skills.load/2`, which honor `features.skills`.
174218

219+
```elixir
175220
{:ok, %{"config" => config}} = Codex.AppServer.config_read(conn, include_layers: false)
176221
{:ok, _} = Codex.AppServer.config_write(conn, "features.web_search_request", true)
222+
{:ok, _} = Codex.AppServer.experimental_feature_enablement_set(conn, apps: true, plugins: false)
177223

178224
{:ok, %{"data" => threads, "nextCursor" => cursor}} = Codex.AppServer.thread_list(conn, limit: 10)
179225

@@ -191,6 +237,7 @@ IO.puts(Base.decode64!(encoded_back))
191237

192238
Additional v2 APIs include:
193239

240+
- `Codex.AppServer.experimental_feature_list/2` and `experimental_feature_enablement_set/2`
194241
- `Codex.AppServer.thread_read/3`, `thread_fork/3`, `thread_shell_command/3`, `thread_rollback/3`, `thread_loaded_list/2`
195242
- `Codex.AppServer.fs_read_file/2`, `fs_write_file/3`, `fs_create_directory/3`, `fs_get_metadata/2`, `fs_read_directory/2`, `fs_remove/3`, `fs_copy/4`
196243
- `Codex.AppServer.plugin_read/3`, `plugin_install/4`, `plugin_uninstall/3`
@@ -208,6 +255,10 @@ Current upstream routing and sync controls are also covered:
208255
`!` workflow, so treat it with the same care you would give shell access in the
209256
interactive CLI.
210257

258+
`experimental_feature_enablement_set/2` forwards the `enablement` map as given.
259+
The SDK does not keep a stale local allowlist; the connected app-server remains
260+
the source of truth for supported feature keys.
261+
211262
When `include_layers: true`, `config_read/2` returns a `layers` list. Recent Codex versions encode each layer's `name` as a tagged union (`ConfigLayerSource`), for example:
212263

213264
```elixir

guides/06-realtime-and-voice.md

Lines changed: 10 additions & 0 deletions
@@ -21,6 +21,11 @@ helps CLI/app-server flows; realtime and voice still need an API key or an
2121

2222
If your account has no credits, direct API calls may return `insufficient_quota` (HTTP 429). If your account lacks access to realtime models, calls may fail with `model_not_found`. When the upstream Realtime service itself returns a generic `server_error`, the realtime examples now run a minimal raw-WebSocket probe first and print `SKIPPED` with the detected `session_id` so you can report the exact upstream failure cleanly.
2323

24+
`Codex.Realtime.Diagnostics.probe_text_turn/1` keeps that probe minimal and now
25+
classifies `unknown_parameter`-style schema mismatches as a
26+
`realtime_protocol_incompatible` skip reason instead of a hard failure. This
27+
helps distinguish upstream schema drift from auth/quota/runtime failures.
28+
2429
Custom trust roots use `CODEX_CA_CERTIFICATE` first and `SSL_CERT_FILE` second. Blank values are
2530
ignored. The same PEM bundle is applied to HTTPS requests and secure realtime websockets.
2631

@@ -80,6 +85,11 @@ chunks
8085
end)
8186
```
8287

88+
If you queue additional text input or tool output while a response is still
89+
active, `Codex.Realtime.Session` defers the follow-up `response.create` until
90+
the current response reaches `response.done`. That keeps overlapping turns from
91+
issuing premature create requests.
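One way to picture this deferral is a two-flag state machine. The sketch below is an assumption about the general technique, not the actual `Codex.Realtime.Session` internals: the module `DeferredCreateSketch`, its fields, and the `:send_response_create` action atom are all hypothetical.

```elixir
defmodule DeferredCreateSketch do
  @moduledoc false
  # Hypothetical state machine for illustration; not part of codex_sdk.

  # active:  a response is currently in flight
  # pending: a follow-up response.create was requested while active
  defstruct active: false, pending: false

  def new, do: %__MODULE__{}

  # Called when new input or tool output wants a response.
  # Returns {actions_to_perform_now, new_state}.
  def request_response(%__MODULE__{active: false} = state) do
    {[:send_response_create], %{state | active: true}}
  end

  def request_response(%__MODULE__{active: true} = state) do
    # A response is still running, so remember the request for later.
    {[], %{state | pending: true}}
  end

  # Called when the server reports response.done.
  def response_done(%__MODULE__{pending: true} = state) do
    # Flush the deferred request now that the previous turn has finished.
    {[:send_response_create], %{state | active: true, pending: false}}
  end

  def response_done(%__MODULE__{} = state) do
    {[], %{state | active: false}}
  end
end
```

In this sketch several overlapping requests collapse into a single deferred `response.create`; whether the real session coalesces them the same way is not stated in the guide.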
92+
8393
### Receiving events
8494

8595
Realtime events are delivered as `{:session_event, event}`:
