
feat(codex-probe): add update-check helper alongside known-bad warning #1861

Open: Rahgnoruk wants to merge 1 commit into garrytan:main from Rahgnoruk:feat/codex-update-check


@Rahgnoruk

Why

Codex CLI ships patch releases regularly, and individual versions sometimes regress in ways the known-bad deny-list in _gstack_codex_version_check catches only retroactively. The deny-list is a static set of known-bad versions; it gives no signal when the current version is simply old.

Concretely: on my install I had been running 0.122.0 for weeks, with a memory note about a stdin-deadlock workaround, while Codex 0.136.0 had been on npm for days. /ship Step 11 had no freshness signal anywhere in the adversarial-review pipeline, only the deny-list. The stale workaround note drove a bad routing decision (I almost skipped the Codex adversarial pass, thinking Codex was still broken) before I noticed the upgrade by accident. Once I upgraded to 0.136.0, the old workaround was obsolete.

/ship should surface "you're behind npm latest" the same way it surfaces "you're on a known-bad version": one INFO line, non-blocking, and cached so it costs almost nothing.

What

New _gstack_codex_update_check in bin/gstack-codex-probe, wired into the existing detection blocks that /ship, /autoplan, /codex, and the design voice already use.

Behaviour:

  • Reads local version via codex --version.
  • 24h cache at `${GSTACK_HOME:-$HOME/.gstack}/.codex-version-check`. Avoids hammering registry.npmjs.org on every /ship invocation.
  • Cache miss: curl -fsSL -m 5 https://registry.npmjs.org/@openai/codex/latest | jq -r '.version'. 5s timeout so a slow registry never blocks a workflow.
  • Compares via `sort -V`. If local wins or ties → silent.
  • Otherwise prints ONE line:
    ```
    INFO: Codex CLI 0.122.0 available: 0.136.0. Upgrade: `npm install -g @openai/codex@latest`
    ```
  • Best-effort everywhere: missing curl/jq, offline, 5xx, malformed JSON, broken codex binary → all silent. Never blocks the workflow.
  • Independent from `_gstack_codex_version_check` — known-bad WARN and update INFO can both fire on the same invocation.
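
The comparison step in the list above can be sketched roughly as follows. This is an illustration only, not the code in `bin/gstack-codex-probe`: the function names `codex_is_outdated` and `codex_update_notice` are hypothetical, and the message format is copied from the example output above. It assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS builds do).

```shell
#!/usr/bin/env bash
# Hypothetical helper names; a sketch of the sort -V comparison only.

# Returns 0 (true) when $1 (local) is strictly older than $2 (latest).
codex_is_outdated() {
  local local_v="$1" latest_v="$2" lowest
  # Missing input or a tie: treat as not outdated, so the caller stays silent.
  [ -n "$local_v" ] && [ -n "$latest_v" ] || return 1
  [ "$local_v" != "$latest_v" ] || return 1
  # sort -V orders version strings; if the lowest one is the local
  # version, the local install is behind.
  lowest=$(printf '%s\n%s\n' "$local_v" "$latest_v" | sort -V | head -n 1)
  [ "$lowest" = "$local_v" ]
}

# Prints the single INFO line when outdated, nothing otherwise.
# Always returns 0 so it can never fail the calling workflow.
codex_update_notice() {
  if codex_is_outdated "$1" "$2"; then
    printf 'INFO: Codex CLI %s available: %s. Upgrade: `npm install -g @openai/codex@latest`\n' "$1" "$2"
  fi
  return 0
}
```

Calling `codex_update_notice 0.122.0 0.136.0` prints the INFO line, while equal versions or a newer local version print nothing.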

Out of scope (deliberate): auto-upgrading, prompting the user, per-project pinning, and failing or skipping on an outdated version. The user runs `npm install -g @openai/codex@latest` themselves; this just makes the staleness visible.

How it's wired

| Surface | Source |
| --- | --- |
| `/ship` Step 11 adversarial detection block | `scripts/resolvers/review.ts` |
| `/review` Step 5.7 adversarial detection block | same (shared resolver) |
| `/ship` review-army Codex design voice | `scripts/resolvers/design.ts` |
| `/codex` Step 0.5 auth + version preflight | `codex/SKILL.md.tmpl` |
| `/autoplan` Phase 0.5 Codex preflight | `autoplan/SKILL.md.tmpl` |

All non-Claude hosts (Cursor, Factory, GBrain, Hermes, Kiro, OpenClaw, OpenCode, Slate, .agents/Codex) inherit the check via the resolvers/tmpls when they regenerate locally — verified post-regen with grep across all host outputs.

Tests

8 new tests in `test/codex-hardening.test.ts` (35/37 pass locally; the 2 failures are pre-existing Windows narrow-PATH timeout-wrapper tests on `main`):

  • `stale local + fresh cache → INFO line with both versions`
  • `local == latest → silent`
  • `local ahead of cached latest (pre-release dev build) → silent`
  • `codex --version fails (binary present but unusable) → silent`
  • `network fetch fails (curl exits non-zero) → silent + no cache written` ← critical: failed fetch must not poison cache
  • `stale cache (>24h old mtime) is ignored — re-fetches from network`
  • `version_check + update_check are independent — bad version AND outdated both fire`
  • `cache miss writes the fetched version to the cache file`

Test plan to manually verify:

  • `rm -f ~/.gstack/.codex-version-check`
  • `source ~/.claude/skills/gstack/bin/gstack-codex-probe && _gstack_codex_update_check` — expect INFO line if outdated, silence if current
  • `cat ~/.gstack/.codex-version-check` — expect single semver on one line
  • Second invocation within 24h: expect no network call (verifiable by tracing the process with `strace`, or by manually editing the cache and confirming the edited value is used)
  • Existing `_gstack_codex_version_check` still warns on 0.120.0/1/2 (tests cover this)
  • `bun test test/codex-hardening.test.ts` passes (37 tests, 2 pre-existing Windows failures)
  • `bun run gen:skill-docs --host all` produces no errors

🤖 Generated with Claude Code

_gstack_codex_update_check compares the locally installed Codex CLI to npm's
@openai/codex `latest` tag and prints one INFO line when an upgrade is
available. Wired into the same detection blocks as _gstack_codex_version_check
so /ship Step 11, /autoplan, /codex, and the review-army design voice all
surface a freshness signal on top of the existing known-bad-versions deny-list.

Results are cached for 24h at ${GSTACK_HOME:-$HOME/.gstack}/.codex-version-check
so /ship invocations don't hammer the npm registry. Network call is best-effort
(`-m 5` curl timeout, all failure modes silent) — never blocks the workflow.
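
A minimal sketch of the 24h cache described above, assuming freshness is judged by the cache file's mtime. The helper names are hypothetical, and `find -mmin` is a GNU/BSD extension rather than POSIX, so treat this as one possible approach and not the probe's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; the cache path matches the one described above.

codex_cache_file() {
  printf '%s\n' "${GSTACK_HOME:-$HOME/.gstack}/.codex-version-check"
}

# Prints the cached version and returns 0 only if the cache file exists,
# is non-empty, and was modified within the last 24h (1440 minutes).
codex_cache_read() {
  local f
  f=$(codex_cache_file)
  [ -s "$f" ] || return 1
  # find prints the path only when the mtime is less than 1440 minutes old.
  [ -n "$(find "$f" -mmin -1440 2>/dev/null)" ] || return 1
  head -n 1 "$f"
}

# Writes the version only when it is non-empty, so a failed fetch never
# leaves a bad cache entry behind. Write errors are swallowed (best-effort).
codex_cache_write() {
  local version="$1" f
  [ -n "$version" ] || return 0
  f=$(codex_cache_file)
  mkdir -p "$(dirname "$f")" 2>/dev/null || return 0
  printf '%s\n' "$version" > "$f" 2>/dev/null || return 0
}
```

A caller would try `codex_cache_read` first and fall back to the network fetch, followed by `codex_cache_write`, only when the read returns non-zero.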

Tony triggered this during /release-sync after PR garrytan#194 landed: codex-cli 0.122.0
→ 0.136.0 silently rendered the 0.122.0 hang workaround obsolete because /ship
had no codex-freshness signal — only the static known-bad list (0.120.0/1/2)
caught regressions.

Adds 8 tests in test/codex-hardening.test.ts covering the fresh-cache happy
path, identity (local == latest), pre-release (local ahead of latest), broken
codex binary, network failure (no cache written on failure), stale-cache
(>24h mtime) re-fetch, independence from _gstack_codex_version_check, and the
cache file write side-effect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@trunk-io

trunk-io Bot commented Jun 4, 2026

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.
