fix(global-discover): bucket codex by originator + read 128KB for CC cwd by 0xDevNinja · Pull Request #1488 · garrytan/gstack

0xDevNinja · 2026-05-14T08:50:34Z

Summary

Two patches from #1315 in one diff.

Codex session bucketing. scanCodex now normalizes payload.originator into { desktop, exec, claude_code, other } and surfaces the breakdown at tools.codex.originators and per-repo codex_originators. Existing codex totals stay (additive — no consumer break).
CC undercount. extractCwdFromJsonl reads 128KB instead of 8KB. Recent Claude Code / CCR JSONL files often open with a queue-operation event 30-50KB long that has no cwd — the old 8KB read truncated the line, JSON.parse failed, and the whole project dir was silently dropped. Same buffer size scanCodex already uses.

Why

/retro global narrated "codex was the primary execution tool, 414 sessions across 7 repos" when codex actually drove dev for only one repo's middle phase. The other ~309 codex_exec entries were CC firing codex as a cross-model review subagent. A single bucket can't tell those apart.

For the CC count: @Akagilnc traced ~450 missing files in one repo's 31d window to the 8KB cap (issue thread). First-line queue-operation events are 30-50KB on recent CC versions; the parser never reached the later events that carry cwd.
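The truncation failure described above can be reproduced in a few lines. This is a minimal sketch, not the PR's `extractCwdFromJsonl`: `extractCwdFromPrefix` is a hypothetical name, it works on an in-memory string instead of a file read, and the cap is counted in characters rather than bytes (equivalent only for ASCII content).

```typescript
// Hypothetical stand-in for the capped read + parse loop. Not the PR's code.
const OLD_CAP = 8 * 1024;   // the cap that dropped projects
const NEW_CAP = 128 * 1024; // the cap this PR moves to

function extractCwdFromPrefix(jsonl: string, cap: number): string | null {
  // Simulate reading only the first `cap` characters of the file.
  const prefix = jsonl.slice(0, cap);
  for (const line of prefix.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const event: unknown = JSON.parse(line);
      if (typeof event === "object" && event !== null) {
        const cwd = (event as { cwd?: unknown }).cwd;
        if (typeof cwd === "string" && cwd !== "") return cwd;
      }
    } catch {
      // A line cut off by the cap is not valid JSON; skip it and keep looking.
    }
  }
  return null;
}
```

With a first line of roughly 40KB and a `cwd` on the second line, `OLD_CAP` never reaches a complete line and yields `null`, while `NEW_CAP` covers both lines and finds the `cwd`.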

Shape

// tools.codex now:
{
  "total_sessions": 414,
  "repos": 7,
  "originators": { "desktop": 92, "exec": 309, "claude_code": 13, "other": 0 }
}

// per repo:
{
  "name": "ak-ai-vela",
  "sessions": { "claude_code": 12, "codex": 98, "gemini": 0 },
  "codex_originators": { "desktop": 1, "exec": 97, "claude_code": 0, "other": 0 }
}
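For consumers, the two shapes above could be typed roughly as follows. The interface names and `sumOriginators` are hypothetical, and the real output may carry fields not shown in this PR description. The check at the end mirrors the parity the tests assert: originator buckets sum to the session total.

```typescript
// Hypothetical type names; only the field names come from the PR description.
interface CodexOriginators {
  desktop: number;
  exec: number;
  claude_code: number;
  other: number;
}

interface CodexToolSummary {
  total_sessions: number;
  repos: number;
  originators: CodexOriginators;
}

interface RepoSummary {
  name: string;
  sessions: { claude_code: number; codex: number; gemini: number };
  codex_originators: CodexOriginators;
}

// Sum of all four buckets, used for the parity checks below.
function sumOriginators(o: CodexOriginators): number {
  return o.desktop + o.exec + o.claude_code + o.other;
}

const tool: CodexToolSummary = {
  total_sessions: 414,
  repos: 7,
  originators: { desktop: 92, exec: 309, claude_code: 13, other: 0 },
};

const repo: RepoSummary = {
  name: "ak-ai-vela",
  sessions: { claude_code: 12, codex: 98, gemini: 0 },
  codex_originators: { desktop: 1, exec: 97, claude_code: 0, other: 0 },
};

// Both hold for the example values shown in this PR.
const toolParity = sumOriginators(tool.originators) === tool.total_sessions;
const repoParity = sumOriginators(repo.codex_originators) === repo.sessions.codex;
```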

The inline summary format shows Codex:98 (desktop=1, exec=97, cc=0) per repo, plus a top-line Codex originators: ... rollup.
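A formatter producing the per-repo fragment might look like the sketch below. `formatCodexCount` is a hypothetical name, and appending `other` only when it is non-zero is an assumption: the example in this description shows only the desktop, exec, and cc fields.

```typescript
interface Originators {
  desktop: number;
  exec: number;
  claude_code: number;
  other: number;
}

// Hypothetical helper; the PR's actual summary code may differ.
function formatCodexCount(total: number, o: Originators): string {
  const parts = [`desktop=${o.desktop}`, `exec=${o.exec}`, `cc=${o.claude_code}`];
  // Assumption: unmapped originators are shown only when present.
  if (o.other > 0) parts.push(`other=${o.other}`);
  return `Codex:${total} (${parts.join(", ")})`;
}
```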

Originator normalization (case-insensitive, matches values observed in ~/.codex/sessions/):

"Codex Desktop" / "codex_desktop" → desktop
"codex_exec" / "codex exec" → exec
"Claude Code" / "claude_code" → claude_code
anything else → other (not dropped — future originators land here visibly until we map them)
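The mapping above can be sketched as a small pure function. This is an illustration under assumptions, not the PR's code: `normalizeOriginator` and `tallyOriginators` are hypothetical names, spaces and underscores are treated as interchangeable separators, and a missing or non-string originator is assumed to fall into `other`.

```typescript
type OriginatorBucket = "desktop" | "exec" | "claude_code" | "other";

// Hypothetical helper mirroring the mapping table above.
function normalizeOriginator(raw: unknown): OriginatorBucket {
  // Assumption: absent or non-string values are counted, not dropped.
  if (typeof raw !== "string") return "other";
  // Case-insensitive; collapse runs of spaces/underscores into one underscore.
  const key = raw.trim().toLowerCase().replace(/[\s_]+/g, "_");
  switch (key) {
    case "codex_desktop":
      return "desktop";
    case "codex_exec":
      return "exec";
    case "claude_code":
      return "claude_code";
    default:
      return "other";
  }
}

// Count a list of raw originator values into the four buckets.
function tallyOriginators(values: unknown[]): Record<OriginatorBucket, number> {
  const counts: Record<OriginatorBucket, number> = {
    desktop: 0,
    exec: 0,
    claude_code: 0,
    other: 0,
  };
  for (const value of values) counts[normalizeOriginator(value)] += 1;
  return counts;
}
```

Because every input lands in exactly one bucket, the four counts always sum to the number of sessions scanned, which is what keeps the breakdown additive to the existing totals.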

Also adds a CLAUDE_PROJECTS_DIR env override on scanClaudeCode so the CC regression test can stage a fake project dir; mirrors the existing CODEX_SESSIONS_DIR knob.
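The override could be resolved along these lines. `resolveProjectsDir` is a hypothetical name, the environment is passed in explicitly to keep the sketch testable, and the `~/.claude/projects` fallback is an assumption about the default location rather than something stated in this PR.

```typescript
// Hypothetical resolver; the env map and home directory are injected so the
// function has no dependency on the process environment.
function resolveProjectsDir(
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  const override = env["CLAUDE_PROJECTS_DIR"];
  // An unset or blank override falls through to the default.
  if (override !== undefined && override.trim() !== "") return override;
  // Assumed default location of Claude Code project directories.
  return `${homeDir}/.claude/projects`;
}
```

In a scanner this would be called with the real environment and home directory; a test passes a temp directory through the override instead.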

Out of scope

Problem 3 from the issue (annotating /retro global output to say that "sessions = tool invocations / file count", not interactive dev) is a narrative-side change in the retro skill template. Reasonable to file separately.

Tests

7 new in test/global-discover.test.ts:

'Codex Desktop' originator → desktop bucket.
'codex_exec' → exec bucket.
'Claude Code' → claude_code bucket.
Unknown originator string → other (verifies nothing is dropped).
Per-repo codex_originators sums to per-repo sessions.codex.
tools.codex.originators shape + total parity with tools.codex.total_sessions.
CC JSONL whose first line is a >30KB queue-operation event still resolves cwd from a later event.

bun test test/global-discover.test.ts
# 26 pass, 0 fail (19 existing + 7 new)
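The fixture for the last test in the list could be generated as below. This is a sketch only: `buildOversizedFirstLineFixture` is a hypothetical name, and the event fields other than `cwd` (`type`, `content`) are placeholders, since the real queue-operation payload is not shown here.

```typescript
// Hypothetical fixture builder: a first line well over 30KB with no cwd,
// followed by a small event that does carry cwd.
function buildOversizedFirstLineFixture(paddingChars: number, cwd: string): string {
  const queueEvent = JSON.stringify({
    type: "queue-operation",           // placeholder field name and value
    content: "x".repeat(paddingChars), // padding that pushes the line past the old cap
  });
  const cwdEvent = JSON.stringify({ type: "user", cwd });
  return `${queueEvent}\n${cwdEvent}\n`;
}
```

A regression test would write this string to a `.jsonl` file inside a temp directory and point `CLAUDE_PROJECTS_DIR` at that directory before running the scan.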

Two problems from issue garrytan#1315, one diff.

Problem 1: `scanCodex` counted every rollout file as a `codex` session, conflating Codex Desktop (interactive codex dev) with codex_exec (cron / subagent) and Claude Code (CC driving codex via MCP). `/retro global` then narrated "codex was the primary tool, 414 sessions" when codex actually drove dev for one repo's middle phase and the other ~309 entries were CC firing codex as cross-model review. `payload.originator` is now normalized into a 4-bucket `codex_originators: { desktop, exec, claude_code, other }`. Surfaced under `tools.codex.originators` and per-repo `codex_originators`. Additive: existing `codex` totals stay, no consumer break.

Problem 2: `extractCwdFromJsonl` read 8KB then parsed. Recent Claude Code / CCR JSONL files often open with a `queue-operation` event 30-50KB long that carries no `cwd`. The 8KB read truncated that line, JSON.parse failed, the fallback returned null, and the whole project dir got skipped. Akagilnc measured ~450 CC files vanishing this way in one repo's 31d window. Bumped to 128KB, the same buffer size `scanCodex` already uses.

Also exposes `CLAUDE_PROJECTS_DIR` env override (parallel to existing `CODEX_SESSIONS_DIR`) so the regression test can plant a fake project dir with a >30KB first line.

Out of scope: the suggested annotation in `/retro global` output that "sessions" means "tool invocations / file count" (problem 3 in the issue). Reasonable separate PR.

Fixes garrytan#1315

0xDevNinja wants to merge 1 commit into garrytan:main from 0xDevNinja:fix/1315-codex-originator-and-cc-truncation
