fix(global-discover): bucket codex by originator + read 128KB for CC cwd #1488
Open
0xDevNinja wants to merge 1 commit into
Two problems from issue garrytan#1315, one diff.

Problem 1: `scanCodex` counted every rollout file as a `codex` session, conflating Codex Desktop (interactive codex dev) with codex_exec (cron / subagent) and Claude Code (CC driving codex via MCP). `/retro global` then narrated "codex was the primary tool, 414 sessions" when codex actually drove dev for one repo's middle phase and the other ~309 entries were CC firing codex as cross-model review. `payload.originator` is now normalized into a 4-bucket `codex_originators: { desktop, exec, claude_code, other }`. Surfaced under `tools.codex.originators` and per-repo `codex_originators`. Additive: existing `codex` totals stay, no consumer break.

Problem 2: `extractCwdFromJsonl` read 8KB then parsed. Recent Claude Code / CCR JSONL files often open with a `queue-operation` event 30-50KB long that carries no `cwd`. The 8KB read truncated that line, JSON.parse failed, the fallback returned null, and the whole project dir got skipped. Akagilnc measured ~450 CC files vanishing this way in one repo's 31d window. Bumped to 128KB, the same buffer size `scanCodex` already uses.

Also exposes `CLAUDE_PROJECTS_DIR` env override (parallel to existing `CODEX_SESSIONS_DIR`) so the regression test can plant a fake project dir with a >30KB first line.

Out of scope: the suggested annotation in `/retro global` output that "sessions" means "tool invocations / file count" (problem 3 in the issue). Reasonable separate PR.

Fixes garrytan#1315
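The truncation failure described under Problem 2 can be reproduced in miniature. The sketch below is not the code from this diff: `extractCwdFromPrefix`, the event field names other than `cwd`, and the sample path are hypothetical, and it slices a string by characters rather than reading bytes from a file, which is equivalent only for ASCII content like this sample.

```typescript
// Hypothetical helper: mimics parsing only the first `maxChars` of a JSONL file.
function extractCwdFromPrefix(jsonl: string, maxChars: number): string | null {
  const prefix = jsonl.slice(0, maxChars);
  for (const line of prefix.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const event = JSON.parse(line) as { cwd?: unknown };
      if (typeof event.cwd === "string") return event.cwd;
    } catch {
      // A line cut off by the read limit is not valid JSON; skip it.
    }
  }
  return null;
}

// Fake session: a ~40KB queue-operation event with no cwd, then an event that has one.
// The `type` and `content` fields are assumptions made for illustration.
const firstLine = JSON.stringify({ type: "queue-operation", content: "x".repeat(40_000) });
const secondLine = JSON.stringify({ type: "user", cwd: "/home/dev/project" });
const sample = firstLine + "\n" + secondLine + "\n";

const with8k = extractCwdFromPrefix(sample, 8 * 1024);
const with128k = extractCwdFromPrefix(sample, 128 * 1024);
console.log(with8k, with128k); // null /home/dev/project
```

With the 8KB limit the only visible line is a truncated fragment, so nothing parses and the result is null; with 128KB the second event is reached and its `cwd` is returned.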
Summary
Two patches from #1315 in one diff.
- `scanCodex` now normalizes `payload.originator` into `{ desktop, exec, claude_code, other }` and surfaces the breakdown at `tools.codex.originators` and per-repo `codex_originators`. Existing `codex` totals stay (additive, no consumer break).
- `extractCwdFromJsonl` reads 128KB instead of 8KB. Recent Claude Code / CCR JSONL files often open with a `queue-operation` event 30-50KB long that has no `cwd`; the old 8KB read truncated the line, JSON.parse failed, and the whole project dir was silently dropped. 128KB is the same buffer size `scanCodex` already uses.

Fixes #1315.
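As a rough illustration of the four-bucket normalization, here is a minimal sketch. It is not taken from the diff: `OriginatorBucket`, `normalizeOriginator`, and `countOriginators` are hypothetical names, and collapsing spaces and underscores before matching is one assumed way to get the case-insensitive behavior the PR describes.

```typescript
// Hypothetical names; the real implementation in scanCodex may differ.
type OriginatorBucket = "desktop" | "exec" | "claude_code" | "other";
type OriginatorCounts = Record<OriginatorBucket, number>;

function normalizeOriginator(raw: unknown): OriginatorBucket {
  // Missing or non-string originators are counted, not dropped.
  if (typeof raw !== "string") return "other";
  // Lowercase and unify spaces/underscores so "Codex Desktop" and
  // "codex_desktop" compare equal.
  const key = raw.trim().toLowerCase().replace(/[\s_]+/g, "_");
  switch (key) {
    case "codex_desktop":
      return "desktop";
    case "codex_exec":
      return "exec";
    case "claude_code":
      return "claude_code";
    default:
      return "other";
  }
}

function countOriginators(raws: unknown[]): OriginatorCounts {
  const counts: OriginatorCounts = { desktop: 0, exec: 0, claude_code: 0, other: 0 };
  for (const raw of raws) counts[normalizeOriginator(raw)] += 1;
  return counts;
}
```

Because every input lands in exactly one bucket, the four counts always sum to the number of rollout files seen, which is what keeps the existing totals unchanged.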
Why
`/retro global` narrated "codex was the primary execution tool, 414 sessions across 7 repos" when codex actually drove dev for one repo's middle phase. The other ~309 codex_exec entries were CC firing codex as cross-model review subagent. A single bucket can't tell those apart.

For the CC count: @Akagilnc traced ~450 missing files in one repo's 31d window to the 8KB cap (issue thread). First-line `queue-operation` events are 30-50KB on recent CC versions; the parser never reached the later events that carry `cwd`.

Shape
Summary format inline shows `Codex:98 (desktop=1, exec=97, cc=0)` per repo + a top-line `Codex originators: ...` rollup.

Originator normalization (case-insensitive, matches values observed in `~/.codex/sessions/`):

- `"Codex Desktop"` / `"codex_desktop"` → `desktop`
- `"codex_exec"` / `"codex exec"` → `exec`
- `"Claude Code"` / `"claude_code"` → `claude_code`
- anything else → `other` (not dropped; future originators land here visibly until we map them)

Also adds a `CLAUDE_PROJECTS_DIR` env override on `scanClaudeCode` so the CC regression test can stage a fake project dir; mirrors the existing `CODEX_SESSIONS_DIR` knob.

Out of scope
Problem 3 from the issue (annotate `/retro global` that "sessions = tool invocations / file count", not interactive dev) is a narrative-side change in the retro skill template. Reasonable to file separately.

Tests
7 new in `test/global-discover.test.ts`:

- `'Codex Desktop'` originator → `desktop` bucket.
- `'codex_exec'` → `exec` bucket.
- `'Claude Code'` → `claude_code` bucket.
- Any other originator → `other` (verifies nothing is dropped).
- Per-repo `codex_originators` sums to per-repo `sessions.codex`.
- `tools.codex.originators` shape + total parity with `tools.codex.total_sessions`.
- A CC file whose first line is a large `queue-operation` event still resolves `cwd` from a later event.