
fix(codex): deduplicate copied branch history #989

Merged
ryoppippi merged 1 commit into ryoppippi:main from OWConnoi:codex/dedupe-codex-branch-history
May 17, 2026

Conversation

@OWConnoi
Contributor

@OWConnoi OWConnoi commented May 12, 2026

Fixes double-counted Codex Desktop usage when a branched or forked conversation copies historical JSONL token events into another session file.

What changed:

  • Adds a path-independent token event fingerprint in @ccusage/codex so copied historical token_count events are counted once across session files.
  • Keeps each file’s cumulative total tracking intact before dedupe, so new events in the branched session still produce the correct delta.
  • Adds a regression test that loads parent + branch session files with copied history and verifies only the new branch delta is counted.
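
The ordering in the second bullet matters, so here is a minimal sketch of it. It is not the code from this PR: the type names, field names, the `|` separator, and the sample model name are all hypothetical. It assumes each session file records cumulative totals, converts them to per-event deltas file by file, and only then drops fingerprints that were already seen in another file.

```typescript
// Hypothetical shapes for illustration; the real types in ccusage may differ.
type RawTokenCount = { timestamp: string; model: string; cumulativeTotal: number };
type UsageDelta = { timestamp: string; model: string; totalTokens: number };

// Step 1: turn one file's cumulative totals into per-event deltas.
// This runs over every event in the file, including copied history.
function toDeltas(fileEvents: RawTokenCount[]): UsageDelta[] {
  let previousTotal = 0;
  return fileEvents.map((event) => {
    const delta = Math.max(event.cumulativeTotal - previousTotal, 0);
    previousTotal = event.cumulativeTotal;
    return { timestamp: event.timestamp, model: event.model, totalTokens: delta };
  });
}

// Step 2: drop deltas whose path-independent fingerprint was already seen.
function dedupeAcrossFiles(files: RawTokenCount[][]): UsageDelta[] {
  const seen = new Set<string>();
  const unique: UsageDelta[] = [];
  for (const fileEvents of files) {
    for (const delta of toDeltas(fileEvents)) {
      // Assumed fingerprint: timestamp, model, and token count joined by "|".
      const fingerprint = [delta.timestamp, delta.model, delta.totalTokens].join('|');
      if (seen.has(fingerprint)) continue;
      seen.add(fingerprint);
      unique.push(delta);
    }
  }
  return unique;
}

// The branch file copies the parent's two events, then adds one new event.
const parent: RawTokenCount[] = [
  { timestamp: '2026-05-01T10:00:00Z', model: 'example-model', cumulativeTotal: 100 },
  { timestamp: '2026-05-01T10:05:00Z', model: 'example-model', cumulativeTotal: 250 },
];
const branch: RawTokenCount[] = [
  ...parent,
  { timestamp: '2026-05-01T11:00:00Z', model: 'example-model', cumulativeTotal: 300 },
];
const deltas = dedupeAcrossFiles([parent, branch]);
```

If the copied events were removed before the cumulative pass, the branch's first surviving event would be measured against a total of 0 rather than 250, which is why the sketch keeps per-file tracking ahead of the dedupe step.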

Verification:

  • npx -y pnpm@10.30.1 --filter @ccusage/codex test src/data-loader.ts
  • npx -y pnpm@10.30.1 --filter @ccusage/codex typecheck
  • npx -y pnpm@10.30.1 run lint --fix (run from apps/codex)
  • git diff --check

Closes #988.

Summary by CodeRabbit

  • Bug Fixes

    • Token usage events are now deterministically deduplicated so duplicate entries from copied or branched session histories are removed, yielding more accurate usage reports.
  • Tests

    • Test suite extended with scenarios simulating multi-session/branched histories to validate deduplication and ensure only unique events are returned.


@coderabbitai

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9421f15f-0850-4734-ba12-a54641470b7b

📥 Commits

Reviewing files that changed from the base of the PR and between a263355 and 6a8c825.

📒 Files selected for processing (1)
  • apps/ccusage/src/adapter/codex/parser.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/ccusage/src/adapter/codex/parser.ts

📝 Walkthrough


Adds a deterministic token-usage event key and an array deduplicator, applies the deduplication inside loadTokenUsageEvents before sorting, and adds a Vitest case that stubs CODEX_HOME to verify that duplicate events from branched sessions are removed.

Token Event Deduplication

| Layer | File(s) | Summary |
|---|---|---|
| Dedup key constant and creator | apps/ccusage/src/adapter/codex/parser.ts | Adds TOKEN_USAGE_EVENT_KEY_SEPARATOR and createTokenUsageEventKey() to produce a stable per-event key from timestamp, model, and token counts. |
| Integrate deduplication into loader | apps/ccusage/src/adapter/codex/parser.ts | Runs deduplicateTokenUsageEvents() on the flattened aggregated events in loadTokenUsageEvents() before sorting by timestamp. |
| Vitest: loadTokenUsageEvents dedupe case | apps/ccusage/src/adapter/codex/parser.ts | Adds a test that stubs CODEX_HOME with parent/branch session fixtures and asserts duplicates are removed (expects 2 events and checks token counts). |
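
As a rough illustration of the three layers summarized above, the sketch below reuses the identifier names from the walkthrough (TOKEN_USAGE_EVENT_KEY_SEPARATOR, createTokenUsageEventKey, deduplicateTokenUsageEvents). The function bodies, the event fields, the separator value, and the sample data are assumptions and may not match parser.ts.

```typescript
// Assumed event shape; the actual fields in parser.ts may differ.
type TokenUsageEvent = {
  timestamp: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
};

// Assumed separator value, chosen because it is unlikely to appear in a field.
const TOKEN_USAGE_EVENT_KEY_SEPARATOR = '\u0000';

// Builds a stable key from timestamp, model, and token counts (no file path).
function createTokenUsageEventKey(event: TokenUsageEvent): string {
  return [
    event.timestamp,
    event.model,
    event.inputTokens,
    event.outputTokens,
    event.totalTokens,
  ].join(TOKEN_USAGE_EVENT_KEY_SEPARATOR);
}

// Keeps the first event for each key and drops later copies.
function deduplicateTokenUsageEvents(events: TokenUsageEvent[]): TokenUsageEvent[] {
  const seen = new Set<string>();
  return events.filter((event) => {
    const key = createTokenUsageEventKey(event);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Hypothetical fixtures: the branch file repeats the parent's only event.
const first: TokenUsageEvent = {
  timestamp: '2026-05-01T10:00:00Z', model: 'example-model',
  inputTokens: 80, outputTokens: 20, totalTokens: 100,
};
const second: TokenUsageEvent = {
  timestamp: '2026-05-01T11:00:00Z', model: 'example-model',
  inputTokens: 30, outputTokens: 20, totalTokens: 50,
};
const perFile: TokenUsageEvent[][] = [[first, second], [{ ...first }]];

// Flatten, deduplicate, then sort by timestamp (ISO strings sort lexically).
const flattened = ([] as TokenUsageEvent[]).concat(...perFile);
const loadedEvents = deduplicateTokenUsageEvents(flattened)
  .sort((a, b) => a.timestamp.localeCompare(b.timestamp));
```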

Estimated code review effort:
🎯 3 (Moderate) | ⏱️ ~20 minutes

"I nibble keys and stitch them tight,
A carrot-shaped separator glows at night.
Branched crumbs I chase and hide,
Leaving only unique hops beside.
Hop, dedupe, tally — all set right! 🥕"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(codex): deduplicate copied branch history' clearly and concisely describes the main change: adding deduplication logic for token events in copied branch sessions. |
| Linked Issues check | ✅ Passed | The PR directly addresses issue #988 by implementing global deduplication of token events across branched Codex sessions using a stable fingerprint, preventing double-counting of historical usage. |
| Out of Scope Changes check | ✅ Passed | All changes (deduplication utilities, updated loadTokenUsageEvents, and regression tests) are directly scoped to the stated objective of fixing duplicated token counting in branched Codex conversations. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

apps/ccusage/src/adapter/codex/parser.ts

[baseline-browser-mapping] The data in this module is over two months old. To ensure accurate Baseline data, please update: npm i baseline-browser-mapping@latest -D
tsconfig.json is not found. we cannot use type-aware rules.

Oops! Something went wrong! :(

ESLint: 9.35.0

Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'eslint-plugin-format' imported from /node_modules/.pnpm/@antfu+eslint-config@4.19.0_@vue+compiler-sfc@3.5.30_eslint@9.35.0_typescript@5.9.2_vit_670a2c5c75d4275eabd7bc195a173ee6/node_modules/@antfu/eslint-config/dist/index.js
at Object.getPackageJSONURL (node:internal/modules/package_json_reader:301:9)
at packageResolve (node:internal/modules/esm/resolve:764:81)
at moduleResolve (node:internal/modules/esm/resolve:855:18)
at defaultResolve (node:internal/modules/esm/resolve:988:11)
at #cachedDefaultResolve (node:internal/modules/esm/loader:697:20)
at #resolveAndMaybeBlockOnLoaderThread (node:internal/modules/esm/loader:714:38)
at ModuleLoader.resolveSync (node:internal/modules/esm/loader:746:52)
at #resolve (node:internal/modules/esm/loader:679:17)
at ModuleLoader.getOrCreateModuleJob (node:internal/modules/esm/loader:599:35)
at node:internal/modules/esm/loader:628:32



@han-cheng6

han-cheng6 commented May 13, 2026

Thanks a lot for the incredibly quick turnaround on this.

From reading the PR, the fix looks aligned with the issue I reported in #988: Codex Desktop branched conversations should not cause historical usage to be counted again.

From the reporter side, this looks like the right direction. Happy to see this merged once CI is green and the maintainers are comfortable with it.

@ryoppippi ryoppippi force-pushed the codex/dedupe-codex-branch-history branch from c5a2e1f to 54abc64 May 17, 2026 01:52
@ryoppippi
Owner

Maintainer check after rebasing this onto current main:

  • Moved the fix to the current Codex adapter path after perf(ccusage): unify agent adapter foundations #1004: apps/ccusage/src/adapter/codex/parser.ts.
  • Confirmed the copied-branch-history regression test is red on main and green after the change.
  • Checked local Codex Desktop data. Before dedupe: 77,426 parsed token events, 2,722 duplicate events by timestamp/model/token fingerprint, with +287,576,105 totalTokens overcount. After the fix: loaded events have 0 duplicate fingerprints. The exact event count moved slightly while testing because local Codex logs are active, but the duplicate class is gone.
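
A check like the one in the last bullet could be sketched as follows. This is not the maintainer's script: the event shape, the `|` separator, and the helper name measureDuplicates are hypothetical. It counts events whose timestamp/model/token fingerprint has already been seen and sums their totalTokens as the overcount.

```typescript
// Hypothetical parsed-event shape used only for this sketch.
type ParsedEvent = { timestamp: string; model: string; totalTokens: number };

type DuplicateReport = { duplicateEvents: number; overcountedTokens: number };

// Counts repeated fingerprints and the tokens they would add a second time.
function measureDuplicates(events: ParsedEvent[]): DuplicateReport {
  const seen = new Set<string>();
  let duplicateEvents = 0;
  let overcountedTokens = 0;
  for (const event of events) {
    // Assumed fingerprint: timestamp, model, and token count joined by "|".
    const fingerprint = `${event.timestamp}|${event.model}|${event.totalTokens}`;
    if (seen.has(fingerprint)) {
      duplicateEvents += 1;
      overcountedTokens += event.totalTokens;
    } else {
      seen.add(fingerprint);
    }
  }
  return { duplicateEvents, overcountedTokens };
}

// Tiny made-up sample: the second entry is a copy of the first.
const sampleEvents: ParsedEvent[] = [
  { timestamp: '2026-05-01T10:00:00Z', model: 'example-model', totalTokens: 100 },
  { timestamp: '2026-05-01T10:00:00Z', model: 'example-model', totalTokens: 100 },
  { timestamp: '2026-05-01T10:05:00Z', model: 'example-model', totalTokens: 50 },
];
const report = measureDuplicates(sampleEvents);
```

Running the same function over the events returned after the fix should report zero duplicates if the loader's key uses the same fields.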

Validation run locally:

  • pnpm vitest run apps/ccusage/src/adapter/codex/parser.ts -t "deduplicates copied branch history"
  • pnpm run format
  • pnpm typecheck
  • pnpm run test

Decision: merge once bot checks finish. This one is directly reproduced by local real data and matches #988.

@pkg-pr-new

pkg-pr-new Bot commented May 17, 2026


@ccusage/amp

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/amp@989

ccusage

npx https://pkg.pr.new/ryoppippi/ccusage@989

@ccusage/codex

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/codex@989

@ccusage/opencode

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/opencode@989

@ccusage/pi

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/pi@989

commit: 6a8c825

@ryoppippi ryoppippi force-pushed the codex/dedupe-codex-branch-history branch from 54abc64 to a263355 May 17, 2026 01:58
@ryoppippi
Owner

Follow-up: replaced the initial JSON.stringify-based fingerprint with a fixed-separator string key.

Local Codex data timing for loadTokenUsageEvents(), 5 runs each:

  • main checkout: about 2.00s average
  • this PR after the key change: about 2.00s average

So the previous CI large-fixture slowdown was likely from JSON.stringify allocation/serialization overhead. The branch has been force-pushed and CI is running again.
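
For readers comparing the two key strategies mentioned in this follow-up, here is a small sketch. The field set, the `|` separator, and the helper names are assumptions, not the code from either revision of the PR. Both functions identify the same duplicates; the second avoids building and serializing an intermediate array for every event.

```typescript
// Assumed key fields for illustration only.
type KeyFields = { timestamp: string; model: string; inputTokens: number; outputTokens: number };

// Strategy 1: serialize the fields with JSON.stringify.
const jsonKey = (e: KeyFields): string =>
  JSON.stringify([e.timestamp, e.model, e.inputTokens, e.outputTokens]);

// Strategy 2: concatenate the fields with a fixed separator (value assumed).
const SEPARATOR = '|';
const separatorKey = (e: KeyFields): string =>
  e.timestamp + SEPARATOR + e.model + SEPARATOR + e.inputTokens + SEPARATOR + e.outputTokens;

// Counts distinct events under a given key function.
function countUnique(events: KeyFields[], keyOf: (e: KeyFields) => string): number {
  return new Set(events.map(keyOf)).size;
}

const base: KeyFields = {
  timestamp: '2026-05-01T10:00:00Z', model: 'example-model', inputTokens: 80, outputTokens: 20,
};
const keyed: KeyFields[] = [base, { ...base }, { ...base, outputTokens: 21 }];

const uniqueByJson = countUnique(keyed, jsonKey);
const uniqueBySeparator = countUnique(keyed, separatorKey);
```

One trade-off to keep in mind: a fixed-separator key is only unambiguous if the separator cannot occur inside a field, whereas JSON.stringify quotes and escapes each string.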

Deduplicate Codex token usage events with a session-independent fingerprint so branched or repeated session files do not count copied history more than once.

Add regression coverage for copied branch history and validate against local Codex logs, where the current parser produced thousands of duplicate token events.
@ryoppippi ryoppippi force-pushed the codex/dedupe-codex-branch-history branch from a263355 to 6a8c825 May 17, 2026 02:07
@ryoppippi
Owner

Updated after merging #1013 into main and rebasing this PR.

Head SHA: 6a8c825
Base includes: b438b4d (#1013)

Local validation after rebase:

  • pnpm vitest run apps/ccusage/src/adapter/codex/parser.ts -t "deduplicates copied branch history"

The perf workflow should now always write results to the Actions job summary and attempt the PR comment best-effort.

@ryoppippi ryoppippi merged commit f53bbb7 into ryoppippi:main May 17, 2026
15 checks passed
@ryoppippi
Owner

thank you for your contribution!


Development

Successfully merging this pull request may close these issues.

@ccusage/codex double-counts tokens for branched Codex Desktop conversations

3 participants