
perf(ccusage): optimize bundled cli performance#984

Merged
ryoppippi merged 147 commits into main from perf on May 15, 2026

Conversation

Owner

@ryoppippi ryoppippi commented May 12, 2026

Try This PR

The ccusage package should be installable through pkg.pr.new once the preview package workflow has published this PR.

pnpm dlx https://pkg.pr.new/ryoppippi/ccusage@984 -- --offline

Do not use this form; it does not work for this preview package:

bun x https://pkg.pr.new/ryoppippi/ccusage@984

With Bun available on PATH, the published launcher starts the bundled Bun entrypoint. Set CCUSAGE_BUN_AUTO_RUN=0 to force the Node path.

Summary

This PR optimizes the bundled JavaScript ccusage CLI while keeping output behavior aligned with main. The main changes are faster JSONL discovery/loading, worker-thread parsing for built dist usage, smaller worker payloads, faster aggregation/table hot paths, Bun-aware launcher support, CLI-only package exports, and CI fixture performance reporting.

The latest pass fixes a Bun stdout edge case: large final JSON/table writes now wait for the stream write callback. Without that, piping a large report through tools such as jq, cat, or tee could truncate stdout at 64 KiB while direct terminal output looked fine. CI preview E2E now runs daily --offline --json | jq -e . on the Node-forced path and the Bun-available path, and the fixture is large enough to catch this class of truncation.
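The fix amounts to resolving the write only after the stream's callback fires. A minimal sketch (the helper name matches the writeLineAsync mentioned in the review summary; the PR's actual implementation may differ):

```typescript
import { Writable } from "node:stream";

/**
 * Resolve only after the stream's write callback completes, so a large
 * final JSON/table write is fully handed to the pipe before the process
 * can exit. Without this, piping stdout through jq/cat/tee can truncate
 * output once the internal buffer (64 KiB) fills.
 */
export function writeLineAsync(stream: Writable, text: string): Promise<void> {
  return new Promise((resolve, reject) => {
    stream.write(text + "\n", (err) => (err == null ? resolve() : reject(err)));
  });
}
```

Call sites then `await writeLineAsync(process.stdout, JSON.stringify(report))` instead of a fire-and-forget `console.log`.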

Current Benchmarks

Environment: Apple M3 Pro (macOS), Bun-built entry, real local Claude data, LOG_LEVEL=0, COLUMNS=200, hyperfine --warmup 4 --runs 10. These local numbers are directional; the CI fixture compare is the repeatable PR signal.

| Command | Current JS/Bun |
| --- | --- |
| `daily --offline --json` | 329.4ms ± 5.2ms |
| `session --offline --json` | 342.5ms ± 13.3ms |
| `blocks --offline --json` | 387.3ms ± 7.0ms |
| `daily --offline` (table) | 354.4ms ± 22.1ms |

Latest commit 213cb95 replaces the hot string-key dedupe Maps with null-prototype object indexes after CPU profiling showed Map#set dominating the post-worker merge. Compared with the post-stdout-drain run, this improved daily 336.7ms -> 329.4ms, session 346.2ms -> 342.5ms, and blocks 389.7ms -> 387.3ms.
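The Map-to-object swap looks roughly like this (a sketch, not the PR's code; entry field names are illustrative):

```typescript
type Entry = { messageId: string; requestId: string; tokens: number };

/**
 * Dedupe by "messageId:requestId". A null-prototype object index avoids
 * the Map#set overhead seen in the CPU profile: there is no prototype
 * chain, and plain property stores/loads serve as the hash operations.
 */
export function dedupeEntries(entries: Entry[]): Entry[] {
  const index: Record<string, number | undefined> = Object.create(null);
  const out: Entry[] = [];
  for (const e of entries) {
    const key = e.messageId + ":" + e.requestId;
    const at = index[key];
    if (at === undefined) {
      index[key] = out.length;
      out.push(e);
    } else if (e.tokens > out[at].tokens) {
      // Replace in place so first-occurrence output order is preserved.
      out[at] = e;
    }
  }
  return out;
}
```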

Minify A/B on valid /dist/index.js paths before the stdout-drain fix:

| Command | Non-minified | Minified |
| --- | --- | --- |
| `daily --offline --json` | 344.2ms ± 11.6ms | 342.0ms ± 8.8ms |
| `session --offline --json` | 368.4ms ± 23.5ms | 351.0ms ± 10.3ms |
| `blocks --offline --json` | 397.9ms ± 16.9ms | 389.8ms ± 5.4ms |

Main And Rust References

Latest CI fixture compare against the base branch, using built apps/ccusage/dist/index.js through bun -b, --offline --json, mitata with 2 warmups and 7 samples:

| Command | Base median | PR median | PR vs base |
| --- | --- | --- | --- |
| `daily --offline --json` | 366.8ms | 335.6ms | 1.09x faster |
| `session --offline --json` | 365.3ms | 330.9ms | 1.10x faster |
| `blocks --offline --json` | 357.0ms | 333.7ms | 1.07x faster |

Rust reference: #977. Latest local Rust measurement on branch rust at f8d6cd3, using target/release/ccusage (937 KiB) with LOG_LEVEL=0, --offline --json, hyperfine --warmup 4 --runs 10: daily 249.3ms ± 6.1ms, session 255.4ms ± 7.0ms, blocks 260.7ms ± 9.0ms. JS/Bun is now close enough to be practical, but this Rust binary is still faster across daily/session/blocks.

Package Size

Current build output writes sourcemaps and reports about 1.08 MB on disk. Runtime JS is about 124.6 kB. Latest CI packed package comparison: base 133.73 KiB, PR 43.26 KiB, 90.47 KiB smaller (3.09x smaller). Sourcemaps are excluded from the package.

Correctness

Checked output parity for JSON output across daily, session, monthly, weekly, and blocks between minified and non-minified builds; blocks[].projection is time-dependent and was removed for that comparison. Representative table output matched for daily and session.

Large-output pipe checks after the stdout-drain fix:

  • `daily --offline --json | jq -e .`
  • `blocks --offline --json | jq -e .`
  • enlarged preview E2E `daily --offline --json | jq -e .`

Preview E2E passed on ubuntu-slim, macos-latest, and windows-latest. The Windows log shows both the Node-forced run and the Bun-available run executing npx --yes "$CCUSAGE_PREVIEW_URL" daily --offline --json | jq -e ., followed by visible table output with the expected Total row.

Known note from earlier Rust/JS comparisons: Rust and JS previously differed in some session row counts on local comparisons while token totals matched. Any remaining semantic differences should stay documented before merge.

Validation

Recent validation:

  • pnpm run format
  • pnpm typecheck
  • pnpm run test (32 files, 376 passed, 1 skipped)
  • pnpm --filter ccusage build
  • Built Bun daily --offline --json | jq -e .
  • Built Bun blocks --offline --json | jq -e .
  • Enlarged preview E2E fixture local check with daily --offline --json | jq -e .
  • CI preview E2E on Linux, macOS, and Windows

Latest Update (May 15, 2026, worker-side dedupe experiment)

Pushed empty commit e29d228 to record a rejected worker-side chunk dedupe prototype. The prototype removed duplicate daily/session rows inside each worker before postMessage, carried source-order metadata for replacement tie-breaks, and tried to restore first-occurrence merge order before final aggregation.

I did not adopt it. JSON output was not byte-identical: token totals stayed stable, but cost fields moved at the IEEE754 tail because aggregation order changed. An intermediate shape also changed session totals because session/project metadata is encoded at the file payload level, so moving a winning row into the first-seen file slot is not semantics-preserving.

Built Bun benchmark on real local Claude data, LOG_LEVEL=0, COLUMNS=200, hyperfine --warmup 4 --runs 10:

| Command | Worker-dedupe prototype | Current adopted build | Result |
| --- | --- | --- | --- |
| `daily --offline --json` | 363.1ms ± 13.5ms | 329.4ms ± 5.2ms | slower |
| `session --offline --json` | 362.3ms ± 12.8ms | 342.5ms ± 13.3ms | slower |
| `blocks --offline --json` | 411.9ms ± 25.2ms | 387.3ms ± 7.0ms | slower |

After reverting the prototype and rebuilding ccusage, daily, session, and blocks JSON output matched the saved pre-experiment baseline byte-for-byte.

Latest Update (May 15, 2026, object model summary indexes)

Pushed 4864bf2 for #984. Summary aggregation now uses null-prototype object indexes plus an insertion-order model list instead of allocating Map/Set for every daily/session/bucket group. This follows the same exact-string-key pattern that helped the row dedupe path, while preserving modelsUsed first-seen order and model breakdown sorting.
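The pattern, sketched (row and summary shapes are illustrative, not the PR's actual types): a null-prototype index for lookup plus a parallel array recording first-seen order, replacing a Map/Set allocation per group:

```typescript
type Row = { date: string; model: string; tokens: number };
type DailySummary = { date: string; tokens: number; modelsUsed: string[] };

/**
 * Group rows by date. The null-prototype object serves as the exact
 * string-key index; the `order` array preserves insertion order so the
 * output keeps first-seen date order and modelsUsed first-seen order.
 */
export function summarize(rows: Row[]): DailySummary[] {
  const byDate: Record<string, DailySummary | undefined> = Object.create(null);
  const order: DailySummary[] = [];
  for (const r of rows) {
    let s = byDate[r.date];
    if (s === undefined) {
      s = { date: r.date, tokens: 0, modelsUsed: [] };
      byDate[r.date] = s;
      order.push(s);
    }
    s.tokens += r.tokens;
    if (!s.modelsUsed.includes(r.model)) s.modelsUsed.push(r.model);
  }
  return order;
}
```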

Correctness check: built Bun dist matched the pre-change output byte-for-byte for daily, session, monthly, weekly, and blocks --offline --json; daily and session table output were also checked and matched byte-for-byte. Added a regression test for the subtle unknown-model behavior: an undefined model contributes an unknown model breakdown, but modelsUsed only includes unknown when that model name is explicit.

Worker-enabled A/B, real local Claude data, LOG_LEVEL=0, COLUMNS=200, hyperfine --warmup 4 --runs 12 with both builds under /tmp/*/dist/index.js:

| Command | Base | Object indexes | Result |
| --- | --- | --- | --- |
| `daily --offline --json` | 348.9ms ± 12.7ms | 349.5ms ± 20.0ms | neutral |
| `session --offline --json` | 354.1ms ± 11.8ms | 348.5ms ± 4.0ms | ~1.02x faster |
| `blocks --offline --json` | 400.9ms ± 19.9ms | 396.0ms ± 11.3ms | ~1.01x faster |

Validation: pnpm run format, pnpm typecheck, pnpm run test (32 files, 377 passed, 1 skipped), pnpm --filter ccusage build, and built Bun daily/blocks --offline --json | jq -e ..

Summary by CodeRabbit

  • New Features

    • Automatic Bun re-run when available in PATH for faster warm startup (disable with CCUSAGE_BUN_AUTO_RUN=0)
    • Added --singleThread flag to disable parallel JSONL file loading
  • Bug Fixes

    • Removed --jq CLI option; use piped jq with --json output instead
  • Documentation

    • Removed library usage documentation; package is now CLI-only
    • Updated guides for new Bun auto-run behavior and JSON piping approach

ryoppippi and others added 7 commits May 11, 2026 21:47
Add a mitata-based script that executes the built ccusage CLI with Node. The default bounded benchmark runs dist/index.js --offline --json and ignores stdout so timings measure command work rather than terminal rendering.

The harness reports Node version, built CLI size, arguments, and bounded latency statistics. It also supports custom ccusage arguments after -- and --full for mitata summary output when the target is light enough for mitata default sample tuning.

Performance notes: on an empty Claude data directory the bounded smoke benchmark measured 58.99 ms for one sample with dist/index.js at 125.66 KiB. The real-log baseline should be compared against origin/main under the same built Node CLI conditions, not against bunx or an interrupted run.
Remove the eager timestamp sort from daily and session loading. That sort scanned every JSONL file to find an earliest timestamp and then parsed all files again, which doubled the read cost for large Claude histories.

Deduplication now tracks the retained entry index for each message/request hash and replaces it only when a later parse finds an older timestamp. This keeps the existing oldest-duplicate-wins behaviour without requiring file-level chronological pre-sorting.
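A sketch of that retained-index dedupe (hash and entry shapes are illustrative):

```typescript
type Parsed = { hash: string; timestamp: string };

/**
 * Single-pass dedupe: remember the output index of the retained entry
 * per hash and replace it only when a later parse yields an older
 * timestamp. This keeps oldest-duplicate-wins without pre-sorting files
 * chronologically (which previously forced a second full read).
 */
export function dedupeOldestWins(entries: Parsed[]): Parsed[] {
  const retained = new Map<string, number>(); // hash -> index in `out`
  const out: Parsed[] = [];
  for (const e of entries) {
    const at = retained.get(e.hash);
    if (at === undefined) {
      retained.set(e.hash, out.length);
      out.push(e);
    } else if (e.timestamp < out[at].timestamp) {
      // ISO-8601 strings compare chronologically; keep the oldest.
      out[at] = e;
    }
  }
  return out;
}
```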

Date grouping now reuses Intl.DateTimeFormat instances by timezone and locale instead of constructing a formatter for every usage entry.
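The formatter-reuse idea, as a sketch (the cache keying is an assumption):

```typescript
/**
 * Intl.DateTimeFormat construction is expensive; cache one instance per
 * timezone/locale pair instead of building a formatter per usage entry.
 */
const formatterCache = new Map<string, Intl.DateTimeFormat>();

export function getFormatter(timezone?: string, locale?: string): Intl.DateTimeFormat {
  const key = `${timezone ?? "local"}|${locale ?? "default"}`;
  let fmt = formatterCache.get(key);
  if (fmt === undefined) {
    fmt = new Intl.DateTimeFormat(locale, {
      timeZone: timezone,
      year: "numeric",
      month: "2-digit",
      day: "2-digit",
    });
    formatterCache.set(key, fmt);
  }
  return fmt;
}
```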

Performance: with 3,119 JSONL files and 399,396 lines of real Claude logs, built Node CLI ccusage --offline --json improved from origin/main /usr/bin/time avg 12.86 s over 3 runs (12.94, 12.79, 12.86) to this branch avg 5.13 s over 3 runs (5.21, 5.05, 5.14), about 2.5x faster on Node v24.15.0. Bundle impact stayed small: dist/index.js remained 128.67 kB / 31.06 kB gzip; total dist changed from 569.93 kB to 570.72 kB.
Filter JSONL files by mtime when --since is provided, using a one-day buffer so date-range reports can avoid scanning cold historical files. Also skip lines without the required input_tokens marker before JSON.parse/Valibot validation in the ccusage loaders.
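Both prechecks are cheap guards ahead of the expensive work; a sketch (the marker string follows the input_tokens mention above, the rest is illustrative):

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

/**
 * Skip files last modified before the --since date minus a one-day
 * buffer; the buffer guards against entries written shortly after the
 * file's last relevant usage line.
 */
export function fileMayContainRange(mtimeMs: number, sinceMs: number): boolean {
  return mtimeMs >= sinceMs - DAY_MS;
}

/** Cheap substring check before paying for JSON.parse and validation. */
export function lineLooksLikeUsage(line: string): boolean {
  return line.includes('"input_tokens"');
}
```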

This incorporates the low-risk file filtering and line precheck ideas from PR #869 and the date-pruning direction from PR #877, scoped to ccusage.

Benchmarks on local real Claude logs, Node v24.15.0, built CLI, 3 samples:

- ccusage --offline --json: 4.50s avg (min 3.99s, max 5.49s)

- ccusage --offline --json --since 20260501: 261.14ms avg (min 235.00ms, max 308.73ms)

Verification:

- pnpm run format

- pnpm typecheck

- pnpm run test

- pnpm --filter ccusage build

Co-authored-by: pbuchman <368465+pbuchman@users.noreply.github.com>

Co-authored-by: jleechan2015 <13840161+jleechan2015@users.noreply.github.com>
Add an internal minUpdateTime loader option and use it from statusline so today cost only scans files touched since local midnight and active block discovery only scans files touched in the last 24 hours.
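The two windows can be sketched as (helper names are illustrative, not the PR's API):

```typescript
/** Local midnight for the "today cost" minUpdateTime boundary. */
export function localMidnightMs(now: Date): number {
  const d = new Date(now.getTime());
  d.setHours(0, 0, 0, 0);
  return d.getTime();
}

/** 24-hour lookback for active block discovery. */
export function activeBlockWindowStartMs(nowMs: number): number {
  return nowMs - 24 * 60 * 60 * 1000;
}
```

Files whose mtime falls before the relevant boundary are skipped entirely, so statusline no longer scans the full history on every invocation.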

This incorporates the statusline time-window pruning idea from PR #623 without bringing in its broader cache implementation.

Verification:

- pnpm run format

- pnpm typecheck

- pnpm run test

- pnpm --filter ccusage build

- cat apps/ccusage/test/statusline-test-sonnet4.json | node apps/ccusage/dist/index.js statusline --offline

Note: pnpm --filter ccusage test:statusline:sonnet4 currently fails before this change on Node v24 because node ./src/index.ts hits ERR_IMPORT_ATTRIBUTE_MISSING for package.json; the built CLI smoke above passes.

Co-authored-by: Szpadel <1857251+Szpadel@users.noreply.github.com>
Prefer the duplicate message/request entry with the largest token total instead of keeping the chronologically oldest record. Claude logs can contain an initial partial usage record followed by a more complete one with the same message id and request id; keeping the partial record under-counts daily, monthly and blocks output.

Apply the same replacement rule to session block aggregation and cover the behaviour with regression tests for daily and blocks reports.
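The replacement rule reduces to a predicate over token totals; a sketch (usage field names are assumptions):

```typescript
type Usage = { input: number; output: number; cacheCreate: number; cacheRead: number };

export function totalTokens(u: Usage): number {
  return u.input + u.output + u.cacheCreate + u.cacheRead;
}

/**
 * Among duplicate message/request records, a later, more complete record
 * wins only if it counts more tokens; ties keep the first-seen record so
 * output order stays stable.
 */
export function shouldReplace(kept: Usage, candidate: Usage): boolean {
  return totalTokens(candidate) > totalTokens(kept);
}
```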
Parse common usage JSONL rows with a lightweight hot path before falling back to JSON parsing and structural validation. Daily loading now reads JSONL files concurrently up to the available CPU parallelism capped at 16, then applies dedupe in file order so output ordering and duplicate handling stay stable.
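The concurrency half of this can be sketched as an order-preserving bounded map (not the PR's actual loader): results land in their original slots, so dedupe can still run over them in file order afterwards.

```typescript
import { availableParallelism } from "node:os";

/**
 * Run `fn` over `items` with at most `limit` in flight, writing each
 * result into its original index so output order matches input order.
 */
export async function mapBounded<T, R>(
  items: T[],
  fn: (item: T) => Promise<R>,
  limit = Math.min(availableParallelism(), 16),
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // single-threaded JS: safe between awaits
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}
```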

Benchmark: mitata on Apple M3 Pro, node 24.15.0, built JS CLI at apps/ccusage/dist/index.js, Rust release CLI at /Users/ryoppippi/ghq/github.com/ryoppippi/ccusage/target/release/ccusage, LOG_LEVEL=0, --offline --json, min_samples=5.

Results: JS daily 3.85s, session 3.99s, monthly 3.94s, blocks 8.36s. Rust daily 630.05ms, session 643.02ms, monthly 622.38ms, blocks 649.11ms.

Previous local baseline after duplicate-entry fixes: JS daily 4.09s, session 3.91s, monthly 4.00s, blocks 8.60s; Rust daily 641ms, session 701ms, monthly 694ms, blocks 644ms.

Correctness: built JS and Rust outputs matched for daily/session/monthly/blocks token totals after sorting by stable keys. Cost totals can still differ by floating-point formatting only.

Verification: pnpm run format; pnpm typecheck; pnpm --filter ccusage exec vitest run src/data-loader.ts; pnpm --filter ccusage run build; pnpm run test.

coderabbitai Bot commented May 12, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ac5cf3cc-57aa-4540-8296-1eed89d64f89

📥 Commits

Reviewing files that changed from the base of the PR and between c874456 and 36bd6fa.

⛔ Files ignored due to path filters (2)
  • flake.lock is excluded by !**/*.lock
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (95)
  • .github/scripts/e2e-setup.mjs
  • .github/scripts/upsert-pr-comment.ts
  • .github/workflows/ccusage-perf.yaml
  • .github/workflows/ci.yaml
  • CLAUDE.md
  • apps/amp/package.json
  • apps/amp/src/commands/daily.ts
  • apps/amp/src/commands/monthly.ts
  • apps/amp/src/commands/session.ts
  • apps/amp/src/data-loader.ts
  • apps/ccusage/CLAUDE.md
  • apps/ccusage/README.md
  • apps/ccusage/config-schema.json
  • apps/ccusage/package.json
  • apps/ccusage/scripts/bench.mjs
  • apps/ccusage/scripts/compare-pr-performance.ts
  • apps/ccusage/scripts/generate-json-schema.ts
  • apps/ccusage/scripts/generate-large-fixture.ts
  • apps/ccusage/src/_date-utils.ts
  • apps/ccusage/src/_env.ts
  • apps/ccusage/src/_jq-processor.ts
  • apps/ccusage/src/_pricing-fetcher.ts
  • apps/ccusage/src/_session-blocks.ts
  • apps/ccusage/src/_shared-args.ts
  • apps/ccusage/src/cli.ts
  • apps/ccusage/src/commands/_session_id.ts
  • apps/ccusage/src/commands/blocks.ts
  • apps/ccusage/src/commands/daily.ts
  • apps/ccusage/src/commands/monthly.ts
  • apps/ccusage/src/commands/session.ts
  • apps/ccusage/src/commands/statusline.ts
  • apps/ccusage/src/commands/weekly.ts
  • apps/ccusage/src/data-loader.ts
  • apps/ccusage/src/debug.ts
  • apps/ccusage/src/index.ts
  • apps/ccusage/src/logger.ts
  • apps/ccusage/src/main.bun.ts
  • apps/ccusage/src/main.node.ts
  • apps/ccusage/src/main.ts
  • apps/ccusage/test/cli-output.test.ts
  • apps/ccusage/test/fixtures/claude/projects/project-alpha/session-alpha/chat.jsonl
  • apps/ccusage/test/fixtures/claude/projects/project-beta/session-beta/chat.jsonl
  • apps/ccusage/test/snapshots/cli-output/blocks-json.txt
  • apps/ccusage/test/snapshots/cli-output/blocks-table.txt
  • apps/ccusage/test/snapshots/cli-output/daily-json.txt
  • apps/ccusage/test/snapshots/cli-output/daily-table.txt
  • apps/ccusage/test/snapshots/cli-output/monthly-json.txt
  • apps/ccusage/test/snapshots/cli-output/session-json.txt
  • apps/ccusage/test/snapshots/cli-output/session-table.txt
  • apps/ccusage/test/snapshots/cli-output/weekly-json.txt
  • apps/ccusage/tsdown.config.ts
  • apps/codex/README.md
  • apps/codex/package.json
  • apps/codex/src/daily-report.ts
  • apps/codex/src/data-loader.ts
  • apps/codex/src/monthly-report.ts
  • apps/codex/src/session-report.ts
  • apps/opencode/README.md
  • apps/opencode/package.json
  • apps/opencode/src/commands/daily.ts
  • apps/opencode/src/commands/monthly.ts
  • apps/opencode/src/commands/weekly.ts
  • apps/pi/package.json
  • apps/pi/src/data-loader.ts
  • docs/.vitepress/config.ts
  • docs/CLAUDE.md
  • docs/guide/cli-options.md
  • docs/guide/codex/index.md
  • docs/guide/config-files.md
  • docs/guide/configuration.md
  • docs/guide/environment-variables.md
  • docs/guide/getting-started.md
  • docs/guide/installation.md
  • docs/guide/json-output.md
  • docs/guide/library-usage.md
  • docs/guide/monthly-reports.md
  • docs/guide/session-reports.md
  • docs/package.json
  • docs/tsconfig.json
  • docs/typedoc.config.ts
  • docs/update-api-index.ts
  • flake.nix
  • package.json
  • packages/internal/CLAUDE.md
  • packages/internal/package.json
  • packages/internal/src/array.ts
  • packages/internal/src/json-file-state.ts
  • packages/internal/src/logger.ts
  • packages/internal/src/pricing.ts
  • packages/internal/src/sort.ts
  • packages/terminal/package.json
  • packages/terminal/src/table.ts
  • packages/terminal/src/text-width.ts
  • packages/terminal/src/utils.ts
  • pnpm-workspace.yaml

📝 Walkthrough

Walkthrough

This PR implements a comprehensive multi-phase refactoring: replacing consola with a custom logger, removing the --jq feature entirely, standardizing output via async writeStdoutLine, centralizing sorting utilities, refactoring session blocks and pricing logic for performance, rewriting terminal table rendering, introducing a new CLI launcher with Bun auto-run support, and adding performance benchmarking infrastructure including CI workflows and test snapshots.

Changes

Core Infrastructure Refactoring

  • Logger replacement and output standardization (packages/internal/src/logger.ts, apps/ccusage/src/logger.ts): Replaces consola with a custom Logger type (numeric level; typed methods for warn/info/error/debug/trace/log/box). Adds writeLineAsync and writeStdoutLine async helpers that resolve only after the stream write callback completes, enabling reliable stdout in piped scenarios. Exports the new helpers from both modules.
  • Centralized sorting utilities and date refactoring (packages/internal/src/sort.ts, apps/ccusage/src/_date-utils.ts): Creates shared compareStrings, compareStringsDesc, compareStringsByOrder and generic sortByString/sortByDate functions in a new sort module. Refactors date-utils to re-export sortByDate from the internal package, adds createCachedDateFormatter for memoized YYYY-MM-DD formatting, and introduces getDateStringWeek for ISO week computation from date strings.
  • Terminal text width calculation and table rendering (packages/terminal/src/{text-width.ts,table.ts}, packages/terminal/src/utils.ts): Adds getStringWidth(text) with Bun-aware calculation (grapheme segmentation, wide code points, emoji handling, ASCII fast path). Replaces cli-table3 with an in-file fast table renderer using cell truncation/wrapping, ANSI-aware padding, and manual border drawing. Updates formatDateCompact to remove the locale parameter and use a timezone-aware fast path. Switches model deduplication from uniq() to a Set-based approach.
  • JSON-backed persistence and session blocks refactoring (packages/internal/src/json-file-state.ts, apps/ccusage/src/_session-blocks.ts, apps/ccusage/src/commands/statusline.ts): Introduces createJsonFileState<T>(filePath) for persistent JSON state with the Symbol.dispose pattern. Refactors session-blocks to support an optional timestampMs field, adds ms-based helpers (getEntryTimestampMs, floorToHourMs) to avoid repeated Date allocations, and updates block identification to use ms arithmetic. Updates statusline to use createJsonFileState for the semaphore and adds minUpdateTime boundaries to data loaders.
  • Pricing cache refactoring (packages/internal/src/pricing.ts): Extracts tiered pricing into a new calculateTieredCost helper with a fixed 200k-token threshold. Refactors LiteLLMPricingFetcher to maintain a per-model lookup cache storing matched pricing or null for cache misses. Updates getModelPricing to consult the cache first and populate it on direct and case-insensitive matches.
  • Array allocation utilities (packages/internal/src/array.ts): Adds createResultSlots<T>(length) for allocating sparse arrays via .length assignment on empty arrays, enabling pre-allocation for indexed writes.

--jq Feature Removal

  • Remove jq processor module (apps/ccusage/src/_jq-processor.ts, deleted): Removes the processWithJq utility and its tests, which converted JSON to strings, ran the jq CLI via nano-spawn, and handled jq errors.
  • Remove jq from schema and CLI args (apps/ccusage/config-schema.json, apps/ccusage/src/_shared-args.ts): Removes the jq string option from schema defaults and all command sections. Removes the jq CLI argument definition from shared args.
  • Update commands to remove jq processing and use writeStdoutLine (apps/ccusage/src/commands/{daily,monthly,weekly,session,blocks}.ts, apps/ccusage/src/commands/_session_id.ts): Removes processWithJq and Result imports across all commands. Updates useJson to depend only on mergedOptions.json. Replaces logger-based output with async writeStdoutLine for both JSON and table results. Removes conditional --jq post-processing paths. Updates SessionIdContext to remove the optional jq field.

Command Updates: Sorting and Output

  • Amp command sorting updates (apps/amp/src/commands/{daily,monthly,session}.ts, apps/amp/src/data-loader.ts): Imports compareStrings and uses it for sorting by date/month/lastActivity/timestamp instead of localeCompare or epoch comparisons.
  • Codex report sorting updates (apps/codex/src/{daily-report,monthly-report,session-report,data-loader}.ts): Imports compareStrings for sorting summaries by date/month and events by timestamp instead of localeCompare or Date-based comparisons.
  • OpenCode command sorting updates (apps/opencode/src/commands/{daily,monthly,weekly}.ts): Imports compareStrings for sorting results by date/month/week instead of localeCompare.
  • Pi data loader sorting updates (apps/pi/src/data-loader.ts): Imports compareStringsByOrder for order-aware sorting of daily/session/monthly data by date/activity/month.
  • Ccusage sorting and pricing updates (apps/ccusage/src/{debug.ts,_pricing-fetcher.ts}): Updates debug to use collectJsonlFiles instead of a glob pattern. Exports CLAUDE_PROVIDER_PREFIXES.

CLI Architecture and Bun Auto-Run

  • Environment variable constants for Bun control (apps/ccusage/src/_env.ts): Adds exported constants for CCUSAGE_BUN_AUTO_RUN_ENV and CCUSAGE_BUN_AUTO_RUN_DISABLED_VALUE.
  • CLI launcher with Bun auto-run and PATH scanning (apps/ccusage/src/cli.ts): Introduces platform-aware executable detection with Windows PATHEXT handling. Implements runCli(argv), which selects the Node or Bun entrypoint based on Bun availability and the auto-run disable flag and spawns the selected process with inherited stdio; includes Vitest tests for PATH scanning.
  • Main entry points for Node and Bun (apps/ccusage/src/{main.ts,main.node.ts,main.bun.ts}, apps/ccusage/src/index.ts): Introduces an async main() function calling run(). Adds Node and Bun shebang entrypoints with top-level await. Updates index.ts to import and await main().
  • CLI package.json and build config updates (apps/ccusage/package.json, apps/ccusage/tsdown.config.ts): Updates the bin entry to ./src/cli.ts, reduces exports to only . and ./package.json, and removes TypeScript exports. Updates tsdown to target multiple entry files, enable sourcemaps, fully minify, disable dts, add inline const optimization, and control comments.

Performance Benchmarking Infrastructure

  • Performance CI workflow and setup (.github/workflows/ccusage-perf.yaml, .github/scripts/{e2e-setup.mjs,upsert-pr-comment.ts}): Adds a workflow that builds ccusage for base/PR, generates the large fixture, runs the comparison script, and upserts a markdown report. Adds an e2e setup script that parses preview URLs, creates Claude directories, and generates chat.jsonl with 420 usage entries. Adds a Bun script to fetch/update PR comments via the GitHub API.
  • Local and CI benchmark scripts (apps/ccusage/scripts/{bench.mjs,compare-pr-performance.ts}): Adds a Node benchmark runner with mitata sampling and full summary modes. Adds a Bun script that benchmarks base/head ccusage, measures packed sizes, spawns hyperfine, and renders markdown reports with per-fixture tables.
  • Large fixture generation for benchmarking (apps/ccusage/scripts/generate-large-fixture.ts): Adds a Bun script that synthesizes large Claude JSONL fixtures matching a real-world file-size profile, writing deterministic JSONL lines with unique ids/models/tokens/padded content.
  • CLI output test snapshots and fixtures (apps/ccusage/test/{cli-output.test.ts,fixtures/**,snapshots/**}): Adds a test suite running CLI subcommands in offline/JSON and table modes against temporary fixtures. Adds JSONL chat fixtures for test projects. Adds JSON and table snapshot files for all report outputs.

Documentation and Configuration Updates

  • Package.json and runtime engine updates (package.json, apps/*/package.json, docs/package.json): Updates engines.node across all app packages to >=22.0.0. Updates root runtime engines to Node ^24.15.0 and Bun ^1.3.13. Adds @types/bun to the pi package. Removes the docs:api script from docs build and dev targets.
  • Configuration schema and singleThread flag (apps/ccusage/config-schema.json): Adds a singleThread boolean option (default false) to defaults and all command sections to disable parallel JSONL loading.
  • Documentation updates removing the API reference and jq (docs/.vitepress/config.ts, docs/{CLAUDE.md,tsconfig.json,typedoc.config.ts,update-api-index.ts}, docs/guide/**): Removes typedoc config and API index generation. Removes the "API Reference" nav and "Library Usage" guide. Updates CLI options/config examples to use jq piping instead of the --jq flag. Documents the CCUSAGE_BUN_AUTO_RUN environment variable. Updates all docs to clarify LOG_LEVEL (removing the consola reference).
  • Internal package export and documentation updates (packages/internal/{CLAUDE.md,package.json,src/**}, pnpm-workspace.yaml, CLAUDE.md): Exports new modules from the internal package (./array, ./json-file-state, ./sort). Removes the consola dependency. Updates workspace catalog versions. Adds a code-style guideline requiring JSDoc and tests on utility functions.
  • CI workflow and environment setup (.github/workflows/ci.yaml, flake.nix): Updates CI to create Claude directories and run the E2E setup. Adds hyperfine to the Nix dev shell.

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly Related PRs

  • ryoppippi/ccusage#469: Added --jq feature with processWithJq implementation across commands; this PR removes that entire feature by deleting the processor, removing jq from config/args, and updating all commands to output via writeStdoutLine instead.

  • ryoppippi/ccusage#605: Earlier date-utils extraction; this PR continues by refactoring to re-export sortByDate from internal package and adding createCachedDateFormatter/getDateStringWeek.

  • ryoppippi/ccusage#851: Earlier runtime engine migration; this PR updates Bun from ^1.3.9 to ^1.3.13 and Node to ^24.15.0 as part of continued dependency alignment.

Poem

Refactored deeply, the CLI takes flight 🚀
With Bun now swift and sorting done right,
jq bids farewell, while writeStdoutLine shines bright,
New benchmarks measure performance delight! ✨



cloudflare-workers-and-pages Bot commented May 12, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ✅ Deployment successful! | ccusage-guide | 36bd6fa | May 15 2026, 10:12 AM |


pkg-pr-new Bot commented May 12, 2026


@ccusage/amp

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/amp@984

ccusage

npx https://pkg.pr.new/ryoppippi/ccusage@984

@ccusage/codex

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/codex@984

@ccusage/opencode

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/opencode@984

@ccusage/pi

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/pi@984

commit: 36bd6fa

ryoppippi added 19 commits May 12, 2026 02:22
Parallelise JSONL file loading across daily, session, and blocks reports using os.availableParallelism() by default. This keeps the worker count tied to the host CPU instead of a fixed cap, while --single-thread and CCUSAGE_JSONL_READ_CONCURRENCY provide deterministic overrides for benchmarking and constrained environments.

Small JSONL files are now read in one buffered pass, with the existing streaming path retained for files larger than 32 MiB. Cost calculation also keeps a synchronous fast path when --offline or cached costUSD values make async pricing unnecessary.
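The size-based read strategy, sketched (the 32 MiB threshold comes from the text above; helper names are illustrative):

```typescript
import { stat, readFile } from "node:fs/promises";

const STREAM_THRESHOLD = 32 * 1024 * 1024; // 32 MiB

/**
 * One buffered pass for the common (small) case; fall back to the
 * line-by-line streaming reader only for files above the threshold.
 */
export async function readJsonlLines(path: string): Promise<string[]> {
  const { size } = await stat(path);
  if (size <= STREAM_THRESHOLD) {
    return (await readFile(path, "utf8")).split("\n").filter((l) => l.length > 0);
  }
  return readJsonlLinesStreaming(path);
}

async function readJsonlLinesStreaming(path: string): Promise<string[]> {
  const { createReadStream } = await import("node:fs");
  const { createInterface } = await import("node:readline");
  const lines: string[] = [];
  const rl = createInterface({ input: createReadStream(path), crlfDelay: Infinity });
  for await (const line of rl) {
    if (line.length > 0) lines.push(line);
  }
  return lines;
}
```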

Dedupe is still applied after each file result is collected, preserving the original file-order replacement behaviour and keeping JS/Rust token totals identical for daily, session, monthly, and blocks JSON output.

Benchmark on Apple M3 Pro, Node 24.15.0, os.availableParallelism()=11, built JS CLI at apps/ccusage/dist/index.js, Rust release CLI at /Users/ryoppippi/ghq/github.com/ryoppippi/ccusage/target/release/ccusage, LOG_LEVEL=0, --offline --json, mitata min_samples=5:

JS: daily 3.51s, session 3.07s, monthly 3.17s, blocks 6.51s. Rust: daily 640.16ms, session 639.67ms, monthly 660.97ms, blocks 649.93ms. Previous JS PR baseline from bd7d2d4: daily 3.85s, session 3.99s, monthly 3.94s, blocks 8.36s.

Verified with pnpm run format, pnpm typecheck, pnpm --filter ccusage exec vitest run src/data-loader.ts, pnpm --filter ccusage run build, pnpm run test, and JS/Rust token-total parity checks for daily/session/monthly/blocks.
Load session block JSONL files once and collect each file earliest timestamp during the same pass that parses usage entries. This removes the previous sortFilesByTimestamp pre-scan from blocks, which read every file before reading the same files again for aggregation.

The standalone getEarliestTimestamp helper now shares the same fast timestamp extraction path, falling back to JSON.parse only for non-compact timestamp lines.

Benchmark on Apple M3 Pro, built JavaScript CLI, LOG_LEVEL=0, blocks --offline --json, hyperfine --warmup 1 --runs 5. Process checks showed no hyperfine/mitata/zig benchmark process before the run; cmux and WindowServer were active, so treat the absolute number as local noisy data.

JS blocks current: 3.347s ± 0.044s. Previous JS baseline from f5020ef was 6.51s, so this is about 1.95x faster for blocks.

Verified with pnpm run format, pnpm typecheck, pnpm --filter ccusage exec vitest run src/data-loader.ts, pnpm --filter ccusage run build, pnpm run test, and JS/Rust token-total parity checks for daily/session/monthly/weekly/blocks.
Allow the string-field parser to handle normal usage lines that include message.content. The local Claude data set has 162,838 usage lines and all of them include content, so the previous default sent effectively every usage line through JSON.parse before the lightweight object validation path.

API error lines still fall back to the full JSON path because usage limit reset extraction needs message content, and malformed non-array content still falls back instead of being accepted by the fast parser.

Reliable benchmark data is not included in this commit. An attempted hyperfine run was aborted after a concurrent Zig build started, and subsequent process checks still showed cmux at roughly 190-200% CPU. The previous measured JS blocks result remains 3.347s ± 0.044s from 6e876b6.

Verified with pnpm run format, pnpm typecheck, pnpm --filter ccusage exec vitest run src/data-loader.ts, pnpm --filter ccusage run build, pnpm run test, and JS/Rust token-total parity checks for daily/session/monthly/weekly/blocks.
Replace tinyglobby usage in the ccusage hot path with a small native readdir-based JSONL walker. The loader now avoids starting the glob engine for daily, session-by-id, block, and debug mismatch discovery, keeps deterministic file order with path sorting, and removes tinyglobby from the ccusage app dependencies.

Reliable runtime benchmark data is not included in this commit because process checks still showed cmux at roughly 150% CPU and another node ccusage process. The built bundle remains 515.86 kB total after this change.

Verified with pnpm run format, pnpm typecheck, pnpm --filter ccusage exec vitest run src/data-loader.ts src/debug.ts, pnpm --filter ccusage run build, pnpm run test, and JS/Rust token-total parity checks for daily/session/monthly/weekly/blocks.
Cache formatted usage dates per UTC hour while loading reports. The hot path previously created a Date and formatted through Intl for every usage line; daily, session, and block loading now reuse a per-command formatter/cache so repeated entries in the same hour do not pay that cost again.
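The cache can be sketched as follows, assuming an Intl-based date label keyed by UTC-hour bucket (locale, option set, and names here are assumptions, not the project's formatter):

```typescript
// Per-UTC-hour date label cache: one Intl format call per hour bucket
// instead of one per usage line.
function createHourlyDateFormatter(timeZone: string) {
	const formatter = new Intl.DateTimeFormat('en-CA', {
		timeZone,
		year: 'numeric',
		month: '2-digit',
		day: '2-digit',
	});
	const cache = new Map<number, string>();
	return (timestampMs: number): string => {
		const hourBucket = Math.floor(timestampMs / 3_600_000);
		let label = cache.get(hourBucket);
		if (label === undefined) {
			label = formatter.format(timestampMs);
			cache.set(hourBucket, label);
		}
		return label;
	};
}
```

For zones whose UTC offset is a whole number of hours, such as the four checked in the parity run, a UTC-hour bucket always falls within a single local day, so the cache never returns a stale date label.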

Reliable runtime benchmark data is not included in this commit because process checks still showed cmux at roughly 150% CPU and Arc at roughly 60% CPU. The built bundle is 516.30 kB total after this change.

Verified with pnpm run format, pnpm typecheck, pnpm --filter ccusage exec vitest run src/data-loader.ts src/_date-utils.ts src/debug.ts, pnpm --filter ccusage run build, pnpm run test, JS/Rust token-total parity checks for daily/session/monthly/weekly/blocks, and timezone parity checks for UTC, Asia/Tokyo, America/New_York, and Europe/London.
Replace the repeated field-specific null scans in the usage-line fast parser with one unsupported-null regular expression. This preserves the same fallback for fields that the fast parser cannot safely treat as absent, while avoiding a dozen full-line includes checks on the common path.

Local data has many unrelated JSON nulls such as stop_sequence and stop_reason; the unsupported-null regex only matched speed:null in 283 lines, so normal content-bearing usage lines still stay on the fast path.

Reliable runtime benchmark data is not included in this commit because process checks still showed cmux around 180% CPU. The built bundle is 516.16 kB total after this change.

Verified with pnpm run format, pnpm typecheck, pnpm --filter ccusage exec vitest run src/data-loader.ts src/_date-utils.ts src/debug.ts, pnpm --filter ccusage run build, pnpm run test, JS/Rust token-total parity checks for daily/session/monthly/weekly/blocks, and timezone parity checks for UTC, Asia/Tokyo, America/New_York, and Europe/London.
Avoid allocating a fresh totals object for every usage entry while aggregating reports. Model aggregation and report totals now mutate per-group accumulators, reducing allocation and GC pressure in daily, monthly, weekly, session, and blocks summaries.
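The shape of the change, as a minimal sketch (field names are placeholders for the real totals shape):

```typescript
type TokenTotals = { inputTokens: number; outputTokens: number; costUSD: number };

// One mutable accumulator per group: no fresh totals object (or spread
// copy) allocated per usage entry on the aggregation hot path.
function sumTotals(entries: readonly TokenTotals[]): TokenTotals {
	const acc: TokenTotals = { inputTokens: 0, outputTokens: 0, costUSD: 0 };
	for (const entry of entries) {
		acc.inputTokens += entry.inputTokens;
		acc.outputTokens += entry.outputTokens;
		acc.costUSD += entry.costUSD;
	}
	return acc;
}
```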

Reliable runtime benchmark data is not included in this commit because process checks still showed cmux around 210% CPU. The built bundle is 515.99 kB total after this change.

Verified with pnpm run format, pnpm typecheck, pnpm --filter ccusage exec vitest run src/data-loader.ts src/_date-utils.ts src/debug.ts, pnpm --filter ccusage run build, pnpm run test, JS/Rust token-total parity checks for daily/session/monthly/weekly/blocks, and timezone parity checks for UTC, Asia/Tokyo, America/New_York, and Europe/London. Commit hook was skipped because lint-staged hit pnpm workspace-state JSON parsing errors after these checks had already passed manually.
No file changes. This records a directional benchmark after commits 8051465, 641f7a1, 335331e, 5e2fefd, and 99e1607.

Measured on May 12, 2026 with comma-provided hyperfine, Apple M3 Pro, built JavaScript CLI via node, Rust release binary, LOG_LEVEL=0, --offline --json, --warmup 1 --runs 3. Process checks still showed cmux around 180% CPU, so these are noisy and should not be treated as final release numbers.

daily: JS 3.247s ± 0.206s, Rust 515.2ms ± 17.4ms, Rust 6.30x faster.

session: JS 3.120s ± 0.020s, Rust 504.5ms ± 52.8ms, Rust 6.19x faster.

monthly: JS 3.115s ± 0.057s, Rust 551.7ms ± 14.8ms, Rust 5.65x faster.

blocks: JS 3.407s ± 0.018s, Rust 536.0ms ± 33.9ms, Rust 6.36x faster; hyperfine reported statistical outliers.
Replace the separate per-group passes for model breakdowns, totals, and model lists with a single summarizeUsageEntries pass. This keeps Object.groupBy, which benchmarked better than a direct mutable group Map variant on the current workload, while avoiding repeated scans inside each daily and session group.

Benchmark: noisy local run with cmux around 185% CPU, hyperfine --warmup 1 --runs 3, built JS via node vs Rust release. daily: JS 3.090s ± 0.010s, Rust 582.0ms ± 22.1ms, Rust 5.31x faster. session: JS 3.413s ± 0.475s, Rust 614.3ms ± 20.8ms, Rust 5.56x faster. monthly: JS 3.106s ± 0.027s, Rust 618.5ms ± 45.6ms, Rust 5.02x faster. blocks: JS 3.525s ± 0.130s, Rust 633.4ms ± 19.4ms, Rust 5.57x faster.

Bundle size after build: total 515.18 kB, dist/index.js 128.62 kB, data-loader chunk 175.10 kB.

Validation: pnpm run format; pnpm typecheck; pnpm --filter ccusage exec vitest run src/data-loader.ts; pnpm run test; pnpm --filter ccusage run build; JS/Rust JSON parity for daily/session/monthly/weekly/blocks.
Avoid parsing each JSONL line timestamp before parsing usage data in loadSessionBlockData. Valid usage rows now reuse data.timestamp for file ordering and block entry construction, while non-usage and malformed usage lines still use the existing timestamp fallback.

Benchmark: noisy local blocks-only run with cmux around 175% CPU, hyperfine --warmup 1 --runs 5, built JS via node vs Rust release. blocks: JS 3.985s ± 0.153s, Rust 764.9ms ± 103.4ms, Rust 5.21x faster. Treat the absolute numbers as noisy due to system load.

Bundle size after build: total 515.58 kB, dist/index.js 128.62 kB, data-loader chunk 175.50 kB.

Validation: pnpm run format; pnpm typecheck; pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern loadSessionBlockData; pnpm run test; pnpm --filter ccusage run build; JS/Rust JSON parity for daily/session/monthly/weekly/blocks.
Refs #984.

Keep JSONL line callbacks synchronous on the immediate-cost path and collect the rare deferred cost calculations per file. This avoids allocating an async callback promise for every parsed usage row while preserving entry order with sparse slots for deferred rows.

Noisy benchmark with cmux around 217% CPU after build: js daily 3.379s ± 0.295s, rust daily 441.7ms ± 13.9ms; js session 3.970s ± 0.962s, rust session 438.0ms ± 35.6ms; js blocks 3.344s ± 0.009s, rust blocks 496.7ms ± 32.6ms.

Validation: pnpm run format; pnpm typecheck; pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern loadDailyUsageData|loadSessionData|loadSessionBlockData|loadSessionUsageById; pnpm --filter ccusage run build; pnpm run test; built JS/Rust JSON token and cost parity matched for daily, session, monthly, weekly, and blocks.
Refs #984.

Replace per-line trim() checks in JSONL loading with a char-code whitespace scan. Normal usage rows no longer allocate a trimmed string before parsing, while blank and whitespace-only lines are still skipped.
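A sketch of the allocation-free blank check (the exact whitespace set the loader accepts is an assumption; space, tab, CR, and LF cover what survives newline splitting):

```typescript
// Blank-line detection via char codes: no trimmed string allocated per row.
function isBlankLine(line: string): boolean {
	for (let i = 0; i < line.length; i++) {
		const code = line.charCodeAt(i);
		// 0x20 space, 0x09 tab, 0x0D CR, 0x0A LF
		if (code !== 0x20 && code !== 0x09 && code !== 0x0d && code !== 0x0a) {
			return false;
		}
	}
	return true;
}
```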

Noisy benchmark with cmux around 142% CPU and Arc around 92% CPU after build: js daily 3.073s ± 0.012s, rust daily 467.0ms ± 83.9ms; js blocks 3.385s ± 0.008s, rust blocks 457.2ms ± 65.8ms.

Validation: pnpm run format; pnpm typecheck; pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern processJSONLFileByLine|loadDailyUsageData|loadSessionData|loadSessionBlockData; pnpm --filter ccusage run build; pnpm run test; built JS/Rust JSON token and cost parity matched for daily, session, monthly, weekly, and blocks.
Refs #984.

Use node worker_threads for built JavaScript usage loading when at least 64 JSONL files are present. The worker count follows os.availableParallelism() - 1 by default, respects --single-thread, and can be disabled or capped with CCUSAGE_JSONL_WORKER_THREADS. Source and vitest execution keep the existing single-process path.
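The sizing decision can be sketched like this, using the threshold and environment variable named in this commit (the exact clamping in the real loader may differ, and a later commit caps the default at 4):

```typescript
import { availableParallelism } from 'node:os';

// Returns 0 to mean "stay on the single-process path".
function resolveWorkerCount(fileCount: number, singleThread: boolean): number {
	if (singleThread || fileCount < 64) return 0; // small corpora skip workers
	const override = process.env.CCUSAGE_JSONL_WORKER_THREADS;
	if (override !== undefined) {
		const parsed = Number.parseInt(override, 10);
		if (Number.isFinite(parsed) && parsed >= 0) return parsed; // 0 disables
	}
	return Math.max(1, availableParallelism() - 1);
}
```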

Noisy no-worker comparison with cmux around 169% CPU: workers daily 2.038s ± 0.046s vs no-workers daily 3.114s ± 0.089s; workers session 2.098s ± 0.054s vs no-workers session 3.110s ± 0.009s; workers blocks 2.350s ± 0.161s vs no-workers blocks 3.378s ± 0.010s.

Noisy Rust comparison after build: workers daily 2.030s ± 0.065s vs rust daily 437.4ms ± 16.7ms; workers session 2.071s ± 0.030s vs rust session 464.9ms ± 11.5ms; workers blocks 2.215s ± 0.038s vs rust blocks 566.3ms ± 31.6ms. Rust remains about 4.6-5.1x faster.

Build size: pnpm --filter ccusage run build reports 520.18 kB total, with dist/index.js 128.62 kB and the data-loader chunk 180.11 kB.

Validation: pnpm run format; pnpm typecheck; pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern processJSONLFileByLine|loadDailyUsageData|loadSessionData|loadSessionBlockData; pnpm --filter ccusage run build; pnpm run test; built JS/Rust JSON token and cost parity matched for daily, session, monthly, weekly, and blocks.
Refs #984.

Stop sending full UsageData objects from usage worker threads. Workers now return compact entries containing usage, model, cost, version, and dedupe metadata only, and the main thread performs replacement decisions from token totals and speed presence. This reduces structured clone payloads for built JavaScript worker loading.

Cap the default worker count at 4 while still considering os.availableParallelism() and file count. A short noisy worker-count sweep showed daily w4 1.810s ± 0.024s, w6 1.871s ± 0.072s, w8 1.915s ± 0.048s, default-before-cap 2.078s ± 0.089s; blocks w4 1.972s ± 0.021s, w6 2.117s ± 0.170s, w8 2.079s ± 0.076s, default-before-cap 2.175s ± 0.050s.

Latest noisy comparison after the cap with cmux around 169% CPU: js daily 2.091s ± 0.064s, no-workers daily 3.073s ± 0.013s, rust daily 402.5ms ± 110.9ms; js session 2.383s ± 0.253s, rust session 561.0ms ± 71.5ms; js blocks 2.281s ± 0.094s, no-workers blocks 3.783s ± 0.267s, rust blocks 579.5ms ± 34.8ms.

Build size: pnpm --filter ccusage run build reports 520.66 kB total, with dist/index.js 128.62 kB and the data-loader chunk 180.59 kB.

Validation: pnpm run format; pnpm typecheck; pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern processJSONLFileByLine|loadDailyUsageData|loadSessionData|loadSessionBlockData; pnpm --filter ccusage run build; pnpm run test; built JS/Rust JSON token and cost parity matched for daily, session, monthly, weekly, and blocks.
Raise the buffered JSONL read threshold from 32 MiB to 128 MiB so large local Claude logs avoid the readline streaming path. CPU profiling showed the previous threshold pushed the largest files through string_decoder/readline newline regex work, while the Rust implementation reads these files whole.

Local data shape:

- 3 JSONL files exceed 32 MiB: 87.0 MiB, 69.5 MiB, and 43.9 MiB

- total local JSONL corpus: 3124 files, 1.2G

Validation:

- pnpm run format

- pnpm typecheck

- pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern processJSONLFileByLine|loadDailyUsageData|loadSessionData|loadSessionBlockData

- pnpm --filter ccusage run build

- pnpm run test

- JS/Rust JSON token and cost parity: daily, session, monthly, weekly, blocks

Benchmarks (noisy; cmux ~173% CPU):

- JS daily: 1.434s ± 0.050s

- JS daily no-worker: 2.887s ± 0.008s

- Rust daily: 396.4ms ± 21.8ms

- JS session: 1.537s ± 0.038s

- JS session no-worker: 2.942s ± 0.006s

- Rust session: 387.3ms ± 23.3ms

- JS blocks: 1.705s ± 0.138s

- JS blocks no-worker: 3.190s ± 0.015s

- Rust blocks: 392.0ms ± 14.6ms
CPU profiling showed the unsupported-null regular expression still ran for every fast-parser candidate. Guard it with a cheap ':null' substring check so normal usage rows stay on simple string scans and only possible null-bearing rows pay for the regex fallback check.
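The guard ordering looks like this in sketch form (the field alternation in the regex is a placeholder; the real unsupported-field set lives in the fast parser):

```typescript
// Placeholder for the real unsupported-null alternation.
const UNSUPPORTED_NULL_RE = /"(?:usage|message)":null/;

function hasUnsupportedNull(line: string): boolean {
	// Cheap substring test first: most usage rows contain no ':null' at all,
	// so they never reach the regex.
	if (!line.includes(':null')) return false;
	return UNSUPPORTED_NULL_RE.test(line);
}
```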

Validation:

- pnpm run format

- pnpm typecheck

- pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern processJSONLFileByLine|loadDailyUsageData|loadSessionData|loadSessionBlockData|parseUsageDataLine

- pnpm --filter ccusage run build

- pnpm run test

- JS/Rust JSON token and cost parity: daily, session, monthly, weekly, blocks

Benchmarks (noisy; cmux ~178% CPU):

- JS daily: 1.396s ± 0.051s

- JS session: 1.478s ± 0.059s

- JS blocks: 1.466s ± 0.002s

- Rust daily: 353.0ms ± 6.8ms

- Rust session: 367.8ms ± 30.3ms

- Rust blocks: 418.6ms ± 48.0ms
Cache successful and missing LiteLLM model pricing lookups by model name. The ccusage hot path calculates token cost for every row when costUSD is absent, so this avoids rebuilding provider-prefix candidates and rescanning the pricing map for repeated models.
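A sketch of the memoization shape, including the confirmed-miss case (names and pricing fields are assumptions; the real helper resolves provider-prefixed candidates against LiteLLM data):

```typescript
type ModelPricing = { inputCostPerToken: number; outputCostPerToken: number };

function createPricingCache(
	lookup: (model: string) => ModelPricing | undefined,
) {
	// `null` marks a confirmed miss so repeated unknown models skip the rescan.
	const cache = new Map<string, ModelPricing | null>();
	return (model: string): ModelPricing | undefined => {
		let cached = cache.get(model);
		if (cached === undefined) {
			cached = lookup(model) ?? null;
			cache.set(model, cached);
		}
		return cached ?? undefined;
	};
}
```

Caching misses matters as much as caching hits here: an unknown model name otherwise rebuilds its candidate list on every row.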

Validation: pnpm run format; pnpm typecheck; pnpm --filter @ccusage/internal exec vitest run src/pricing.ts; pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern loadDailyUsageData|loadSessionData|loadSessionBlockData|processJSONLFileByLine|loadSessionUsageById|calculateCostForEntry; pnpm --filter ccusage run build; pnpm run test. JS/Rust parity matched for daily/session/monthly/weekly/blocks token and cost fields.

A/B benchmark with worker-enabled built dist paths, local Claude data, LOG_LEVEL=0, --offline --json, hyperfine --warmup 1 -r 5: daily 1.369s ± 0.070s -> 1.291s ± 0.005s; session 1.414s ± 0.053s -> 1.379s ± 0.017s; blocks 1.547s ± 0.093s -> 1.430s ± 0.021s. Bundle total 520.69 kB -> 521.06 kB.
Switch tsdown from dce-only minification to full minification for the ccusage bundled CLI. This keeps the runtime code path unchanged while substantially reducing the distributed JavaScript payload.

Bundle size: pnpm --filter ccusage run build total 521.06 kB -> 350.81 kB. Main entry 128.62 kB -> 68.14 kB; data-loader chunk 180.98 kB -> 108.98 kB.

Benchmark note: hyperfine A/B was run with both base and minified builds under heavy Chrome/Arc system load, using paths that preserve the /dist/ worker-thread gate. Results were noisy but did not show a stable regression: daily 4.902s -> 4.794s, session 8.933s noisy/outlier -> 4.542s, blocks 4.586s -> 4.680s.

Validation: pnpm run format; pnpm typecheck; pnpm run test; pnpm --filter ccusage run build; JS/Rust token and cost parity matched for daily/session/monthly/weekly/blocks in offline mode.
Replace the unsupported-null fast parser regex with a targeted scan over :null occurrences. This avoids running a large alternation regexp over every line that contains null while preserving the fallback behaviour for nullable fields that the fast path cannot represent.

On the local Claude JSONL dataset, the isolated null-field check improved from regex median 615.557ms to scan median 370.318ms over 171,025 :null lines. The built JS bundle remains about 351.08 kB total.

Validated with pnpm run format, pnpm typecheck, pnpm run test, pnpm --filter ccusage run build, and JS/Rust token and cost parity for daily, session, monthly, weekly, and blocks.
ryoppippi added 3 commits May 15, 2026 06:22
Daily worker results are also emitted per JSONL file, so project is identical for every encoded row in that payload. Store project once on the encoded object and keep only date/model/hash in the per-row string side array.

Built Bun A/B on real local Claude data with LOG_LEVEL=0 COLUMNS=200 and hyperfine --warmup 3 --runs 10 measured daily --offline --json at 342.7ms +/- 9.7ms before and 331.3ms +/- 11.4ms after, about 1.03x faster. A swapped-order rerun measured 333.6ms +/- 10.9ms after versus 353.1ms +/- 18.6ms before. Session stayed tied/noisy. Build output is 243.76 kB, still below the pre-worker-metadata baseline.

Output parity matched for daily/session/monthly/weekly byte-for-byte, and blocks matched after removing the time-dependent projection field. Validation: pnpm run format, targeted daily/session data-loader tests, pnpm typecheck, pnpm run test, and pnpm --filter ccusage build.
Inline the daily/session global dedupe merge that runs after JSONL workers return their per-file payloads. The previous generic helper shape still did one Map lookup and then called a second helper for new rows; keeping the replacement check directly in the merge closure avoids those hot-path calls and lets the unused generic helper drop out of the bundle.

Built Bun A/B on real local Claude data with LOG_LEVEL=0, COLUMNS=200, hyperfine --warmup 3 --runs 10:

- daily --offline --json: 324.1ms +/- 12.2ms baseline vs 320.0ms +/- 7.5ms inline; swapped order 316.5ms +/- 6.0ms baseline vs 321.9ms +/- 9.1ms inline, so daily is treated as tied/noisy.

- session --offline --json: 333.9ms +/- 9.0ms baseline vs 323.6ms +/- 4.1ms inline; swapped order 338.7ms +/- 14.0ms baseline vs 331.2ms +/- 6.3ms inline, about 1.02-1.03x faster.

- Build output: 243.76 kB baseline vs 243.55 kB inline.

Validation: pnpm run format; pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern loadDailyUsageData|loadSessionData|deduplication; pnpm typecheck; pnpm run test; pnpm --filter ccusage build. Output parity matched byte-for-byte for daily/session/monthly/weekly JSON against the previous built dist.
Use an explicit result length counter for the main-thread usage merge arrays instead of Array#push. The merge paths are hot after worker parsing: daily/session append deduped row payloads before grouping, and blocks append LoadedUsageEntry objects before billing-block identification.
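In sketch form, the indexed append looks like this (a generic stand-in for the daily/session/blocks merge arrays):

```typescript
// Append via an explicit length counter instead of Array#push on the
// post-worker merge hot path.
function mergeChunks<T>(chunks: readonly (readonly T[])[]): T[] {
	const merged: T[] = [];
	let length = 0;
	for (const chunk of chunks) {
		for (const item of chunk) {
			merged[length++] = item; // indexed write, no push call per row
		}
	}
	return merged;
}
```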

Built Bun A/B on real local Claude data with LOG_LEVEL=0, COLUMNS=200, hyperfine --warmup 3 --runs 10:

- daily --offline --json: 326.3ms +/- 6.5ms baseline vs 316.2ms +/- 5.8ms indexed; swapped order 327.9ms +/- 19.4ms baseline vs 322.6ms +/- 7.3ms indexed.

- session --offline --json: 328.8ms +/- 4.8ms baseline vs 324.6ms +/- 5.6ms indexed; swapped order 329.7ms +/- 6.9ms baseline vs 322.4ms +/- 8.2ms indexed.

- blocks --offline --json: 367.3ms +/- 4.4ms baseline vs 369.8ms +/- 11.9ms indexed in the first order, then 372.5ms +/- 6.2ms baseline vs 366.7ms +/- 5.0ms indexed in swapped order; treated as tied/slight win.

- Build output: 243.55 kB baseline vs 243.69 kB indexed.

Validation: pnpm run format; pnpm --filter ccusage exec vitest run src/data-loader.ts --testNamePattern loadDailyUsageData|loadSessionData|loadSessionBlockData|deduplication; pnpm typecheck; pnpm run test; pnpm --filter ccusage build. Output parity matched previous built dist for daily/session/monthly/weekly JSON byte-for-byte and blocks JSON after removing time-dependent projection.
ryoppippi added 11 commits May 15, 2026 07:59
Move createResultSlots into @ccusage/internal/array so the sparse indexed-fill allocation helper can be reused outside the ccusage loader. The helper keeps its JSDoc in the internal package and now has focused in-source tests for sparse slots and zero-length allocation.

Add guidance to AGENTS/CLAUDE instructions that utility functions should carry purpose-oriented JSDoc and focused tests. This matches the current perf work where small helpers encode non-obvious allocation or timestamp choices.

Document and test the session-block timestamp helpers. getEntryTimestampMs exists to reuse JSONL parser timestampMs values on hot ordering/grouping paths while preserving Date#getTime fallback behaviour for manually-created entries. floorToHourMs keeps the block-start calculation as integer millisecond arithmetic.
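From the descriptions above, the two helpers can be sketched as follows (signatures inferred from this commit text, not copied from the source):

```typescript
const HOUR_MS = 3_600_000;

// Keep the block-start calculation in integer millisecond arithmetic.
function floorToHourMs(timestampMs: number): number {
	return timestampMs - (timestampMs % HOUR_MS);
}

// Reuse a parser-provided timestampMs on hot ordering/grouping paths,
// falling back to Date#getTime for manually-created entries.
function getEntryTimestampMs(entry: { timestampMs?: number; timestamp: Date }): number {
	return entry.timestampMs ?? entry.timestamp.getTime();
}
```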

Refs #984
Enable tsdown minification for the ccusage build while keeping sourcemaps in local dist output. The published package files list now includes only dist/**/*.js plus config-schema.json, so sourcemaps remain available after local builds but are not included in the package artifact.

The valid /dist-path A/B run measured minified output slightly faster or effectively tied: daily 344.2ms -> 342.0ms, session 368.4ms -> 351.0ms, and blocks 397.9ms -> 389.8ms with bun -b, LOG_LEVEL=0, COLUMNS=200, --offline --json. Current built output is 1.08 MB on disk including maps; packed ccusage artifact is measured separately with pnpm pack.

Refs #984
Update compare-pr-performance to use gunshi for argument parsing and Bun APIs for process execution and file writes. The script now measures the package artifact with pnpm pack instead of raw dist size, which reflects package.json#files and excludes local sourcemaps from the reported download size.

Use fs-fixture for temporary pack directories so cleanup follows the same disposable fixture pattern used elsewhere in tests. The script restores package.json after pnpm pack because the prepack pipeline rewrites it for publish.

Add shebangs and executable bits to ccusage scripts so direct execution is consistent: compare-pr-performance.ts and generate-json-schema.ts use Bun, while bench.mjs keeps the Node runtime it benchmarks through process.execPath.

Self-check: pnpm --filter ccusage exec bun scripts/compare-pr-performance.ts --base-dir ../.. --head-dir ../.. --fixture-dir test/fixtures/claude --warmup 0 --runs 1 completed and produced the perf markdown table. Refs #984
Tested a prototype that merged daily/session encoded worker columns directly into the final dedupe list before materialising discarded duplicate rows.

Correctness was fine: daily, session, monthly, and weekly JSON output matched the current dist build byte-for-byte on real local Claude data with LOG_LEVEL=0 and COLUMNS=200.

The benchmark did not justify the added complexity. First same-run hyperfine pass: daily base 342.1ms vs prototype 368.3ms, session base 365.9ms vs prototype 343.7ms. Second pass with more warmup/runs: daily base 331.4ms vs prototype 332.2ms, session base 341.0ms vs prototype 358.8ms.

Because the result was mixed to slower and the implementation made the worker merge path more indirect, the prototype was reverted before committing code. Keep the simpler existing merge until a direct aggregation approach shows a stable win.

Refs #984
Prototype: decode encoded block worker rows only after global dedupe accepts or replaces a row, instead of reconstructing every BlockEntryResult before merge.

Correctness: blocks --offline --json output matched byte-for-byte against the current build on real local Claude data.

Benchmark: hyperfine --warmup 5 --runs 12, LOG_LEVEL=0, COLUMNS=200, built Bun dist. Base blocks --offline --json was 392.0ms +/- 7.8ms; prototype was 391.6ms +/- 4.1ms, effectively unchanged.

Decision: not adopted. The profile made the decode loop look suspicious, but the end-to-end result did not justify the extra branch and row-level decoder path.
Bun could truncate large CLI output at 64 KiB when stdout was piped and the process exited immediately after console.log. This affected commands like daily --offline --json | jq, making some pipe-based checks and benchmarks read incomplete JSON.

Add an awaited stdout line writer that resolves on the stream write callback, and use it for final JSON/table report output across ccusage commands while leaving small statusline/log output unchanged.

CI now validates the preview package with daily --offline --json | jq -e . for both Node-forced and Bun-available paths.

Validation: pnpm run format; pnpm typecheck; pnpm run test; pnpm --filter ccusage build; built Bun daily/blocks --offline --json piped through jq -e .

Post-fix local benchmark, Apple M3 Pro macOS, Bun built entry, LOG_LEVEL=0, COLUMNS=200, hyperfine --warmup 4 --runs 10: daily --offline --json 336.7ms +/- 8.3ms; session --offline --json 346.2ms +/- 7.3ms; blocks --offline --json 389.7ms +/- 8.8ms; daily --offline table 354.4ms +/- 22.1ms.
The preview E2E fixture previously produced only about 2 KiB of daily JSON, so daily --offline --json | jq -e . would not catch Bun pipe truncation at 64 KiB.

Generate 420 daily buckets in the temporary Claude fixture so the CI daily JSON is around 350 KiB. This keeps the plain daily output visible in CI while making the jq pipe check cover large stdout flushing.

Validation: pnpm run format; local e2e setup with built Bun daily --offline --json | jq -e . produced 349660 bytes.
Replace the hot string-key dedupe maps with null-prototype object indexes for daily, session, and blocks loading. The keys are Claude message/request IDs and only need exact lookup, while a Bun CPU profile showed native Map#set dominating the post-worker merge after parsing had already moved off the main thread.

The null prototype keeps inherited keys such as __proto__ safe; an in-source test covers that behavior. Block loading keeps the same replacement metadata shape, just stored behind the string index.
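A sketch of the index shape (the stored replacement metadata is simplified to a single generic value here):

```typescript
// Null-prototype string index: exact-key lookup without Map overhead,
// and no Object.prototype keys ("__proto__", "toString") in the chain.
function createDedupeIndex<T>() {
	const index: Record<string, T> = Object.create(null);
	return {
		get: (key: string): T | undefined => index[key],
		set: (key: string, value: T): void => {
			index[key] = value;
		},
		has: (key: string): boolean => index[key] !== undefined,
	};
}
```

Because the keys are Claude message/request IDs that only ever need exact lookup, no Map iteration or ordering guarantees are lost.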

Benchmark on real local Claude data, LOG_LEVEL=0, COLUMNS=200, built Bun dist, hyperfine --warmup 4 --runs 10: daily --offline --json 336.7ms -> 329.4ms, session --offline --json 346.2ms -> 342.5ms, blocks --offline --json 389.7ms -> 387.3ms.

Validation: pnpm run format; pnpm typecheck; pnpm run test; pnpm --filter ccusage build; built Bun daily/blocks --offline --json piped through jq -e.

Refs #984
Tried worker-side chunk dedupe for daily/session rows before postMessage.

The prototype removed duplicates inside each worker chunk, carried source-order metadata for replacement tie-breaks, and tried to restore first-occurrence merge order before aggregation.

It was rejected because JSON output was not byte-identical: token totals stayed stable, but cost fields changed at the IEEE754 tail due to aggregation order differences. A first attempt also changed session totals because session/project metadata is encoded at the file payload level, so moving a winning row into the first-seen file slot is not semantics-preserving.

Performance also regressed on the real local Claude corpus with built Bun dist, LOG_LEVEL=0, COLUMNS=200, hyperfine --warmup 4 --runs 10: daily --offline --json 363.1ms ± 13.5ms, session --offline --json 362.3ms ± 12.8ms, blocks --offline --json 411.9ms ± 25.2ms.

The previously committed object-index build remains faster and byte-compatible: daily 329.4ms ± 5.2ms, session 342.5ms ± 13.3ms, blocks 387.3ms ± 7.0ms.

Validation after reverting the prototype: rebuilt ccusage and confirmed daily/session/blocks JSON output is byte-identical to the pre-experiment baseline.
Replace per-summary Map/Set model aggregation with null-prototype object indexes plus an insertion-order model list.

Why: after the row dedupe Map optimization, summary aggregation still allocates Map and Set instances for each daily/session/bucket group even though keys are plain model-name strings and only exact-key lookup is needed. The object index follows the same null-prototype safety pattern as the dedupe index while preserving first-seen modelsUsed order.

Correctness: added coverage for the subtle unknown-model behavior where an undefined model contributes an unknown modelBreakdown but does not appear in modelsUsed until the model name is explicit.

Output parity: built Bun dist matched the pre-change output byte-for-byte for daily, session, monthly, weekly, and blocks --offline --json. daily/session table output was also checked before final validation and matched byte-for-byte.

Performance: worker-enabled A/B using /tmp/*/dist/index.js, real local Claude data, LOG_LEVEL=0, COLUMNS=200, hyperfine --warmup 4 --runs 12: daily base 348.9ms ± 12.7ms vs object 349.5ms ± 20.0ms; session base 354.1ms ± 11.8ms vs object 348.5ms ± 4.0ms; blocks base 400.9ms ± 19.9ms vs object 396.0ms ± 11.3ms. This is neutral for daily and a small win for session/blocks under a noisy host.

Validation: pnpm run format; pnpm typecheck; pnpm run test (32 files, 377 passed, 1 skipped); pnpm --filter ccusage build; built Bun daily/blocks --offline --json piped through jq -e.
Add a Bun script that generates a deterministic Claude config fixture with one JSONL file around 1 GiB. The rows keep unique request and message identifiers so deduplication cannot shrink the workload, while token and timestamp values still exercise report aggregation.

Extend the ccusage perf comment script so it can render multiple fixture sections. CI now keeps the existing committed fixture benchmark for stable small-data feedback and adds a generated large single-file benchmark for daily only. The large fixture deliberately skips session and blocks because base-branch scans over 1 GiB are too slow for PR feedback.

The compare script now writes progress lines to stderr for each fixture, command, side, warmup, sample, and median result. This makes Actions logs show whether time is being spent in base or PR measurements instead of appearing hung during the 1 GiB scan.

Local validation: pnpm run format; pnpm typecheck; pnpm run test before the daily-only follow-up; pinact run with GITHUB_TOKEN; generated a 1024 MiB fixture in about 1.64s; verified daily/session/blocks JSON output through jq; generated a head-vs-head perf comment with the 1 GiB fixture. Follow-up validation: pnpm run format; pnpm typecheck; small generated fixture smoke confirmed progress output and daily-only large markdown.
ryoppippi added 6 commits May 15, 2026 10:08
Add hyperfine to the Nix development shell so CI and local benchmarking can use the same external-process benchmark tool. This is a better fit for ccusage CLI E2E comparisons than mitata because the workload is a full command invocation rather than an in-process JavaScript microbenchmark.

Validation: nix develop --command hyperfine --version reported hyperfine 1.20.0.
Replace the mitata-based PR performance measurement with hyperfine. The CI workload benchmarks full ccusage CLI process invocations, so hyperfine is a better fit than an in-process JavaScript microbenchmark helper.

The script now runs base and PR commands in one hyperfine invocation per ccusage command, exports hyperfine JSON, and renders the same PR comment table from median timings. Hyperfine stdout/stderr are inherited so Actions logs show benchmark progress and relative speed output while the generated markdown remains stable.

The measured command still uses the published runtime shape through pnpm-managed Bun: pnpm exec bun -b apps/ccusage/dist/index.js. The benchmark output is piped rather than redirected to /dev/null so stdout-heavy CLI behaviour stays representative.

Validation: pnpm run format; pnpm typecheck; pnpm run test; small generated fixture smoke with nix develop --command pnpm exec bun apps/ccusage/scripts/compare-pr-performance.ts confirmed hyperfine progress logs and markdown output.
Pad each generated Claude usage row with message content so the 1 GiB CI fixture has a realistic row count. The first generated version produced about 3.2 million tiny rows, which over-weighted per-line parser and aggregation costs and made the base branch measurement much slower than a real large Claude log.

With 6 KiB content padding, the 1024 MiB fixture contains about 164,764 rows and still exercises a single large JSONL file. That keeps the benchmark focused on large-file behaviour without creating an artificial millions-of-rows workload.

Validation: pnpm run format; pnpm typecheck; generated a 1 MiB fixture and verified daily JSON with jq; generated a 1024 MiB fixture in about 0.84s with 164,764 rows.
Replace the previous single-file 1 GiB fixture with a deterministic multi-file fixture shaped from aggregate-only local Claude log statistics. The profile stores no prompts, paths, outputs, or JSONL contents; it only records file count, total size, row count, average row size, and file-size quantiles.

The generated CI fixture now scales those aggregate distribution points to the requested target size. For a 1024 MiB fixture, local generation produced 2,597 JSONL files, 292,302 rows, and 1,030.95 MiB. This better represents a real large ccusage workload than one huge file or millions of tiny rows.

Update the perf comment wording from single-file streaming to real-world-shaped multi-file loading, while keeping the large fixture benchmark to daily only so base-branch CI remains bounded.

Validation: pnpm run format; pnpm typecheck; pnpm run test; 1 MiB generated fixture daily JSON verified with jq; 1024 MiB fixture generated in 1.46s with 2,597 files and 292,302 rows; head-vs-head hyperfine smoke completed against the 1024 MiB fixture.
Remove pnpm exec from the command strings passed to hyperfine. The script can still be launched through pnpm exec bun in CI, but measured commands now use the Bun executable already running the script via process.execPath.

This keeps CI benchmark comments focused on ccusage runtime rather than package-manager lookup overhead. Local 1GiB shaped-fixture smoke after the change measured base daily at 10.803s and PR daily at 494.1ms, about 21.86x faster.
Resolve the ccusage package bin from apps/ccusage/package.json and run that entry through the Bun executable already executing the CI script. This reflects the published command path instead of the internal index.js entry.

Use hyperfine --shell none so the benchmark command is parsed as argv instead of requiring hand-written shell quoting. Package-manager lookup remains outside the measured command, and package.json entrypoint differences between base and PR remain visible.

Package metadata is read with Bun.file rather than node:fs synchronous reads.
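The bin-resolution and argv-building steps above can be sketched like this. It is a minimal illustration under assumptions: the real script reads package metadata with `Bun.file`, while this sketch only shows the shape of resolving a `bin` entry and composing a `hyperfine --shell none` invocation around `process.execPath`.

```typescript
// Minimal sketch of the measured-command construction described above.
// Shapes are assumptions; the real CI script may differ.
type PackageJson = { bin?: string | Record<string, string> };

function resolveBinEntry(pkg: PackageJson, name: string): string {
	if (typeof pkg.bin === "string") return pkg.bin;
	const entry = pkg.bin?.[name];
	if (entry == null) throw new Error(`no bin entry for ${name}`);
	return entry;
}

function buildHyperfineArgs(binPath: string, cliArgs: string[]): string[] {
	// --shell none makes hyperfine exec the command directly, so no shell
	// quoting layer sits between the benchmark and the measured runtime.
	// process.execPath is the Bun executable already running this script,
	// keeping package-manager lookup outside the measured command.
	return [
		"hyperfine",
		"--shell", "none",
		[process.execPath, "-b", binPath, ...cliArgs].join(" "),
	];
}
```

This keeps the measured timing focused on the published command path (`bin` entry through Bun) rather than `pnpm exec` resolution overhead.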
Repository owner deleted a comment from github-actions Bot May 15, 2026
@github-actions

ccusage performance comparison

This compares the PR build against the base branch build on the same CI runner.

Committed fixture performance

The committed small fixture gives stable PR-to-PR feedback and catches output-shape regressions.

Fixture: apps/ccusage/test/fixtures/claude
Runtime: package ccusage bin from apps/ccusage/package.json through bun -b, --offline --json, measured by hyperfine with 2 warmups and 7 runs.

| Command | Base median | PR median | PR vs base |
| --- | --- | --- | --- |
| daily --offline --json | 77.7ms | 82.5ms | 0.94x |
| session --offline --json | 80.2ms | 79.5ms | 1.01x |
| blocks --offline --json | 74.8ms | 79.5ms | 0.94x |

Large real-world-shaped fixture performance

Generated fixture around 1 GiB shaped from aggregate local Claude-log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixture.

Fixture: /home/runner/work/_temp/ccusage-large-fixture
Runtime: package ccusage bin from apps/ccusage/package.json through bun -b, --offline --json, measured by hyperfine with 0 warmups and 1 run.

| Command | Base median | PR median | PR vs base |
| --- | --- | --- | --- |
| daily --offline --json | 27.864s | 1.402s | 19.88x |

Package size

| Package artifact | Base | PR | Delta | Ratio |
| --- | --- | --- | --- | --- |
| packed ccusage-*.tgz | 133.73 KiB | 43.29 KiB | -90.44 KiB | 3.09x |

Lower medians and smaller packed package sizes are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.

@ryoppippi ryoppippi marked this pull request as ready for review May 15, 2026 10:27
@ryoppippi ryoppippi merged commit b5ec88b into main May 15, 2026
20 of 21 checks passed
@ryoppippi ryoppippi deleted the perf branch May 15, 2026 10:27
ryoppippi added a commit that referenced this pull request May 16, 2026
Record that adapter source logic belongs under apps/ccusage/src/adapter while src/data-loader.ts remains a dedicated bundled worker entry for the Claude data-loader chunk introduced by PR #984.

This keeps future adapter migrations from adding root-level source shims while preserving the build entry needed by the optimized loader.
ryoppippi added a commit that referenced this pull request May 17, 2026
* refactor(ccusage): move all-agent loading into adapters

* perf(ccusage): route codex reports through fast adapter

* perf(ccusage): parallelize codex adapter loading

* refactor(ccusage): route agent commands through adapters

* chore(ai): exhaustiveness check

* chore(ai): top-level fs-fixture is ok

* docs(ccusage): document adapter architecture

Record the adapter layering model before continuing the migration. The new architecture note makes source detection, file discovery, parsing, aggregation, worker boundaries, and parent-process row return explicit so future agents can be implemented on the same ccusage foundation instead of deprecated wrapper packages.

Also clarify testing guidance: prefer parser, loader, path, pricing, aggregation, and CLI output tests over schema-only assertions when realistic fixture logs already exercise validation behavior.

* docs(gunshi): preserve args inference

Document that Gunshi command factories should keep the concrete args type inferred from define(). Avoid Command<Args> and broad ReturnType<typeof define> annotations because they erase ctx.values option keys and value types.

* refactor(ccusage): organize agent adapters by source

Move agent-specific implementation under adapter/<agent>/ so the unified ccusage CLI owns the runtime logic instead of the deprecated wrapper packages.

Add shared adapter definition and log aggregation helpers, move Claude and Codex into directory adapters, split Amp and pi-agent into path/parser/schema/pricing files, and add worker-backed parsing for Amp and pi-agent JSON sources.

Replace new adapter path checks with an internal safe directory helper and add focused parser/path tests using fs-fixture. The affected adapter tests are green for shared, Amp, pi-agent, Codex, and agent command paths.

* refactor(ccusage): split opencode adapter sources

Separate OpenCode path discovery, schema, loader, pricing, and row aggregation so the adapter follows the shared detect/load/parse/aggregate layering.

OpenCode has multiple source kinds, so JSON message files now use the shared worker gating and SQLite remains a separate DB source. The loader still de-duplicates JSON messages against DB rows and keeps JSON+SQL totals combined.

Path checks now use the internal safe directory helper, pricing fetcher disposal uses using, and new fs-fixture tests cover path discovery, JSON message parsing, ignored non-billable messages, and row aggregation.

* fix(ccusage): require adapter pricing fetchers

* perf(ccusage): benchmark claude and codex fixtures

* test(ccusage): snapshot direct agent commands

* fix(ccusage): address adapter review findings

* fix(ccusage): align codex adapter totals

Use the Codex total_tokens field for all-agent adapter rows so ccusage all and ccusage codex report the same total token count. Cached input tokens are an input breakdown, and reasoning output tokens are informational because Codex output_tokens already carries the billable output value.

Add skipped local-data smoke tests for Claude and Codex adapter loaders. These run on developer machines with real log directories while staying inert on clean CI environments.

* refactor(ccusage): split codex adapter primitives

Move Codex adapter path discovery, pricing, usage arithmetic, and shared types out of the large adapter index. This makes the adapter boundary match the other agent adapters and keeps future Codex parser work localized.

Codex detection now short-circuits when it finds the first JSONL session file instead of collecting every session path just to answer a boolean. Focused Codex adapter tests and tsgo typecheck pass.

* fix(ccusage): group claude subagent session logs

Resolve Claude session report paths with the session id from each entry when available, so flat session files and nested subagent logs aggregate under the same session.

This matches the real Claude projects layout where both project/session.jsonl and project/session/subagents/*.jsonl can exist for one conversation.

* refactor(ccusage): split codex parser worker

Move Codex JSONL parsing and worker fan-out into adapter/codex/parser.ts so the public adapter index only handles report aggregation and pricing wiring.

The focused Codex adapter tests and package typecheck confirm the extracted parser keeps the same behavior.

* refactor(ccusage): move claude loader into adapter

Move the Claude-specific data loader under adapter/claude and update internal imports to use the adapter path directly.

No re-export shim is kept because the package only exports the CLI entrypoint and all remaining data-loader references are internal.

* docs(ccusage): document adapter migration workflow

Record the adapter migration checklist for future agents, including direct adapter imports, shared ccusage foundations, local smoke tests, cmux validation, and benchmark parity.

* refactor(ccusage): share indexed adapter workers

Move the repeated worker fan-out, file-size chunking, and indexed result restoration into @ccusage/internal so Amp, Codex, OpenCode, and pi-agent use the same worker foundation as the optimized ccusage loader path.

This also keeps Codex totals aligned when raw token_count payloads omit total_tokens by including reasoning_output_tokens in the parser fallback. The CLI output snapshot is updated for the corrected Codex total semantics.

* docs(ccusage): clarify adapter worker entry ownership

Record that adapter source logic belongs under apps/ccusage/src/adapter while src/data-loader.ts remains a dedicated bundled worker entry for the Claude data-loader chunk introduced by PR #984.

This keeps future adapter migrations from adding root-level source shims while preserving the build entry needed by the optimized loader.

* build(ccusage): bundle adapter offline pricing macros

Move the Amp and Codex macro imports into their pricing modules and include those modules in the tsdown macro transform. This mirrors the standalone packages and prevents offline mode from retaining the runtime prefetch path in the bundled ccusage CLI.

The adapter command modules now receive stable offline pricing loaders instead of importing macro functions directly.

* test(ccusage): skip local claude smoke test in ci

CI can have a Claude projects directory without local usage rows, which makes the real-data smoke test fail even though it is only meant for developer machines with actual logs.

Keep the smoke test enabled locally, but skip it whenever CI=true so the deterministic fixture tests remain the CI contract.

* docs(internal): document indexed worker collection

Add API documentation for the shared indexed worker helper requested during PR review. The comment records the ordering guarantee and the zero-worker fallback so adapter callers can reason about the shared worker path without reading the implementation.

* fix(ccusage): share usage loading progress

Move the all-agent loading spinner into a reusable progress component and wire it through all agent report commands. This keeps every direct agent command on the same single-spinner behavior instead of letting per-agent spinners or LiteLLM pricing logs compete with terminal rendering.

Route LiteLLM pricing messages through the progress component while a TTY spinner is active. JSON output continues to keep stdout machine-readable and the verification covers all, claude, codex, opencode, amp, and pi JSON modes.

* perf(ccusage): share adapter file primitives

Move whole-file text reads and cheap source detection into @ccusage/internal/fs so all coding-agent adapters use the same IO foundation.

Amp and OpenCode now read JSON message files through readTextFile(), Codex reads config.toml through the same helper, and JSONL buffering reuses readBufferedTextFile(). Claude keeps its byte scanner fast path while using the shared text fallback and transcript reader.

Detection for Claude, Codex, Amp, OpenCode, and pi-agent now uses hasFileRecursive() where only existence is needed, avoiding full file list materialization. The adapter architecture notes now document the shared optimization baseline.

* test(ccusage): allow local codex smoke test to finish

The local Codex smoke test reads real user sessions when CODEX_HOME exists. On machines with substantial Codex history it can exceed Vitest's default 5 second timeout even though CI skips the test on clean runners.

Keep the smoke test in place and give it an explicit 30 second timeout so it remains useful for validating real local data without making the full suite fail spuriously.

* chore: minimise deprecated agent packages

Replace the standalone agent package implementations with dependency-free compatibility stubs that print a "use npx ccusage instead" message and exit non-zero.

The canonical implementation now lives in apps/ccusage adapters, so the deprecated @ccusage/amp, @ccusage/codex, @ccusage/opencode, and @ccusage/pi packages no longer need local source, build, lint, test, or TypeScript configuration files.

Removing those package-local dependencies shrinks the workspace and prevents future fixes from being split between ccusage and deprecated wrapper packages.

* perf(ccusage): bound agent fallback parsing

Add a shared mapWithConcurrency helper for non-worker file parsing paths. Worker-backed adapters already use chunked worker fan-out, but Amp and pi-agent still fell back to unbounded Promise.all when workers are unavailable, including during source-runtime execution and tests.

Use the shared helper in Amp thread parsing and pi-agent JSONL parsing so the fallback path keeps stable output order without opening every file at once. Document the bounded fallback requirement in the adapter architecture notes so future adapters follow the same baseline.

Validation: pnpm run format; pnpm typecheck; pnpm --filter @ccusage/internal test src/workers.ts; pnpm --filter ccusage test src/adapter/amp/parser.ts src/adapter/pi/parser.ts; pnpm --filter ccusage run build
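A bounded, order-preserving mapper in the spirit of the shared helper above can be sketched as follows. This is a minimal illustration; the real `mapWithConcurrency` signature in `@ccusage/internal` may differ.

```typescript
// Order-preserving bounded async mapper: at most `limit` calls to `fn`
// are in flight, and results land at their original input positions.
async function mapWithConcurrency<T, R>(
	items: readonly T[],
	limit: number,
	fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
	const results = new Array<R>(items.length);
	let next = 0;
	// Each lane repeatedly claims the next unprocessed index. Because
	// indexes are claimed atomically from the shared counter, no item is
	// processed twice and output order never depends on completion order.
	async function lane(): Promise<void> {
		while (next < items.length) {
			const i = next++;
			results[i] = await fn(items[i], i);
		}
	}
	const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
	await Promise.all(lanes);
	return results;
}
```

Compared with unbounded `Promise.all`, this avoids opening every file at once on the non-worker fallback path while keeping stable output order.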

* test(ccusage): allow local claude smoke test to finish

The optional local-data smoke test can exceed the default Vitest timeout on real Claude projects. The test should remain enabled on developer machines when data exists, so extend its timeout instead of skipping or weakening the assertion.

Validation: pnpm --filter ccusage test src/adapter/claude/index.ts; pnpm run format; pnpm run test; pnpm typecheck; pnpm --filter ccusage run build

* refactor(ccusage): reuse bounded fallback mapper

Claude kept a local mapWithConcurrency implementation while Amp and pi-agent now use the shared worker utility. Move Claude fallback parsing to the same internal helper so all adapter fallback file parsing paths use one bounded, order-preserving primitive.

This keeps the data-loader chunk behavior aligned with the adapter architecture while preserving the worker fast path and existing output shape.

Validation: pnpm run format; pnpm --filter ccusage test src/adapter/claude/data-loader.ts src/adapter/claude/index.ts; pnpm typecheck; pnpm --filter ccusage run build; pnpm run test

* fix(internal): reject empty worker exits

Reject worker collection promises when a worker exits successfully before posting results. Without this guard, a clean exit without a message left collectIndexedFileWorkerResults unresolved and could stall adapter loading indefinitely.

Add an in-source regression test that reproduces the missing-message exit path with a mocked worker.

Validation: pnpm --filter @ccusage/internal test src/workers.ts; pnpm run format; pnpm typecheck; pnpm run test; pnpm --filter ccusage run build
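The guard described above can be sketched with a minimal worker-like event surface. This is an illustration under assumptions: the real `collectIndexedFileWorkerResults` handles chunked indexed results, while this sketch only shows the exit-before-message rejection.

```typescript
import { EventEmitter } from "node:events";

// Minimal event surface mirroring worker_threads ("message"/"error"/"exit").
type WorkerLike = Pick<EventEmitter, "once">;

function collectWorkerResult<T>(worker: WorkerLike): Promise<T> {
	return new Promise<T>((resolve, reject) => {
		let settled = false;
		worker.once("message", (value: T) => { settled = true; resolve(value); });
		worker.once("error", (err: Error) => { settled = true; reject(err); });
		worker.once("exit", (code: number) => {
			// Without this branch, a clean exit (code 0) before any message
			// leaves the promise pending forever and stalls adapter loading.
			if (!settled) {
				reject(new Error(`worker exited (code ${code}) before posting results`));
			}
		});
	});
}
```

A mocked `EventEmitter` is enough to reproduce the regression path: emit `exit` without a prior `message` and assert the promise rejects instead of hanging.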

* fix(ccusage): honor merged session options

Use the merged configuration values when loading Claude session rows so config-file defaults and CLI overrides flow through the same path as the other commands.

Restore the logger level unconditionally after the loading lifecycle. JSON mode and progress mode both lower the global logger level, so the command should always put it back before returning or throwing.

Validation: pnpm run format; pnpm typecheck; pnpm run test; pnpm --filter ccusage run build
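The unconditional restore above amounts to a try/finally around the loading lifecycle. The names here are illustrative, not the real command code:

```typescript
// Hypothetical logger shape; the real project uses its own logger type.
type Logger = { level: number };

async function withLoweredLogger<T>(logger: Logger, run: () => Promise<T>): Promise<T> {
	const previous = logger.level;
	logger.level = 0; // silence logs while JSON output or a spinner is active
	try {
		return await run();
	} finally {
		// Restored on both the return path and the throw path, so a failed
		// load cannot leave the global logger permanently silenced.
		logger.level = previous;
	}
}
```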

* refactor(ccusage): share codex aggregation pipeline

Extract the shared Codex event grouping and pricing lookup path so the all-agent rows and Codex-specific report rows use one aggregation implementation.

Keep output-shape mapping in the public loader functions while centralizing date filtering, model grouping, fallback-model tagging, pricing lookup, and owned pricing fetcher disposal.

Validation: pnpm --filter ccusage test src/adapter/codex/index.ts; pnpm run format; pnpm typecheck; pnpm run test; pnpm --filter ccusage run build

* docs(internal): document file helper contracts

Document the exported fs and JSONL helpers that are now shared by the agent adapters. The comments explain Bun-backed reads, buffered-read null behavior, recursive search tolerance, and JSONL line callback semantics without changing runtime behavior.

Validation: pnpm --filter @ccusage/internal test src/fs.ts src/jsonl.ts; pnpm run format; pnpm typecheck; pnpm run test; pnpm --filter ccusage run build

* docs(skills): mention AI reviewers in replies

Document that replies to AI reviewer inline comments should explicitly mention the bot handle. CodeRabbit can otherwise classify a direct fix reply as human discussion and skip follow-up review.

Use @coderabbitai for CodeRabbit and the current Cubic bot handle from the PR/check UI for Cubic replies.

Validation: pnpm typecheck

* fix(ccusage): detect all agents before json output

Compute the all-agent detection list before branching on output mode so JSON and table output load the same detected agent set.

This removes the fallback to resolveAllAgents from the all command load path. JSON mode no longer silently widens the load set beyond what detection found, while the banner remains gated to table output only.

Validation: pnpm --filter ccusage test src/commands/all.ts; pnpm typecheck; pnpm --filter ccusage test test/cli-output.test.ts; pnpm run test; pnpm --filter ccusage run build.

* docs: prefer unified agent commands

Update Codex, OpenCode, and pi-agent guide examples to use the canonical ccusage agent subcommands instead of the deprecated standalone wrapper packages.

The compatibility package notes remain where useful, but normal installation, alias, and command examples now point at ccusage codex, ccusage opencode, and ccusage pi. Bun examples omit @latest because bunx can run ccusage directly.

Validation: pnpm run format; pnpm --filter docs build.

* ci(ccusage): report fixture throughput

Add fixture byte and file counts to the ccusage performance PR comment so Claude and Codex timings can be interpreted with their input sizes.

Render per-command input size plus base and PR throughput columns. This keeps absolute median timing visible while making uneven Claude and Codex fixture sizes explicit in the same table.

Validation: pnpm --filter ccusage test test/perf-scripts.test.ts; pnpm run format; pnpm typecheck; pnpm --filter ccusage run build.

* ci(ccusage): use equal perf fixture defaults

Raise the Codex large fixture default from 256 MiB to 1024 MiB so CI compares Claude and Codex against the same generated input size unless explicitly overridden.

Move the performance script assertions into in-source Vitest blocks beside the script logic. The ccusage Vitest config now includes only these performance scripts as script in-source tests, avoiding unrelated Bun-only scripts that are not currently Vite-resolvable.

Validated with pnpm --filter ccusage test scripts/compare-pr-performance.ts scripts/generate-large-fixture.ts, pnpm run format, pnpm typecheck, pnpm --filter ccusage run build, and pnpm run test.

* test(ccusage): avoid dynamic fixture imports

Use the existing top-level fs-fixture import inside Claude data-loader in-source tests instead of awaiting dynamic imports inside describe blocks.

This keeps the tests aligned with the repository in-source Vitest guidance without changing loader behavior.

Validated with pnpm --filter ccusage test src/adapter/claude/data-loader.ts, pnpm run format, pnpm typecheck, and pnpm --filter ccusage run build.

* perf(ccusage): share JSONL marker scanning

Move the Claude byte-buffered JSONL marker scan into @ccusage/internal and make Claude, Codex, and pi-agent use the same primitive. This keeps JSONL adapters from decoding unrelated log lines while preserving the existing stream fallback for oversized files.

Codex also now sends worker events through typed-array transfer payloads instead of cloning large object arrays. The payload keeps timestamps, sessions, model indexes, numeric token fields, and fallback-model flags separate so worker transport matches the optimized Claude pattern more closely.

Validated with focused adapter tests, full format/typecheck/test, and JSON parity checks against HEAD for Claude, Codex, and pi-agent outputs.
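The compact transfer-payload idea above can be sketched as a pack/unpack pair: parallel numeric fields go into typed arrays (whose underlying buffers are transferable via `postMessage(payload, transferList)`), and model names go through a deduplicated string table. Field names here are assumptions, not the real Codex payload.

```typescript
// Illustrative row shape; the real worker payload differs.
type Row = { timestamp: number; inputTokens: number; outputTokens: number; model: string };

type Packed = {
	timestamps: Float64Array;
	inputTokens: Float64Array;
	outputTokens: Float64Array;
	modelIndexes: Uint32Array;
	models: string[]; // deduplicated string table
};

function packRows(rows: readonly Row[]): Packed {
	const models: string[] = [];
	const modelIndex = new Map<string, number>();
	const packed: Packed = {
		timestamps: new Float64Array(rows.length),
		inputTokens: new Float64Array(rows.length),
		outputTokens: new Float64Array(rows.length),
		modelIndexes: new Uint32Array(rows.length),
		models,
	};
	rows.forEach((row, i) => {
		let idx = modelIndex.get(row.model);
		if (idx == null) {
			idx = models.push(row.model) - 1;
			modelIndex.set(row.model, idx);
		}
		packed.timestamps[i] = row.timestamp;
		packed.inputTokens[i] = row.inputTokens;
		packed.outputTokens[i] = row.outputTokens;
		packed.modelIndexes[i] = idx;
	});
	return packed;
}

function unpackRows(p: Packed): Row[] {
	return Array.from(p.timestamps, (timestamp, i) => ({
		timestamp,
		inputTokens: p.inputTokens[i],
		outputTokens: p.outputTokens[i],
		model: p.models[p.modelIndexes[i]],
	}));
}
```

Transferring the typed-array buffers moves ownership across the worker boundary instead of structured-cloning one object per row, which is where the payload savings come from.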

* docs(ccusage): define shared adapter optimization baseline

Document that matching Claude means reusing the byte marker scanner for JSONL adapters and avoiding large worker object-array payloads when compact transfer payloads or worker-side aggregation are practical.

Also clarify that whole-JSON and SQLite adapters do not benefit from JSONL marker scanning directly, but should still share worker gating, result ordering, pricing lifecycle, output formatting, and compact payload helpers where their source shape allows it.

* refactor(ccusage): share adapter worker primitives

Move remaining Claude-local JSONL worker helpers onto the shared internal worker and JSONL primitives so the adapter no longer carries duplicate file-size chunking, worker response, or line-reader implementations.

Extend defineAgentLogLoader with one-time prepared state and async usage extraction. Amp and OpenCode now use the same row aggregation foundation as pi while keeping pricing fetchers scoped once per command and only calculating cost after date filtering.

Add shared indexed worker data and pricing context helpers so Codex, Amp, OpenCode, and pi workers use the same message/data shapes.

* docs(agents): add deprecated package readmes

Restore minimal README files for the deprecated standalone agent packages. Keep package-specific badges visible while replacing the old package documentation with a clear deprecation notice and the unified bunx ccusage command to use instead.

* fix(internal): preserve worker count os mocking

Use the node:os module object inside shared worker utilities so tests that spy on availableParallelism continue to affect Claude worker-count expectations after the worker-count helper moved into @ccusage/internal.

This keeps the refactor behavior deterministic across CI CPU counts without changing production worker-count logic.

* fix(ccusage): address adapter review edge cases

Fix timezone date/month cache keys so non-hour timezone boundaries cannot reuse a UTC-hour cache entry.

Decode buffered marker JSONL lines as UTF-8 and report marker indexes against the decoded string, preserving non-ASCII log content.

* fix(ccusage): harden adapter date keys

Reject impossible calendar date filters before they reach range filtering so invalid CLI input cannot silently produce wrong reports.

Build daily and monthly adapter keys from Intl formatToParts instead of parsing locale display strings. This keeps the existing timezone-aware Intl behavior while making machine-readable grouping keys independent of runtime date formatting patterns.

* perf(ccusage): keep adapter date key fast path

Avoid forcing formatToParts through the normal adapter date and month key path. The hot path now keeps the existing machine-key format result when Intl already returns YYYY-MM-DD or YYYY-MM, while retaining the formatToParts fallback for runtimes with different locale output.

UTC ISO timestamps also use direct string slices, which keeps the CI benchmark and default log path away from Intl allocation in tight loops. This preserves the review portability fix without penalizing Claude and Codex loader throughput.
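The fast-path shape described above can be sketched as: keep Intl's formatted result when it already matches the machine key pattern, and rebuild from `formatToParts` only when the locale output differs. This is a simplified illustration; the real adapter also has direct UTC string-slice and local-timezone paths that skip Intl entirely.

```typescript
const DAY_KEY = /^\d{4}-\d{2}-\d{2}$/;

function dailyKey(date: Date, timeZone: string): string {
	const fmt = new Intl.DateTimeFormat("en-CA", {
		timeZone, year: "numeric", month: "2-digit", day: "2-digit",
	});
	const formatted = fmt.format(date);
	// Hot path: en-CA with 2-digit fields normally yields YYYY-MM-DD already.
	if (DAY_KEY.test(formatted)) return formatted;
	// Fallback: rebuild the key from structured parts so a runtime with a
	// different locale pattern cannot leak display formatting into
	// machine-readable grouping keys.
	const parts = fmt.formatToParts(date);
	const get = (type: Intl.DateTimeFormatPartTypes) =>
		parts.find((p) => p.type === type)?.value ?? "";
	return `${get("year")}-${get("month")}-${get("day")}`;
}
```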

* perf(ccusage): restore agent JSONL hot paths

Keep the shared JSONL marker helper while restoring the agent-specific hot paths that were lost during the scanner consolidation.

Claude now uses the shared helper single-marker byte scanner with latin1 decoding and byte marker indexes, matching the previous optimized loader behavior. Codex uses line scanning for marker-dense logs instead of the generic multi-marker candidate map.

Adapter date keys now keep the default timezone as the local system timezone without constructing Intl formatters on the hot path. Explicit --timezone values still use Intl when needed, while UTC uses direct UTC formatting.

Validation: pnpm run format; pnpm typecheck; pnpm run test; pnpm --filter ccusage build. Local parity against main through 2026-05-14 matched for claude/codex JSON in both TZ=UTC and default local timezone.

* perf(internal): preserve synchronous JSONL marker scans

Add an explicit synchronous callback mode to the shared JSONL marker helper so agent loaders that do purely synchronous parsing do not pay per-row Promise handling overhead.

Keep markerIndex byte semantics consistent for line-scan and text fallback paths by converting decoded-string indexes back to byte offsets with Buffer.byteLength. This addresses the latest CodeRabbit JSONL marker-index comment without failing large-file fallback paths.

Claude and Codex opt into the synchronous mode for their hot JSONL parser callbacks. Validation: pnpm run format; pnpm typecheck; pnpm --filter @ccusage/internal test src/jsonl.ts --run; pnpm --filter ccusage test src/adapter/codex/parser.ts --run; pnpm --filter ccusage build; JSON parity against main through 2026-05-14 for claude/codex with TZ=UTC.
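The decoded-index-to-byte-offset conversion mentioned above can be sketched as follows. This is a minimal illustration of the `Buffer.byteLength` technique, not the shared scanner's actual API:

```typescript
// Convert a marker hit found in a decoded string back into a byte offset.
// String.prototype.indexOf returns a character index, so the prefix before
// the marker must be re-measured in bytes: multi-byte UTF-8 characters make
// char and byte indexes diverge.
function markerByteOffset(decodedLine: string, marker: string): number {
	const charIndex = decodedLine.indexOf(marker);
	if (charIndex < 0) return -1;
	return Buffer.byteLength(decodedLine.slice(0, charIndex), "utf8");
}
```

This keeps marker-index semantics consistent between the byte-scanning fast path and the decoded-string fallback paths.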

* docs(internal): document JSONL marker scanning contract

Document the marker-processing options used by the shared JSONL scanner so callers can see the byte versus decoded index guarantees without reading the implementation.

This addresses the CodeRabbit documentation nit without changing scanner behavior or benchmark-sensitive code paths.
ryoppippi added a commit to LeslieLeung/ccusage that referenced this pull request May 17, 2026