
fix(plugin): replace misleading Cache:% status-bar metric with raw cache token display #1359

Merged
JeremyDev87 merged 1 commit into master from fix/cache-display-raw-tokens-1355-1354 on Apr 5, 2026

@JeremyDev87

Owner

Summary

The status-bar Cache:XX% segment derived from context_window.current_usage only reflects the most recent API call, not session-wide cache efficiency. Users frequently misread it as a cumulative cache hit rate (e.g. seeing Cache:100% and assuming the whole session is fully cached).

This PR replaces the misleading percentage with a raw token display (♻2k/3.5k) and adds regression coverage to prevent reverting to %-based rendering.

Changes

Implementation (#1355)

  • Remove compute_cache_hit_rate() — the % calculation was mathematically correct but semantically misleading
  • Add format_cache_segment(ctx_window) — renders ♻{cache_read}/{total} with last-call semantics
    • Numerator: cache_read_input_tokens
    • Denominator: input_tokens + cache_creation_input_tokens + cache_read_input_tokens
  • Add format_compact_tokens(n) helper: 532 → 532, 1000 → 1k, 1500 → 1.5k, 128000 → 128k
  • Update format_status_line() — omit the cache slot entirely when usage data is missing, so the status line still renders cleanly
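
A minimal sketch of the two helpers described above, reconstructed from this PR's description and review comments rather than copied from the actual diff. The `"0"` fallback, the `or 0` coercion, and the empty-string return follow what the review mentions. The exact way an input-only call renders (here `♻0/…`) is an assumption.

```python
def format_compact_tokens(n):
    """Render a token count compactly, e.g. 532 -> '532', 1500 -> '1.5k'."""
    try:
        n = int(n)
    except (TypeError, ValueError):
        return "0"  # fallback path mentioned in the review; never raises
    if n < 1000:
        return str(n)
    k = n / 1000
    if k == int(k):
        return f"{int(k)}k"  # whole thousands: 1000 -> '1k', 128000 -> '128k'
    return f"{k:.1f}k"       # otherwise one decimal place: 1500 -> '1.5k'


def format_cache_segment(ctx_window):
    """Return the cache segment for the latest API call, or '' when no usage data exists."""
    usage = (ctx_window or {}).get("current_usage") or {}
    cache_read = usage.get("cache_read_input_tokens") or 0
    total = (
        (usage.get("input_tokens") or 0)
        + (usage.get("cache_creation_input_tokens") or 0)
        + cache_read
    )
    if total <= 0:
        return ""  # the caller drops the slot entirely
    # Assumption: an input-only call still renders, as '\u267b0/<total>'.
    return f"\u267b{format_compact_tokens(cache_read)}/{format_compact_tokens(total)}"


usage = {
    "input_tokens": 200,
    "cache_creation_input_tokens": 500,
    "cache_read_input_tokens": 800,
}
print(format_cache_segment({"current_usage": usage}))  # ♻800/1.5k
```

The numbers in the usage example are illustrative. They were picked so that the output matches the "After" status line shown under Output comparison.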

Regression tests (#1354)

  • TestFormatCacheSegment (7 tests): empty ctx, null usage, input-only, partial read, full read, large k values, explicit % regression guard
  • TestFormatStatusLineCacheSegment (3 tests): final status-line output locks in the new contract and explicitly asserts Cache: never appears
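
The status-line contract these tests lock in can be sketched roughly as follows. `join_segments` is a hypothetical stand-in, since the real `format_status_line()` signature is not shown in this PR. The point is only that an empty cache slot is dropped and that `Cache:` never appears.

```python
def join_segments(segments):
    """Hypothetical helper: join non-empty segments with ' | ', dropping empty slots."""
    return " | ".join(s for s in segments if s)


def test_status_line_contains_raw_tokens_and_no_cache_percent():
    line = join_segments(
        ["CB v5.3.0", "PLAN", "12m", "~$0.42", "\u267b800/1.5k", "Ctx:45%", "Opus"]
    )
    assert "\u267b800/1.5k" in line
    assert "Cache:" not in line  # regression guard against the old metric


def test_missing_usage_leaves_no_broken_slot():
    # An empty cache segment (no usage data) must not leave ' |  | ' behind.
    line = join_segments(["CB v5.3.0", "PLAN", "", "Ctx:45%"])
    assert line == "CB v5.3.0 | PLAN | Ctx:45%"
```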

Output comparison

Before: ◕‿◕ CB v5.3.0 | PLAN 🟢 | 12m | ~$0.42 | Cache:53% | Ctx:45% | Opus
After: ◕‿◕ CB v5.3.0 | PLAN 🟢 | 12m | ~$0.42 | ♻800/1.5k | Ctx:45% | Opus

Test plan

  • python3 -m pytest tests/test_hud.py — 105/105 pass (was 99, net +6 after replacing 4 percentage tests with 10 raw-token tests)
  • python3 -m pytest tests/ — 748 pass (full plugin test suite)
  • python3 -m pytest hooks/tests/ — 239 pass
  • yarn workspace codingbuddy-claude-plugin lint — clean
  • yarn workspace codingbuddy-claude-plugin format:check — clean
  • yarn workspace codingbuddy-claude-plugin typecheck — clean
  • yarn workspace codingbuddy-claude-plugin test:coverage — 123/123 pass, 100% stmts
  • yarn workspace codingbuddy-claude-plugin circular — no cycles
  • yarn workspace codingbuddy-claude-plugin build — success
  • Security audit (all 3 workspaces) — no high-severity findings

TDD pair note

#1354 (test) and #1355 (fix) are a TDD RED-GREEN pair. They touch different files but are semantically inseparable — merging either in isolation breaks CI. This PR lands them together as a single atomic change.

Follow-up

Closes #1355
Closes #1354

…che token display

The Cache:XX% segment derived from context_window.current_usage only reflects
the most recent API call, not session-wide cache efficiency. Users frequently
misread it as cumulative cache hit rate.

Replace compute_cache_hit_rate() with format_cache_segment() that renders raw
token values (e.g. ♻2k/3.5k) with the following semantics:
- numerator = cache_read_input_tokens
- denominator = input_tokens + cache_creation_input_tokens + cache_read_input_tokens
- values represent the latest API call, not session totals

Also add format_compact_tokens() helper for k-suffix compact rendering
(532 → 532, 1000 → 1k, 1500 → 1.5k, 128000 → 128k).

Safe fallback: when current_usage is missing/null/zero, the cache segment
is omitted entirely so the status line still renders without a broken slot.

Test coverage (#1354):
- format_cache_segment: 7 cases covering empty, null, input-only, partial,
  full, large-value k-format, and no-percent regression
- format_status_line integration: 3 cases locking in the new output contract
  and guarding against Cache:% regression

Closes #1355
Closes #1354
@JeremyDev87 added the fix, priority:medium (Medium priority), and plugin (packages/claude-code-plugin) labels on Apr 5, 2026
@vercel

vercel Bot commented Apr 5, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| codingbuddy-landing | Ready | Preview, Comment | Apr 5, 2026 8:51am |

@JeremyDev87

Owner Author

Conductor Self-Review (Self-Approve Unavailable)

Since GitHub blocks approving one's own PR, posting this as a review comment. CI is green and acceptance criteria are satisfied — ready to merge.

CI Status

All 20 checks green — lint, format, typecheck, test, circular, build, security, validate-commands, rules-validation.

Strengths

Devil's Advocate

  • `cache_read > total` — impossible (total includes cache_read). OK.
  • All values `None` → `None or 0 = 0` → empty string. OK.
  • `usage` as list — Claude Code stdin guarantees dict. Low risk.

Minor follow-up suggestions (non-blocking)

These are nice-to-haves and can be addressed later if desired:

  1. Lock in the `♻` prefix explicitly. Current tests assert presence of token values but not the recycle symbol itself. If someone changes `♻` to another character, tests would still pass. Consider adding `assert "\u267b" in result` to one of the tests.
  2. Rounding boundary coverage. `1549 → 1.5k` vs `1550 → 1.6k` boundaries aren't covered. Minor.
  3. Tighten `test_partial_cache_read` assertion. Currently `"1500" in result or "1.5k" in result` — since the actual output is `♻800/1.5k`, the `or` branch is too permissive. Could assert `"800/1.5k"` for the exact contract.
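
To illustrate suggestions 1 and 3, here is a small sketch comparing the permissive assertion with a tightened one. The "regressed" strings are hypothetical outputs invented for the comparison, not observed behavior.

```python
def loose_check(result):
    """The current assertion shape: passes if either spelling of the total appears."""
    return "1500" in result or "1.5k" in result


def strict_check(result):
    """Tightened shape: require the recycle glyph and the exact read/total pair."""
    return "\u267b" in result and "800/1.5k" in result


expected = "\u267b800/1.5k"
wrong_numerator = "\u267b1500/1.5k"  # hypothetical regression: numerator shows the total
wrong_glyph = "#800/1.5k"            # hypothetical regression: glyph replaced

for candidate in (expected, wrong_numerator, wrong_glyph):
    print(repr(candidate), loose_check(candidate), strict_check(candidate))
```

The loose check accepts all three strings, while the strict check accepts only the expected one.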

None of these block the merge. Ready to go in.

Closes #1355, closes #1354.

@JeremyDev87 JeremyDev87 self-assigned this Apr 5, 2026

@JeremyDev87 JeremyDev87 left a comment

Owner Author


EVAL Mode Review — PR #1359

CI Status

PASS: 29/29 jobs green (lint, format, typecheck, test:coverage, circular, build, and security audit all pass)

Local re-verification:

  • yarn workspace codingbuddy-claude-plugin lint
  • yarn workspace codingbuddy-claude-plugin format:check
  • yarn workspace codingbuddy-claude-plugin typecheck
  • python3 -m pytest packages/claude-code-plugin/tests/test_hud.py -v → 105/105 pass

Severity Summary

  • Critical: 0
  • High: 0
  • Medium: 2
  • Low: 3

Findings

Critical (0)

None.

High (0)

None.

Medium (2)

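M1. Missing dedicated unit tests for format_compact_tokens

  • format_compact_tokens is currently verified only indirectly, through format_cache_segment (test_hud.py has no TestFormatCompactTokens class)
  • Missing boundary cases: 999 (the < 1000 path), 1000 (exact boundary), 1001 (trim-logic edge), None/invalid types (the try/except path)
  • Because this function is a public helper that can be reused outside the status bar, pinning its contract with standalone tests is recommended
  • Suggestion: add a TestFormatCompactTokens class (6-8 cases), either in a follow-up PR or piggybacked on #1357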
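
One possible shape for the suggested TestFormatCompactTokens class. The `format_compact_tokens` body below is a stand-in reconstructed from this review's description (the `< 1000` path, the integer-k path, the `.1f` path, and the try/except fallback), not the project's actual source. The expected values follow from that stand-in.

```python
def format_compact_tokens(n):
    """Stand-in reconstructed from the review; the real helper may differ."""
    try:
        n = int(n)
    except (TypeError, ValueError):
        return "0"
    if n < 1000:
        return str(n)
    k = n / 1000
    return f"{int(k)}k" if k == int(k) else f"{k:.1f}k"


class TestFormatCompactTokens:
    def test_below_threshold_is_unchanged(self):
        assert format_compact_tokens(999) == "999"

    def test_exact_thousand_trims_decimal(self):
        assert format_compact_tokens(1000) == "1k"

    def test_just_above_thousand_keeps_one_decimal(self):
        assert format_compact_tokens(1001) == "1.0k"

    def test_large_value(self):
        assert format_compact_tokens(128000) == "128k"

    def test_none_falls_back_to_zero(self):
        assert format_compact_tokens(None) == "0"

    def test_non_numeric_string_falls_back_to_zero(self):
        assert format_compact_tokens("abc") == "0"
```

The class is collectable by pytest as written and needs no fixtures.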

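M2. Minor formatting inconsistency in format_compact_tokens

  • 1000 → "1k" (integer path, trim applied)
  • 1001 → "1.0k" (k = 1.001, k != int(k) → f"{1.001:.1f}k" → "1.0k")
  • The docstring says "trimmed of trailing .0", but values from 1001 to 1049 display as 1.0k, which looks untrimmed (it is actually the result of the .1f format)
  • Not a functional defect, but the documentation and behavior diverge slightly. Recommend either locking in the actual behavior with a test or clarifying the docstring to "near-thousand values may display as 1.0k"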
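
The arithmetic behind M2 can be checked directly. The sketch below applies only the `.1f` formatting step described above (the helper name `one_decimal_k` is made up for the illustration) and shows that every value from 1001 to 1049 collapses to the same string.

```python
def one_decimal_k(n):
    """Apply only the non-integer path: divide by 1000 and format with one decimal."""
    return f"{n / 1000:.1f}k"


# Every value in 1001..1049 rounds down to 1.0 at one decimal place.
outputs = {one_decimal_k(n) for n in range(1001, 1050)}
print(outputs)              # {'1.0k'}
print(one_decimal_k(1100))  # 1.1k
```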

Low (3)

L1. Undocumented error fallback in format_compact_tokens

  • The try/except (TypeError, ValueError) → '0' return path is not mentioned in the docstring
  • Recommend adding one line: "Returns '0' when input is not a valid integer."

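L2. Inline use of a Unicode glyph

  • \u267b (♻) is embedded directly inside format_cache_segment
  • Extracting it to a module-level constant (CACHE_RECYCLE_GLYPH = "\u267b") would make the intent clearer and create a future theming/customization point
  • Pure nit; the current code works fine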
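
A rough sketch of what the L2 extraction could look like. `CACHE_RECYCLE_GLYPH` is the name proposed in this review. `render_cache_segment` and its `glyph` parameter are hypothetical and exist only to show the customization point.

```python
CACHE_RECYCLE_GLYPH = "\u267b"  # recycle symbol shown before the cache token counts


def render_cache_segment(read_text, total_text, glyph=CACHE_RECYCLE_GLYPH):
    """Hypothetical renderer: the glyph comes from one named constant, not an inline literal."""
    return f"{glyph}{read_text}/{total_text}"


print(render_cache_segment("2k", "3.5k"))             # ♻2k/3.5k
print(render_cache_segment("2k", "3.5k", glyph="C:"))  # C:2k/3.5k
```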

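L3. Seven leftover examples in status-bar-model.md

  • packages/claude-code-plugin/docs/status-bar-model.md still contains Cache:XX% examples in 7 places (lines 23, 34, 166, 172, 179, 186, 193)
  • The PR body explicitly delegates this to #1357 (Wave 2) to avoid overlapping file changes
  • Not a blocker for this PR; noted for tracking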

Spec Compliance

#1355 acceptance criteria: ✓ all met

  • Cache:XX% removal confirmed (test_status_line_no_longer_contains_cache_percent)
  • Contract implemented: cache_read_input_tokens as numerator, input + cache_create + cache_read as denominator
  • Compact format for the status bar (♻2k/3.5k)
  • current_usage missing/null → segment omitted entirely (covered explicitly by tests)
  • Stable formatting for small and large values (532, 1k, 1.5k, 128k)

#1354 acceptance criteria: ✓ all met (7/7 cases)

  1. test_no_context_window
  2. test_null_current_usage
  3. test_input_tokens_only_no_cache_read
  4. test_partial_cache_read
  5. test_full_cache_read_shows_raw_not_100pct
  6. test_large_values_use_k_format
  7. test_status_line_contains_raw_cache_tokens + test_regression_no_percent_in_output

#1356 acceptance criteria: ✓ met for the code scope (docs scope delegated to #1357)

  • Cache:XX% removed from the status bar ✓
  • raw token semantics ✓
  • Regression tests prevent a return to % ✓ (test_regression_no_percent_in_output, test_status_line_no_longer_contains_cache_percent)
  • Docstring clarifies the last-call semantics ✓ (9-line format_cache_segment docstring)
  • ⚠️ status-bar-model.md example updates are delegated to the #1357 follow-up

Additional Verification

Backward compatibility: ✓ safe

  • Repository-wide grep for compute_cache_hit_rate: 0 references in code
  • The only remaining reference is in docs/plans/2026-03-28-wave2-statusline-mode-detect.md (a historical planning document; no change needed)

Security: no relevant risk (read-only display helper; it does not process external input directly)

Performance: no relevant risk (called once per status-bar update, O(1))

Recommendation

APPROVE (with follow-up suggestions)

Reasoning:

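  • 0 Critical + 0 High → nothing blocks the merge
  • All #1355/#1354 acceptance criteria are met in both code and tests
  • Removing compute_cache_hit_rate is safe (no external callers)
  • The change is well bundled as a TDD RED-GREEN pair, and the need for an atomic merge is stated in the PR body
  • CI passes in full and local re-verification passes (105/105 tests, lint/format/typecheck clean)
  • The Medium/Low items are all quality-improvement suggestions and can be handled as follow-ups outside this PR's scope

Because the reviewer is also the author, --approve is unavailable, so approval is expressed via --comment. Merging is recommended.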


Reviewed by code-reviewer (EVAL mode) via codingbuddy parse_mode

@JeremyDev87

Owner Author

Review Cycle Complete — APPROVED ✅

Review panel summary (from EVAL mode reviewer, see previous review comment):

Loop termination condition met: Critical = 0 AND High = 0.

CI is green (29/29), local verification passes (105/105 python tests, lint/format/typecheck clean), and all acceptance criteria for #1355 and #1354 are satisfied. The Medium/Low findings from the reviewer are tracked as follow-up suggestions and do not block merge.

Ready for user to merge. Closing the review panel.

@JeremyDev87 merged commit 8b32125 into master on Apr 5, 2026
29 checks passed
@JeremyDev87 deleted the fix/cache-display-raw-tokens-1355-1354 branch on April 5, 2026 09:18
JeremyDev87 added a commit that referenced this pull request Apr 5, 2026
… display

Document why the status-bar cache segment renders raw tokens (♻N/M)
instead of a percentage, explaining the last-call semantics of
context_window.current_usage from Claude Code stdin.

status-bar-model.md:
- Update the example status line and segment table to show the new
  ♻2k/3.5k format instead of the deprecated Cache:XX%
- Add a new "Cache Segment Semantics (Last-Call Only)" subsection
  explaining numerator/denominator, format rules, fallback behavior,
  and a contributor caution against reintroducing percentage rendering
- Refresh all 5 Mode Examples (PLAN/ACT/EVAL/AUTO/Ready) to match
  the current v5.3.0 output
- Add an explicit note on the Ready state that the cache segment is
  hidden when no API calls have been made yet (documented fallback,
  not a bug)

codingbuddy-hud.py:
- Expand format_compact_tokens docstring with explicit output rules,
  the 1000-1049 rounding note, and the error fallback contract
  (returns "0" when input is not coercible to int — never raises)
- Expand format_cache_segment docstring with the last-call rationale,
  fallback conditions, and a contributor caution mirrored in the docs

This PR is docs-only — no behavior changes. Regression tests from
PR #1359 continue to guard against Cache:XX% reintroduction.

Closes #1357