docs(plugin): clarify cache segment semantics as last-call-only token display #1369
… display

Document why the status-bar cache segment renders raw tokens (♻N/M) instead of a percentage, explaining the last-call semantics of context_window.current_usage from Claude Code stdin.

status-bar-model.md:
- Update the example status line and segment table to show the new ♻2k/3.5k format instead of the deprecated Cache:XX%
- Add a new "Cache Segment Semantics (Last-Call Only)" subsection explaining numerator/denominator, format rules, fallback behavior, and a contributor caution against reintroducing percentage rendering
- Refresh all 5 Mode Examples (PLAN/ACT/EVAL/AUTO/Ready) to match the current v5.3.0 output
- Add an explicit note on the Ready state that the cache segment is hidden when no API calls have been made yet (documented fallback, not a bug)

codingbuddy-hud.py:
- Expand format_compact_tokens docstring with explicit output rules, the 1000-1049 rounding note, and the error fallback contract (returns "0" when input is not coercible to int; never raises)
- Expand format_cache_segment docstring with the last-call rationale, fallback conditions, and a contributor caution mirrored in the docs

This PR is docs-only: no behavior changes. Regression tests from PR #1359 continue to guard against Cache:XX% reintroduction.

Closes #1357
JeremyDev87
left a comment
EVAL Mode Review — PR #1369
CI Status
PASS — 28/28 jobs green (build, lint, typecheck, tests, security, circular, e2e-plugin-docker, e2e-plugin-hooks 3.11/3.12, rules-validation, plugin-validate-commands, landing checks)
Local Verification
- `yarn workspace codingbuddy-claude-plugin lint`: clean
- `yarn workspace codingbuddy-claude-plugin format:check`: Prettier clean
- `yarn workspace codingbuddy-claude-plugin typecheck`: clean
- `python3 -m pytest packages/claude-code-plugin/tests/test_hud.py -v`: 105/105 pass
- `yarn dlx markdownlint-cli2@0.20.0 packages/claude-code-plugin/docs/status-bar-model.md`: 0 errors
Severity Summary
- Critical: 0
- High: 0
- Medium: 0
- Low: 1
Findings
Critical (0)
None.
High (0)
None.
Medium (0)
None.
Low (1)
L1. format_compact_tokens docstring — ambiguous range notation
hooks/codingbuddy-hud.py lines 146-148:
> Note: values in (1000, 1049) may round to `1.0k` via `.1f` formatting.

Mathematical interval notation (1000, 1049) is an open interval, which reads as 1001–1048. The actual behavior (empirically verified against the current code) is:
- 1000 → `"1k"` (whole-k branch)
- 1001–1049 → `"1.0k"` (`.1f` formatting rounds down)
- 1050 → `"1.1k"`
So the true "rounds to 1.0k" range is [1001, 1049] inclusive. Consider rewriting as "values 1001–1049" or "values in [1001, 1049]" to remove the ambiguity. Purely a clarity nit — no downstream impact.
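The boundary behavior described in this finding can be reproduced with a small sketch. The function below is a hypothetical reconstruction assembled only from the snippets quoted in this review (`if value < 1000`, the whole-k branch, and the `.1f` branch) plus the documented "returns `"0"`, never raises" contract; the real body in `hooks/codingbuddy-hud.py` may differ in detail, and the `OverflowError` handling is an added assumption.

```python
def format_compact_tokens(value):
    """Hypothetical reconstruction; not the actual plugin source."""
    try:
        value = int(value)
    except (TypeError, ValueError, OverflowError):
        # Documented contract: non-coercible input yields "0" and never raises.
        return "0"
    if value < 1000:
        return str(value)          # raw integer, e.g. "532"
    k = value / 1000
    if k == int(k):
        return f"{int(k)}k"        # whole-k branch, e.g. "1k", "128k"
    return f"{k:.1f}k"             # one decimal place, e.g. "1.5k"


# Walk the boundary discussed in the finding.
for n in (999, 1000, 1001, 1049, 1050):
    print(n, "->", format_compact_tokens(n))
```

Under these assumptions, 1000 takes the whole-k branch while every value from 1001 through 1049 falls into the `.1f` branch, which is why the closed range [1001, 1049] describes the "rounds to 1.0k" case more precisely than the open-interval notation.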
Accuracy Verification (code ↔ docs)
All claims in the new "Cache Segment Semantics" section were cross-referenced against format_cache_segment() and format_compact_tokens() in hooks/codingbuddy-hud.py:
| Docs claim | Code reality | ✓ |
|---|---|---|
| Numerator = `cache_read_input_tokens` | `cache_read` from `usage.get("cache_read_input_tokens", 0) or 0` | ✓ |
| Denominator = input + cache_creation + cache_read | `total = input_tokens + cache_write + cache_read` | ✓ |
| `< 1000` → raw integer (`532`) | `if value < 1000: return str(value)` → `532` | ✓ |
| Whole-k → `Nk` (`1k`, `128k`) | `if k == int(k): return f"{int(k)}k"` → `1k`, `128k` | ✓ |
| Non-whole-k → `N.Nk` (`1.5k`) | `return f"{k:.1f}k"` → `1.5k` | ✓ |
| `context_window` missing → segment omitted | `usage = context_window.get("current_usage") if context_window else None; if not usage: return ""` | ✓ |
| `current_usage` null/missing → segment omitted | Same falsy-check path (also handles empty dict) | ✓ |
| All three counts 0 → segment omitted | `if total == 0: return ""` | ✓ |
| `None` coerced to 0 via `or 0` | `.get("input_tokens", 0) or 0` (×3 fields) | ✓ |
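To make the fallback paths in the table concrete, here is a minimal sketch of the segment logic, pieced together from the code fragments quoted in the "Code reality" column. It is a reconstruction for illustration, not the plugin's actual source: the key name `cache_creation_input_tokens` and the nested helper `_compact` are assumptions, and only the quoted fragments are taken from the review.

```python
def _compact(value):
    # Hypothetical stand-in for format_compact_tokens (assumed shape).
    if value < 1000:
        return str(value)
    k = value / 1000
    return f"{int(k)}k" if k == int(k) else f"{k:.1f}k"


def format_cache_segment(context_window):
    """Sketch of the last-call cache segment; returns "" on every fallback."""
    usage = context_window.get("current_usage") if context_window else None
    if not usage:
        return ""  # context_window falsy, current_usage missing/null/empty
    # `or 0` coerces explicit None values to 0 for all three fields.
    input_tokens = usage.get("input_tokens", 0) or 0
    cache_write = usage.get("cache_creation_input_tokens", 0) or 0  # assumed key
    cache_read = usage.get("cache_read_input_tokens", 0) or 0
    total = input_tokens + cache_write + cache_read
    if total == 0:
        return ""  # nothing to show for an all-zero call
    return f"♻{_compact(cache_read)}/{_compact(total)}"


print(repr(format_cache_segment(None)))
print(repr(format_cache_segment({"current_usage": None})))
print(format_cache_segment({"current_usage": {
    "input_tokens": 700,
    "cache_creation_input_tokens": None,
    "cache_read_input_tokens": 800,
}}))
```

Because the numerator and denominator both come from a single `current_usage` payload, the rendered value describes only that one call, which is the rationale the docs give for showing raw tokens rather than a percentage.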
Example Verification (mode examples)
Every mode example in the Mode Examples section was replayed through format_compact_tokens:
| Example | Reconstructed call | Output | Match |
|---|---|---|---|
| PLAN `♻1.2k/3k` | `cache_read=1200, total=3000` | `♻1.2k/3k` | ✓ |
| ACT `♻8k/13k` | `cache_read=8000, total=13000` | `♻8k/13k` | ✓ |
| EVAL `♻24k/34k` | `cache_read=24000, total=34000` | `♻24k/34k` | ✓ |
| AUTO `♻95k/172k` | `cache_read=95000, total=172000` | `♻95k/172k` | ✓ |
| Ready (cache segment hidden) | no `current_usage` in initial state → `""` | hidden | ✓ |
| Format block `♻800/1.5k` | `cache_read=800, total=1500` | `♻800/1.5k` | ✓ |
The explicit note on the Ready state ("cache segment is omitted in the initial state because no API calls have been made yet — documented fallback, not a bug") correctly communicates the design to future maintainers.
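The replay in the table can be expressed as a small table-driven check. This is a sketch under stated assumptions: `compact` and `render` are hypothetical helpers that mirror the formatting rules quoted in this review, and the case data is copied from the table rather than from the plugin's test suite.

```python
def compact(value):
    # Assumed formatting rules: raw below 1000, "Nk" for whole thousands,
    # otherwise one decimal place.
    if value < 1000:
        return str(value)
    k = value / 1000
    return f"{int(k)}k" if k == int(k) else f"{k:.1f}k"


def render(cache_read, total):
    # Hypothetical helper: the segment text for one API call.
    return f"♻{compact(cache_read)}/{compact(total)}"


# (label, cache_read, total, expected) taken from the review table.
CASES = [
    ("PLAN", 1200, 3000, "♻1.2k/3k"),
    ("ACT", 8000, 13000, "♻8k/13k"),
    ("EVAL", 24000, 34000, "♻24k/34k"),
    ("AUTO", 95000, 172000, "♻95k/172k"),
    ("Format block", 800, 1500, "♻800/1.5k"),
]

for label, cache_read, total, expected in CASES:
    actual = render(cache_read, total)
    # The segment must match the documented example and never show a percentage.
    assert actual == expected, (label, actual)
    assert "%" not in actual
    print(f"{label}: {actual}")
```

Keeping the examples as data makes it cheap to re-run the comparison whenever the Mode Examples in the docs are refreshed.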
Regression Test References
Both test paths cited by the docs exist at the expected locations:
- `tests/test_hud.py::TestFormatCacheSegment::test_regression_no_percent_in_output` → line 161
- `tests/test_hud.py::TestFormatStatusLineCacheSegment::test_status_line_no_longer_contains_cache_percent` → line 177
Both actively guard against % / Cache: in the rendered output.
Docstring ↔ Docs Alignment
The expanded docstrings in format_cache_segment and format_compact_tokens mirror the docs section without contradictions:
- Last-call semantics: both phrasings point at the same root cause (`context_window.current_usage` = most recent API call)
- Fallback list: docstring covers the same three paths as the docs (`context_window` falsy, `current_usage` missing/null, `total == 0`)
- `#1355`/`#1356` references are consistent across both
- "Contributor caution" wording is echoed from docstring to docs
Spec Compliance (#1357)
- Docs clearly state the cache segment is based on the most recent API call: "the most recent API call, not cumulative session cache efficiency" (docs) + "`context_window.current_usage` from Claude Code stdin reflects only the most recent API call" (docstring)
- Docs explain numerator and denominator: explicit formulas in both the docs section and the `format_cache_segment` docstring
- Future maintainers can understand the rationale: dedicated "Why not a percentage?" paragraph plus Contributor caution plus regression-test pointer
- Code comments near helpers explain semantics: both helpers have multi-paragraph docstrings covering contract, format, fallback, and rationale
All four acceptance criteria from #1357 are satisfied.
Regression Risk
None detected. This is a pure docs + docstring change:
- `format_cache_segment` function body is unchanged; diff is docstring-only
- `format_compact_tokens` function body is unchanged; diff is docstring-only
- `status-bar-model.md` is documentation only
- Plugin test suite (105/105) and full CI (28/28) remain green
The regression guards added in #1359 (test_regression_no_percent_in_output, test_status_line_no_longer_contains_cache_percent) continue to prevent Cache:XX% from being reintroduced.
Wave Context
This PR completes Wave 2 of the #1356 cache display refactor family. With #1359 (Wave 1) already merged and this PR approved, #1357 should auto-close via `Closes #1357` on merge.
Recommendation
APPROVE (same-author PR → posted as comment; cannot --approve)
Reasoning: All four acceptance criteria for #1357 are met with high quality. Docs and docstrings are accurate and internally consistent, every mode example was verified against the real code, all four documented fallback paths are correct, and both regression test references resolve. The single Low finding is a docstring range-notation nit with no functional impact and can be addressed in a future docs touch-up or left as-is. CI is fully green and local verification (lint/format/typecheck/pytest/markdownlint) is clean across the board.
Review Cycle Complete: APPROVED ✅

Review panel summary (from EVAL mode reviewer in dedicated panel, see previous review comment):

Loop termination condition met: Critical = 0 AND High = 0. All four acceptance criteria for #1357 verified as satisfied. Every mode example was replayed through the actual code, all fallback paths confirmed, regression test references resolved, and docstring ↔ docs alignment verified. CI is green (28/28), and local verification (lint, format, typecheck, pytest 105/105, markdownlint) is all clean. Ready for user to merge. Closing the review panel.
Summary
Documents why the status-bar cache segment renders raw tokens (`♻N/M`) instead of a percentage. Closes out the #1356 follow-up family by codifying the last-call semantics in both user-facing docs and contributor-facing docstrings.
This is a docs-only PR — no behavior changes, no new tests. The regression guards introduced in #1359 continue to prevent reintroducing `Cache:XX%`.
Changes
`packages/claude-code-plugin/docs/status-bar-model.md`
`packages/claude-code-plugin/hooks/codingbuddy-hud.py`
Acceptance Criteria (#1357)
Test plan
Wave context
This PR closes out Wave 2 of the #1356 cache display refactor:
Closes #1357