Stop runaway emergency compaction when compacted_count drifts past history #193

Open
cooleryu wants to merge 1 commit into 1jehuang:master from cooleryu:fix-compaction-overflow-175

@cooleryu cooleryu commented May 11, 2026

Summary

Fixes #175 by making stale compaction state recover instead of re-entering the emergency compaction loop.

Motivation

The issue report shows a long-running coding-agent session getting wedged after emergency compaction. Once compacted_count drifted past the actual message history length, active_messages() returned the full transcript again. As a result, the next API payload included both the summary and the messages that summary already covered, so the context stayed above the threshold. Each subsequent hard compaction then appended another [Emergency compaction] marker and pushed compacted_count even further past the history.

The important invariant is: if the caller provides the full message list, a compacted count beyond that list cannot mean "all messages are active again". It means the manager has stale state and should recover to an empty active tail.
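
The invariant above can be illustrated with a minimal sketch. This is not the code in this PR: the CompactionState struct and the generic signature are hypothetical stand-ins, and only the names compacted_count and active_messages() come from the description.

```rust
/// Hypothetical stand-in for the compaction manager's bookkeeping.
struct CompactionState {
    /// Number of leading messages already folded into the summary.
    compacted_count: usize,
}

impl CompactionState {
    /// Returns the active (not yet compacted) suffix of `messages`.
    ///
    /// Assumption: the caller passes the full message history. A count
    /// larger than that history is treated as stale and clamped, so the
    /// result is an empty tail rather than the full transcript.
    fn active_messages<'a, T>(&mut self, messages: &'a [T]) -> &'a [T] {
        if self.compacted_count > messages.len() {
            // Stale state: recover instead of replaying everything.
            self.compacted_count = messages.len();
        }
        &messages[self.compacted_count..]
    }
}
```

With a three-message history and a stored count of 7, this sketch returns an empty slice and leaves the count at 3.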

Changes

  • Treat stale compacted_count as a recoverable state and clamp it to messages.len().
  • Reset stale active-message accounting when the clamp is applied.
  • Preserve the backward-compatible no-history path used by check_and_apply_compaction().
  • Prevent hard/background compaction from advancing compacted_count beyond caller history.
  • Add restore-time warning logs for persisted stale compaction state.
  • Add regression coverage for:
    • API message assembly not replaying the full transcript after a summary
    • active_messages() clamping stale state to an empty active tail
    • token estimation ignoring stale cached active char counts
    • hard compaction not inflating stale counts or appending another emergency block
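
As a rough illustration of how the clamp, the accounting reset, and the hard-compaction guard listed above could fit together, here is a self-contained sketch. All types, fields, and method names (CompactionState, active_chars, recover_stale, hard_compact) are assumptions made for this example; the actual implementation in the PR may be structured differently.

```rust
/// Marker text as quoted in the PR description.
const EMERGENCY_MARKER: &str = "[Emergency compaction]";

/// Hypothetical, simplified compaction state.
#[derive(Debug, Default)]
struct CompactionState {
    /// Number of leading messages already covered by `summary`.
    compacted_count: usize,
    /// Accumulated summary text.
    summary: String,
    /// Cached character count of the active tail (used for token estimates).
    active_chars: usize,
}

impl CompactionState {
    /// Clamps a stale count to the caller-provided history length and
    /// drops the cached accounting that was based on the stale value.
    fn recover_stale(&mut self, history_len: usize) {
        if self.compacted_count > history_len {
            self.compacted_count = history_len;
            self.active_chars = 0;
        }
    }

    /// Hard compaction over the full history. Returns how many messages
    /// were newly compacted. The count never moves past `messages.len()`,
    /// and no marker is appended when there is nothing left to compact.
    fn hard_compact(&mut self, messages: &[String]) -> usize {
        self.recover_stale(messages.len());
        let newly_compacted = messages.len() - self.compacted_count;
        if newly_compacted == 0 {
            return 0;
        }
        self.summary.push_str(EMERGENCY_MARKER);
        self.summary.push('\n');
        self.compacted_count = messages.len();
        self.active_chars = 0;
        newly_compacted
    }
}
```

In this sketch, calling hard_compact() on already-stale state is a no-op apart from the clamp, which is the behavior the regression tests in the list describe.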

Test Plan

  • cargo test test_bug_175_ -- --nocapture
    • 4 passed
  • cargo test compaction::tests:: -- --nocapture
    • 30 passed
  • cargo fmt --all -- --check
    • passed
  • cargo check --all-targets --all-features
    • passed

I also installed and tried local clippy. On my macOS machine it stops on pre-existing macOS-only lints in crates/jcode-core/src/stdin_detect.rs; I did not mix that unrelated cleanup into this PR.

Risk

Low to moderate. This changes recovery behavior only when compacted_count is already inconsistent with the caller-provided history. Normal compaction paths should continue to use the same active suffix. The no-history compatibility path is preserved so existing callers of check_and_apply_compaction() still apply completed background compactions.
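
One way to picture the scope of the behavior change is the sketch below. It assumes, purely for illustration, that the history length is passed as an Option and that the no-history path leaves the stored count untouched; effective_compacted_count is a hypothetical helper, not a function from this PR.

```rust
/// Hypothetical helper: chooses the compacted count to use for one call.
///
/// `history_len` is `Some(len)` when the caller provides the full message
/// list and `None` on the no-history compatibility path (assumed here to
/// skip clamping because there is nothing to clamp against).
fn effective_compacted_count(stored: usize, history_len: Option<usize>) -> usize {
    match history_len {
        // Consistent state passes through unchanged; stale state is clamped.
        Some(len) => stored.min(len),
        // No history available: keep the stored value as-is.
        None => stored,
    }
}
```

Under these assumptions, only the case where the stored count exceeds a known history length produces a different result from before.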

Maintainer Context

The issue already included a useful local hotfix note. This PR turns that direction into a tested upstream fix and keeps the scope limited to the compaction invariant, recovery points, and regression coverage. It intentionally does not add a session-file migration script; that can remain a separate operational recovery step if maintainers want it.

@cooleryu cooleryu marked this pull request as ready for review May 12, 2026 13:46