
Add DeepSeek prefix-cache stability mode #194

Open
sternelee wants to merge 1 commit into 1jehuang:master from sternelee:feat/DeepSeek-Reasonix


sternelee commented on May 11, 2026

Adapt Reasonix Pillar 1 design for jcode's profile=deepseek path:

  • New prefix_cache_stable module with preflight checks, turn-end tool result truncation (3k token cap), and post-usage fold decisions at 50%/70%/80% thresholds.

  • CacheTracker gains cache-hit-rate tracking (record_usage, cache_hit_summary) so sessions can monitor prefix-cache effectiveness.

  • Agent messages_for_provider truncates oversized tool results when deepseek profile is active, without touching persisted session history.

  • Turn loop emits preflight warnings before API calls and periodic cache-hit summaries every 5 turns.
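The description names three fold thresholds (50%, 70%, 80%) but does not say what each tier triggers. The sketch below is one way such a post-usage decision could be tiered. The `FoldDecision` enum, its variant names, and the `fold_decision` function are hypothetical placeholders, not the PR's actual API, and the meaning attached to each tier is an assumption.

```rust
/// Hypothetical outcome of a post-usage check. The variant names and what
/// each tier means are assumptions for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FoldDecision {
    /// Below 50% of the context window: keep the prefix untouched.
    Continue,
    /// At or above 50%: only note the pressure.
    Notice,
    /// At or above 70%: prepare to fold older turns.
    Prepare,
    /// At or above 80%: fold now, accepting one cache miss.
    Fold,
}

/// Map provider-reported prompt tokens to a tier (assumed semantics).
pub fn fold_decision(prompt_tokens: u64, context_window: u64) -> FoldDecision {
    if context_window == 0 {
        return FoldDecision::Continue;
    }
    // Integer percentage avoids float comparisons at the tier boundaries.
    let pct = prompt_tokens.saturating_mul(100) / context_window;
    match pct {
        0..=49 => FoldDecision::Continue,
        50..=69 => FoldDecision::Notice,
        70..=79 => FoldDecision::Prepare,
        _ => FoldDecision::Fold,
    }
}
```

Deciding after usage is reported, rather than before each request, would mean the history is only rewritten when the provider's own token count crosses a tier, which keeps the prompt prefix byte-stable in between.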



Copilot AI review requested due to automatic review settings May 11, 2026 13:48

Copilot AI left a comment


Pull request overview

Adds a DeepSeek-specific “prefix-cache stability mode” intended to improve DeepSeek automatic prefix-cache hit rates by preflighting near-limit requests, truncating large tool results for provider requests (without mutating persisted history), and tracking provider-reported cache hit rates.

Changes:

  • Introduces prefix_cache_stable module with DeepSeek activation detection, request token estimation + preflight warnings, and tool-result truncation for API-bound messages.
  • Extends CacheTracker with cumulative cache hit/miss tracking and a hit-rate summary string.
  • Hooks the new mode into the agent turn loop (preflight warnings + periodic cache-hit summaries) and message preparation (tool-result truncation).
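As a rough illustration of the cumulative hit/miss accounting described above, here is a minimal sketch. The method names `record_usage`, `usage_turn_count`, and `cache_hit_summary` mirror the ones quoted in this PR, but the `CacheHitStats` struct, its fields, the argument types, and the summary format are assumptions. It also assumes the reported input token count already includes the cached tokens.

```rust
/// Stand-in for the hit-rate part of a cache tracker (hypothetical type).
#[derive(Debug, Default)]
pub struct CacheHitStats {
    hit_tokens: u64,
    input_tokens: u64,
    turns: u64,
}

impl CacheHitStats {
    /// Record one turn. `cache_read` is the provider-reported number of
    /// prompt tokens served from cache, if any (assumed to be an Option).
    pub fn record_usage(&mut self, cache_read: Option<u64>, input: u64) {
        // Clamp so an inconsistent report cannot push the rate above 100%.
        self.hit_tokens += cache_read.unwrap_or(0).min(input);
        self.input_tokens += input;
        self.turns += 1;
    }

    pub fn usage_turn_count(&self) -> u64 {
        self.turns
    }

    /// Cumulative hit rate in [0, 1], or None before any input is recorded.
    pub fn hit_rate(&self) -> Option<f64> {
        if self.input_tokens == 0 {
            None
        } else {
            Some(self.hit_tokens as f64 / self.input_tokens as f64)
        }
    }

    pub fn cache_hit_summary(&self) -> String {
        match self.hit_rate() {
            Some(rate) => format!(
                "{:.1}% cache hit ({}/{} input tokens, {} turns)",
                rate * 100.0,
                self.hit_tokens,
                self.input_tokens,
                self.turns
            ),
            None => "no usage recorded yet".to_string(),
        }
    }

    /// True on every fifth recorded turn; the `turns > 0` guard avoids
    /// firing before anything has been recorded (0 % 5 == 0).
    pub fn should_log_summary(&self) -> bool {
        self.turns > 0 && self.turns % 5 == 0
    }
}
```

Tracking token totals rather than averaging per-turn rates weights long prompts more heavily, which is usually what matters for cost.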

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/prefix_cache_stable.rs | New DeepSeek stability utilities: preflight estimation, post-usage decisions, tool-result truncation, and tests. |
| src/lib.rs | Exposes the new prefix_cache_stable module. |
| src/cache_tracker.rs | Adds cache hit/miss accounting and summary helpers to track prefix-cache effectiveness. |
| src/agent/turn_loops.rs | Emits DeepSeek preflight warnings and periodic cache-hit summaries; records cache usage per turn. |
| src/agent.rs | Truncates oversized tool results in messages_for_provider() when DeepSeek stability mode is active. |
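The preflight estimation listed for src/prefix_cache_stable.rs is not shown in this conversation. A minimal sketch of the idea, assuming the same chars / 4 heuristic the PR uses for tool results, might look like the following. The function names, the `warn_pct` parameter, and the warning text are all hypothetical.

```rust
/// Rough token estimate using a chars / 4 heuristic, rounded up.
pub fn estimate_tokens(texts: &[&str]) -> usize {
    let chars: usize = texts.iter().map(|t| t.chars().count()).sum();
    (chars + 3) / 4
}

/// Return a warning when the estimated request size reaches `warn_pct`
/// percent of the context window. Both parameters are assumptions; the
/// PR's actual limits are not shown here.
pub fn preflight_warning(
    texts: &[&str],
    context_window: usize,
    warn_pct: usize,
) -> Option<String> {
    if context_window == 0 {
        return None;
    }
    let estimated = estimate_tokens(texts);
    let limit = context_window.saturating_mul(warn_pct) / 100;
    if estimated >= limit {
        Some(format!(
            "estimated {} tokens is at or above {}% of the {}-token window",
            estimated, warn_pct, context_window
        ))
    } else {
        None
    }
}
```

The estimate is deliberately cheap and approximate; it only needs to flag requests that are likely near the limit before the API call is made.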


Comment thread src/agent.rs
Comment on lines +527 to +535
    let (truncated, truncate_count) =
        prefix_cache_stable::truncate_tool_results_for_api(&messages);
    if truncate_count > 0 {
        logging::info(&format!(
            "Prefix-cache mode: truncated {} tool results for API",
            truncate_count
        ));
    }
    truncated
Comment thread src/agent.rs
Comment on lines +560 to +568
    let messages = if prefix_cache_stable::is_prefix_cache_stable_mode() {
        let (truncated, truncate_count) =
            prefix_cache_stable::truncate_tool_results_for_api(&messages);
        if truncate_count > 0 {
            logging::info(&format!(
                "Prefix-cache mode: truncated {} tool results for API (session path)",
                truncate_count
            ));
        }
Comment thread src/agent/turn_loops.rs
Comment on lines +529 to +534
    // Record cache usage for prefix-cache hit-rate tracking
    self.cache_tracker.record_usage(usage_cache_read, usage_input.unwrap_or(0));
    if prefix_cache_stable::is_prefix_cache_stable_mode()
        && self.cache_tracker.usage_turn_count() % 5 == 0
    {
        logging::info(&format!("Prefix-cache stats: {}", self.cache_tracker.cache_hit_summary()));
Comment on lines +35 to +36
    /// Max chars for a tool result after turn-end truncation (chars / 4 heuristic)
    const TURN_END_RESULT_CAP_CHARS: usize = TURN_END_RESULT_CAP_TOKENS * 4;
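To make the cap concrete, the sketch below shows one way a character-based truncation could be applied without mutating the caller's data. The value 3_000 for `TURN_END_RESULT_CAP_TOKENS` is taken from the "3k token cap" in the PR description and is an assumption about the constant's exact value. `truncate_result`, `truncate_all`, and the truncation marker text are hypothetical and are not the PR's `truncate_tool_results_for_api`.

```rust
// Assumed from the PR description's "3k token cap".
const TURN_END_RESULT_CAP_TOKENS: usize = 3_000;
/// Max chars for a tool result (chars / 4 heuristic).
const TURN_END_RESULT_CAP_CHARS: usize = TURN_END_RESULT_CAP_TOKENS * 4;

/// Truncate `text` to at most `cap_chars` Unicode scalar values.
/// Slicing at a `char_indices` offset keeps the cut on a UTF-8 boundary,
/// so multi-byte characters cannot cause a panic.
pub fn truncate_result(text: &str, cap_chars: usize) -> (String, bool) {
    match text.char_indices().nth(cap_chars) {
        // A char exists at index `cap_chars`, so the text is over the cap.
        Some((byte_idx, _)) => {
            let omitted = text[byte_idx..].chars().count();
            let out = format!("{}\n[truncated {} chars]", &text[..byte_idx], omitted);
            (out, true)
        }
        None => (text.to_string(), false),
    }
}

/// Build API-bound copies of the results and count how many were cut.
/// The input slice is left untouched, mirroring the idea of not
/// modifying persisted session history.
pub fn truncate_all(results: &[String]) -> (Vec<String>, usize) {
    let mut count = 0;
    let out = results
        .iter()
        .map(|r| {
            let (text, cut) = truncate_result(r, TURN_END_RESULT_CAP_CHARS);
            if cut {
                count += 1;
            }
            text
        })
        .collect();
    (out, count)
}
```

Because the truncation is deterministic, the same oversized result produces the same truncated text on every later request, which is what keeps the prompt prefix stable for caching.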