fix(memory): close triage gaps and add proof tests (#579)
Conversation
Walkthrough: Refactors memory-persistence trigger logic; adds working-memory sanitization (redaction, encoded-secret fail-closed, truncation) and applies it to worker spawn, cron errors, and task messages.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
Pull request overview
This PR improves the safety and correctness of the working-memory and knowledge-synthesis subsystems by ensuring sensitive/user-provided strings aren't persisted verbatim, tightening regeneration/persistence triggering logic, and adding targeted unit tests to prevent regressions.
Changes:
- Redacts + bounds worker task text and cron error summaries before persisting them into working memory.
- Refactors memory persistence trigger selection and knowledge synthesis regeneration debounce/version logic into testable pure helpers.
- Expands unit test coverage across working memory rendering, task update summaries, compactor invariants, and cron error persistence.
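The trigger refactor above follows a fixed priority order (message count first, then elapsed time, then event density). A minimal sketch of such a pure helper is shown below; the struct fields mirror the `WorkingMemoryConfig` threshold names mentioned later in this review, but the real signature, field types, and return type in the PR may differ.

```rust
// Illustrative sketch only: names mirror the review's description of
// memory_persistence_trigger_kind, not the crate's actual API.
#[derive(Debug, PartialEq)]
enum TriggerKind {
    MessageCount,
    TimeElapsed,
    EventDensity,
}

struct WorkingMemoryConfig {
    persistence_message_threshold: usize,
    persistence_time_threshold_secs: u64,
    persistence_event_density_threshold: usize,
}

fn memory_persistence_trigger_kind(
    message_count: usize,
    elapsed_secs: u64,
    event_count_since_last: Option<usize>,
    config: &WorkingMemoryConfig,
) -> Option<TriggerKind> {
    // Priority order: message count, then elapsed time, then event density.
    if message_count >= config.persistence_message_threshold {
        return Some(TriggerKind::MessageCount);
    }
    if elapsed_secs >= config.persistence_time_threshold_secs {
        return Some(TriggerKind::TimeElapsed);
    }
    match event_count_since_last {
        Some(n) if n >= config.persistence_event_density_threshold => {
            Some(TriggerKind::EventDensity)
        }
        _ => None,
    }
}

fn main() {
    let config = WorkingMemoryConfig {
        persistence_message_threshold: 20,
        persistence_time_threshold_secs: 300,
        persistence_event_density_threshold: 50,
    };
    assert_eq!(
        memory_persistence_trigger_kind(25, 10, None, &config),
        Some(TriggerKind::MessageCount)
    );
    assert_eq!(
        memory_persistence_trigger_kind(5, 400, None, &config),
        Some(TriggerKind::TimeElapsed)
    );
    assert_eq!(
        memory_persistence_trigger_kind(5, 10, Some(60), &config),
        Some(TriggerKind::EventDensity)
    );
    assert_eq!(memory_persistence_trigger_kind(5, 10, None, &config), None);
}
```

Because the helper is pure (no DB access, no clocks), each priority rule can be unit tested in isolation, which is the point of the refactor.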
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/tools/task_update.rs | Adds task update delta summaries for working memory events + tests for summary behavior and done/outcome behavior. |
| src/tools/spawn_worker.rs | Sanitizes worker task text before emitting “worker spawned (cortex)” working-memory events. |
| src/tasks/store.rs | Extends task update result to include a “previous task” snapshot and adds tests validating snapshot pairing. |
| src/secrets/scrub.rs | Introduces scrub_working_memory_text with secret/leak scrubbing + encoded-leak fail-closed behavior + truncation and tests. |
| src/memory/working.rs | Adds token-budget enforcement when rendering the channel activity map and adds coverage for new rendering behaviors. |
| src/cron/scheduler.rs | Scrubs/bounds cron error summaries before writing them to working memory + tests covering redaction/truncation. |
| src/api/channels.rs | Stops swallowing errors in inspect_prompt; logs + propagates failures as 500s instead. |
| src/agent/cortex.rs | Extracts knowledge synthesis regeneration decision logic into a pure helper and adds debounce/version tests. |
| src/agent/compactor.rs | Adds tests ensuring compactor stays single-turn and toolless; tests summary header stripping. |
| src/agent/channel_dispatch.rs | Scrubs/bounds worker spawn task text persisted to working memory (builtin + opencode paths) + tests. |
| src/agent/channel.rs | Refactors memory persistence trigger selection into a helper and adds unit tests for trigger priority rules. |
| docs/design-docs/working-memory-triage.md | Marks triage items as resolved and documents the fixes shipped in this PR. |
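The `scrub_working_memory_text` pipeline described in the table above has three stages: redact known secrets, fail closed if an encoded form of a secret survives, then bound the length. The sketch below shows only that shape; the real implementation also runs pattern-based leak scanning, and the `[REDACTED]` marker and hex-only encoding check here are simplifying assumptions (the fail-closed sentinel string is taken from the review comments).

```rust
// Sketch of the three stages named in the review: redact, fail closed on
// encoded leaks, truncate. Not the PR's actual implementation.
fn scrub_working_memory_text(text: &str, secrets: &[&str], max_chars: usize) -> String {
    let mut out = text.to_string();
    // Stage 1: redact known secret values verbatim.
    for secret in secrets {
        out = out.replace(secret, "[REDACTED]");
    }
    // Stage 2: fail closed if a hex-encoded secret survives redaction.
    for secret in secrets {
        let hex: String = secret.bytes().map(|b| format!("{b:02x}")).collect();
        if out.contains(&hex) {
            return "[WORKING_MEMORY_REDACTED:encoded-secret]".to_string();
        }
    }
    // Stage 3: bound the text so one event can't blow the token budget.
    if out.chars().count() > max_chars {
        out = out.chars().take(max_chars).collect();
        out.push('…');
    }
    out
}

fn main() {
    let secrets = ["hunter2"];
    assert_eq!(
        scrub_working_memory_text("token=hunter2 ok", &secrets, 100),
        "token=[REDACTED] ok"
    );
    // hex("hunter2") = "68756e74657232", so this triggers the fail-closed path.
    assert_eq!(
        scrub_working_memory_text("68756e74657232", &secrets, 100),
        "[WORKING_MEMORY_REDACTED:encoded-secret]"
    );
    assert_eq!(scrub_working_memory_text("abcdef", &secrets, 3), "abc…");
}
```

The fail-closed stage replaces the whole text rather than the matched substring: once a secret shows up in an encoded form, surrounding context can no longer be trusted.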
Actionable comments posted: 3
🧹 Nitpick comments (5)
src/agent/compactor.rs (1)

457-474: Source-scan guard is reasonable, but slightly fragile.

Scanning the file's own source with `include_str!` to assert invariants works, but it has a couple of brittle edges worth noting:

- `source.find("#[cfg(test)]")` assumes the tests module is the only/first such attribute in the file. Any future `#[cfg(test)]` helper placed above the tests module would silently shrink the scanned region and could hide a regression.
- The assertions target exact substrings (`.tool_server_handle(`, `.default_max_turns(1)`). A harmless formatting change (e.g., a line break between `.default_max_turns` and `(1)`, or a constant like `const MAX_TURNS: usize = 1;`) would fail the test without an actual behavior change.

A more robust alternative would be to factor the agent construction into a small helper returning the builder (or its config) and assert on that directly, but given this is an intentional belt-and-suspenders guard, the current form is acceptable.
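The suggested alternative can be sketched as a pure policy value that tests assert on directly instead of grepping source. The names `CompactionPromptPolicy` and `compaction_prompt_policy` come from the later review diff in this thread; the builder wiring itself is omitted here and the exact fields are assumptions.

```rust
// Sketch: express the compactor's invariants as data a test can assert on,
// instead of substring-matching the file's own source.
#[derive(Debug)]
struct CompactionPromptPolicy {
    max_turns: usize,
    tool_server_enabled: bool,
}

fn compaction_prompt_policy() -> CompactionPromptPolicy {
    // The compactor must stay single-turn and toolless.
    CompactionPromptPolicy {
        max_turns: 1,
        tool_server_enabled: false,
    }
}

fn main() {
    let policy = compaction_prompt_policy();
    // These assertions survive formatting changes that would break a
    // substring-based source scan.
    assert_eq!(policy.max_turns, 1);
    assert!(!policy.tool_server_enabled);
}
```

A formatting-only refactor of the builder call chain no longer breaks the test, because the invariant lives in one value rather than in the source text.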
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/compactor.rs` around lines 457 - 474: the current test run_compaction_path_remains_toolless_and_one_turn brittlely scans the file source with include_str! and substring searches; instead extract the compactor/summary builder creation into a small helper (e.g., a function like build_compactor_summary_builder or get_run_compaction_builder) that returns the builder or its config, then update the test to call that helper and assert on the builder/config (check absence of tool_server_handle and that default_max_turns equals 1) rather than string-matching the source; update references to run_compaction_source and include_str! to use the new helper in assertions.

src/agent/cortex.rs (1)
5062-5100: Tests cover the intended invariants.

The newer-version requirement, the debounce boundary (`1_050` rejects, `1_060` accepts with `debounce=60`), and clock-skew safety (`now < last_change`) are all exercised. Consider adding one more case for `debounce_secs == 0` with `now == last_change` if that's a reachable config, since the clamp-then-`>=` path returns `true` there — useful to pin down intentionally vs. accidentally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/cortex.rs` around lines 5062 - 5100: add a test that covers the edge case when debounce_secs == 0 and now == last_change to assert the function returns true; specifically, in the test module that contains knowledge_synthesis_trigger_* tests, add a case calling should_regenerate_knowledge_synthesis_state(current_version > last_version, last_version, last_change, 0, last_change) and assert true so the clamp-then->= path is explicitly exercised and the behavior is pinned down.

src/agent/channel.rs (1)
3734-3766: Nit: double-call pattern could be folded into one evaluation.

The helper is called twice — once with `None` to decide whether to query the DB, then again with the resolved count. Since the message/time triggers don't depend on `event_count_since_last`, you can compute the trigger once and only consult the DB when the first result is `None`:

♻️ Proposed simplification

```diff
-        let event_count_since_last = if memory_persistence_trigger_kind(
-            self.message_count,
-            elapsed.as_secs(),
-            None,
-            &wm_config,
-        )
-        .is_none()
-        {
-            let since = chrono::Utc::now() - chrono::Duration::seconds(elapsed.as_secs() as i64);
-            match self
-                .deps
-                .working_memory
-                .count_events_since(self.id.as_ref(), since)
-                .await
-            {
-                Ok(count) => Some(count as usize),
-                Err(error) => {
-                    tracing::debug!(%error, "event density check failed, skipping");
-                    None
-                }
-            }
-        } else {
-            None
-        };
-
-        let Some(trigger) = memory_persistence_trigger_kind(
-            self.message_count,
-            elapsed.as_secs(),
-            event_count_since_last,
-            &wm_config,
-        ) else {
-            return;
-        };
+        let elapsed_secs = elapsed.as_secs();
+        let trigger = match memory_persistence_trigger_kind(
+            self.message_count,
+            elapsed_secs,
+            None,
+            &wm_config,
+        ) {
+            Some(kind) => kind,
+            None => {
+                let since = chrono::Utc::now() - chrono::Duration::seconds(elapsed_secs as i64);
+                let event_count = match self
+                    .deps
+                    .working_memory
+                    .count_events_since(self.id.as_ref(), since)
+                    .await
+                {
+                    Ok(count) => Some(count as usize),
+                    Err(error) => {
+                        tracing::debug!(%error, "event density check failed, skipping");
+                        None
+                    }
+                };
+                let Some(kind) = memory_persistence_trigger_kind(
+                    self.message_count,
+                    elapsed_secs,
+                    event_count,
+                    &wm_config,
+                ) else {
+                    return;
+                };
+                kind
+            }
+        };
```

Functionally equivalent; just avoids recomputing the message/time predicates.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel.rs` around lines 3734 - 3766: the code calls memory_persistence_trigger_kind twice (once with None to decide whether to call deps.working_memory.count_events_since, then again with the resolved event_count_since_last); fold this into one evaluation by first calling memory_persistence_trigger_kind(self.message_count, elapsed.as_secs(), None, &wm_config) and if that returns None then await deps.working_memory.count_events_since(self.id.as_ref(), since) to get event_count_since_last and re-run memory_persistence_trigger_kind only once with the actual count, assigning the resulting Some(trigger) to trigger (preserving the existing tracing::debug on DB error and the early return when trigger is None).

src/api/channels.rs (1)
776-795: Consider including `project.id` context in aggregate-level warnings too.

Per-project `list_repos`/`list_worktrees_with_repos` failures correctly include `project_id` in the warning. For symmetry and easier triage, the aggregate-level renders (`render_projects_context` at line 829, top-level `list_projects` at line 767) could log the number of projects being processed so an operator can correlate a template-render failure with the specific input size.

This is purely a log-ergonomics nit — the current logs are already sufficient for correctness.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/api/channels.rs` around lines 776 - 795: the aggregate-level warning logs in render_projects_context and list_projects should include the number of projects being processed for better triage; update the tracing::warn calls that currently report template-render or aggregate failures to add a context field (e.g., projects_count or projects_len) with projects.len() (or the collection length) so operators can correlate failures with input size while keeping existing messages and error fields intact.

src/tasks/store.rs (1)
1094-1119: Good snapshot-pair coverage; consider also asserting an unchanged field.

The test verifies both mutated fields (title, priority) on the `previous_task`/`task` pair. As a low-effort strengthening, asserting an unchanged field too (e.g. `previous_task.status == result.task.status == TaskStatus::Backlog`, or `assigned_agent_id`) would lock in that the "before" snapshot isn't accidentally populated from the updated row in future refactors.

🧪 Optional tightening

```diff
     assert_eq!(result.previous_task.title, "old title");
     assert_eq!(result.previous_task.priority, TaskPriority::Medium);
+    assert_eq!(result.previous_task.status, TaskStatus::Backlog);
+    assert_eq!(result.task.status, TaskStatus::Backlog);
     assert_eq!(result.task.title, "new title");
     assert_eq!(result.task.priority, TaskPriority::High);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tasks/store.rs` around lines 1094 - 1119, The test update_result_returns_applied_snapshot_pair currently asserts changed fields but not an unchanged field, so add an assertion that a field not modified by the update (e.g., previous_task.status and task.status both equal TaskStatus::Backlog, or previous_task.assigned_agent_id == result.task.assigned_agent_id) to ensure the "previous_task" snapshot isn't accidentally updated; locate the test function update_result_returns_applied_snapshot_pair and add a concise assert_eq! comparing the unchanged field on result.previous_task and result.task to the expected value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/agent/channel_dispatch.rs`:
- Around line 806-813: The code currently sanitizes the WorkerSpawned memory
event via sanitize_worker_memory_task but still emits or logs the raw task
elsewhere; replace all uses of the unsanitized task in
ProcessEvent::WorkerStarted payloads and tracing::info! calls with the sanitized
value (e.g., use the memory_task or a newly returned sanitized_task from
sanitize_worker_memory_task) so no raw task ever flows to
ProcessEvent::WorkerStarted, tracing::info!, or any other emitted events; apply
the same change to the other occurrence around the channel worker spawn logic
(the second block analogous to this one) to ensure consistency.
In `@src/secrets/scrub.rs`:
- Around line 262-278: redact_working_memory_encoded_leaks currently only calls
scan_for_leaks, which misses encoded forms of repo/tool secrets; update it to
also detect encoded variants of each secret from tool_secrets passed into
scrub_working_memory_text (e.g., base64, hex, percent-URL encodings) and treat a
match as a redact trigger. In practice modify scrub_working_memory_text to pass
tool_secrets into redact_working_memory_encoded_leaks (or call a new helper) and
implement detection by generating common encodings for each tool secret and
checking if any appear in text (or by decoding suspicious substrings and running
scrub_secrets/scan_for_leaks on the decoded content); if any encoded secret is
found, return the same "[WORKING_MEMORY_REDACTED:encoded-secret]" result. Ensure
function names referenced: scrub_working_memory_text,
redact_working_memory_encoded_leaks, scrub_secrets, and scan_for_leaks.
In `@src/tools/spawn_worker.rs`:
- Around line 525-530: The surrounding code logs and persists the unsanitized
task string (args.task) even though you created a sanitized memory_task; update
all logging and persistence calls to use the sanitized value instead of the raw
args.task. Specifically, replace uses of args.task in calls like
run_logger.log_worker_started(...) and any tracing/logging or worker_runs
persistence with memory_task (or a clearly named sanitized_task) so that
WorkingMemoryEventType::WorkerSpawned emit(...) already uses the sanitized text
and nothing else writes the original un-sanitized task; also audit any other
detached-worker persistence paths to ensure they consume
sanitize_worker_memory_task(...) output rather than args.task.
---
Nitpick comments:
In `@src/agent/channel.rs`:
- Around line 3734-3766: The code calls memory_persistence_trigger_kind twice
(once with None to decide whether to call
deps.working_memory.count_events_since, then again with the resolved
event_count_since_last); fold this into one evaluation by first calling
memory_persistence_trigger_kind(self.message_count, elapsed.as_secs(), None,
&wm_config) and if that returns None then await
deps.working_memory.count_events_since(self.id.as_ref(), since) to get
event_count_since_last and re-run memory_persistence_trigger_kind only once with
the actual count, assigning the resulting Some(trigger) to trigger (preserving
the existing tracing::debug on DB error and the early return when trigger is
None).
In `@src/agent/compactor.rs`:
- Around line 457-474: The current test
run_compaction_path_remains_toolless_and_one_turn brittlely scans the file
source with include_str! and substring searches; instead extract the
compactor/summary builder creation into a small helper (e.g., a function like
build_compactor_summary_builder or get_run_compaction_builder) that returns the
builder or its config, then update the test to call that helper and assert on
the builder/config (check absence of tool_server_handle and that
default_max_turns equals 1) rather than string-matching the source; update
references to run_compaction_source and include_str! to use the new helper in
assertions.
In `@src/agent/cortex.rs`:
- Around line 5062-5100: Add a test that covers the edge-case when debounce_secs
== 0 and now == last_change to assert the function returns true; specifically,
in the test module that contains knowledge_synthesis_trigger_* tests, add a case
calling should_regenerate_knowledge_synthesis_state(current_version >
last_version, last_version, last_change, 0, last_change) and assert true so the
clamp-then->= path is explicitly exercised and the behavior is pinned down.
In `@src/api/channels.rs`:
- Around line 776-795: The aggregate-level warning logs in
render_projects_context and list_projects should include the number of projects
being processed for better triage; update the tracing::warn calls that currently
report template-render or aggregate failures to add a context field (e.g.,
projects_count or projects_len) with projects.len() (or the collection length)
so operators can correlate failures with input size while keeping existing
messages and error fields intact.
In `@src/tasks/store.rs`:
- Around line 1094-1119: The test update_result_returns_applied_snapshot_pair
currently asserts changed fields but not an unchanged field, so add an assertion
that a field not modified by the update (e.g., previous_task.status and
task.status both equal TaskStatus::Backlog, or previous_task.assigned_agent_id
== result.task.assigned_agent_id) to ensure the "previous_task" snapshot isn't
accidentally updated; locate the test function
update_result_returns_applied_snapshot_pair and add a concise assert_eq!
comparing the unchanged field on result.previous_task and result.task to the
expected value.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d8248863-fac4-439e-af6b-788ed8194c45
📒 Files selected for processing (12)
- docs/design-docs/working-memory-triage.md
- src/agent/channel.rs
- src/agent/channel_dispatch.rs
- src/agent/compactor.rs
- src/agent/cortex.rs
- src/api/channels.rs
- src/cron/scheduler.rs
- src/memory/working.rs
- src/secrets/scrub.rs
- src/tasks/store.rs
- src/tools/spawn_worker.rs
- src/tools/task_update.rs
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/agent/channel.rs (1)

3724-3766: ⚠️ Potential issue | 🔴 Critical: Remove the stale `message_interval == 0` check from the persistence trigger path.

`check_memory_persistence()` delegates trigger selection to `memory_persistence_trigger_kind()`, which uses `WorkingMemoryConfig` thresholds (`persistence_message_threshold`, `persistence_time_threshold_secs`, `persistence_event_density_threshold`). However, the early guard still returns when `message_interval == 0`, bypassing the new trigger system entirely. Users who set `message_interval: 0` to disable the old cadence-based persistence will now lose all persistence triggers, including time and event-density checks. The config comments already acknowledge this field is legacy; remove the gate so the new thresholds control trigger behavior independently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel.rs` around lines 3724 - 3766: the early-return guard in check_memory_persistence currently blocks the new trigger system by returning when config.message_interval == 0; remove that condition so the guard only checks config.enabled and self.resolved_settings.memory.persistence_enabled() (i.e., stop using config.message_interval in the initial return), allowing memory_persistence_trigger_kind(...) to handle message/time/event thresholds; update the guard around self.deps.runtime_config.memory_persistence.load() accordingly and leave the downstream call to memory_persistence_trigger_kind, the fallback event-density check that uses self.deps.working_memory.count_events_since, unchanged.

src/agent/channel_dispatch.rs (1)
758-781: ⚠️ Potential issue | 🟠 Major: Extract duplicate-detection key from task before sanitization.

The sanitization pipeline truncates tasks to 500 characters and redacts secrets/leaks, then stores this sanitized version in `StatusBlock`. However, duplicate detection still checks the original full task against the `StatusBlock` contents. For tasks longer than 500 characters, the truncated version in `StatusBlock` will not match the original task on the next duplicate check, allowing identical long tasks to spawn multiple workers.

This affects the builtin spawn path (line 781), OpenCode spawn (line 994), resumed OpenCode workers (line 1263), and resumed builtin workers (line 1393). The reservation mechanism uses the original task string but is released immediately after spawn succeeds, allowing the race condition to manifest.

Compute a stable duplicate-detection key (using the original task or a deterministic normalization) before sanitization, and pass both the key and sanitized display text to `add_worker`. Modify `find_duplicate_worker_task` to compare against the stored keys instead of the display text.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel_dispatch.rs` around lines 758 - 781, Compute a stable duplicate-detection key from the original task BEFORE calling crate::secrets::scrub::scrub_worker_task_for_memory (e.g., hash or normalized string) and pass both that key and the sanitized display text into StatusBlock::add_worker instead of only the sanitized text; update calls that create workers (the builtin path around spawn_worker_task and the OpenCode/resume paths mentioned) to supply the pre-sanitization key, and change find_duplicate_worker_task to compare against the stored duplicate keys in StatusBlock rather than the display text so truncated/redacted display text no longer prevents duplicate detection.
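The pre-sanitization key suggested above can be as simple as a hash of the whitespace-normalized original task. The sketch below is an assumption, not the PR's code; `DefaultHasher` is only stable within a single process, which is sufficient for in-memory duplicate detection but would not be for persisted keys.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Sketch of a duplicate-detection key computed from the ORIGINAL task,
// before truncation/redaction, so a 500-char display string can no longer
// defeat duplicate detection for long tasks.
fn duplicate_key(task: &str) -> u64 {
    // Normalize whitespace so cosmetic differences map to the same key.
    let normalized = task.split_whitespace().collect::<Vec<_>>().join(" ");
    let mut hasher = DefaultHasher::new();
    normalized.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let long_task = "refactor the scheduler ".repeat(50); // well over 500 chars
    // Identical long tasks collide on the key even after display truncation.
    assert_eq!(duplicate_key(&long_task), duplicate_key(&long_task));
    // Cosmetic whitespace differences still collide.
    assert_eq!(duplicate_key("a  b"), duplicate_key("a b"));
    assert_ne!(duplicate_key("task a"), duplicate_key("task b"));
}
```

The spawn path would then store both the key (for `find_duplicate_worker_task`) and the sanitized text (for display) in `StatusBlock`, as the review suggests.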
🧹 Nitpick comments (3)
src/agent/compactor.rs (1)

34-43: `tool_server_enabled` is not wired into the builder.

`CompactionPromptPolicy::tool_server_enabled` is read only by the unit test; `build_compaction_summary_agent` never consumes it. If a future change flips the flag to `true`, the builder will silently keep producing a toolless agent, and the "toolless" test will still pass while giving a false sense of enforcement. Either drop the field and have the test assert the agent shape directly, or branch on the flag in the builder (e.g., attach a `tool_server_handle` when `true`).

♻️ Suggested tightening

```diff
 fn build_compaction_summary_agent(
     model: SpacebotModel,
     compactor_prompt: &str,
 ) -> rig::agent::Agent<SpacebotModel> {
     let policy = compaction_prompt_policy();
+    debug_assert!(
+        !policy.tool_server_enabled,
+        "compactor must remain toolless"
+    );
     AgentBuilder::new(model)
         .preamble(compactor_prompt)
         .default_max_turns(policy.max_turns)
         .build()
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/compactor.rs` around lines 34 - 43: the builder in build_compaction_summary_agent ignores CompactionPromptPolicy::tool_server_enabled (returned by compaction_prompt_policy()), so either wire that flag into the AgentBuilder or remove the flag and update the test; specifically, fetch the policy via compaction_prompt_policy(), check policy.tool_server_enabled, and if true attach the tool server handle to the AgentBuilder (e.g., call the builder method that sets tool_server_handle or equivalent) before calling .build(), otherwise leave current toolless build; alternatively, remove tool_server_enabled from CompactionPromptPolicy and update the unit test to assert the agent shape directly.

src/tasks/store.rs (1)
1094-1121: Test covers the new snapshot pair; consider asserting unchanged status.

The test verifies title/priority deltas and that status remains `Backlog`. Optionally assert `result.previous_status == result.task.status` here to pin down the invariant that a non-status update leaves `previous_status` equal to the post-update status, preventing regressions in downstream diff logic that keys off `previous_status`.

♻️ Optional addition

```diff
     assert_eq!(result.task.priority, TaskPriority::High);
+    assert_eq!(result.previous_status, TaskStatus::Backlog);
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tasks/store.rs` around lines 1094 - 1121: the test update_result_returns_applied_snapshot_pair should also assert that the previous and post-update statuses are identical for non-status-only updates: add an assertion comparing result.previous_task.status and result.task.status (e.g., assert_eq!(result.previous_task.status, result.task.status)) alongside the existing status/title/priority checks to lock in the invariant used by downstream diff logic.

src/agent/cortex.rs (1)
5062-5107: Good coverage of the four critical branches.

Tests exercise equality/older version (no regen), within/at debounce boundary, backward clock skew, and zero-debounce-at-instant. One optional addition worth considering: a case where `last_change_unix_secs == 0` (atomic default) with a fresh version bump — to document/pin the behavior that a never-initialized timestamp with `current > last` immediately triggers regen once debounce elapses since the epoch (effectively "fire now"). Not a correctness issue given that `bump_knowledge_synthesis_version` always stores the timestamp before incrementing the version, but the guarantee is worth locking in.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/cortex.rs` around lines 5062 - 5107, Add a test covering the case when last_change_unix_secs == 0 to document behavior when the timestamp is uninitialized: call should_regenerate_knowledge_synthesis_state with current_version > last_version and last_change_unix_secs set to 0 (use a debounce_secs value and a now_unix_secs such that now >= debounce_secs) and assert that regen triggers appropriately; reference the function should_regenerate_knowledge_synthesis_state and, optionally, mention bump_knowledge_synthesis_version in the test name or comment to show this scenario arises from the atomic default behavior.
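The debounce semantics exercised by these tests can be sketched as a small pure function. This is an assumption about shape only: the real `should_regenerate_knowledge_synthesis_state` signature and parameter order may differ, but the branch behavior below matches the cases the review enumerates (boundary values, backward clock skew, zero debounce, uninitialized timestamp).

```rust
// Sketch of the regeneration decision; names and signature are illustrative.
fn should_regenerate(
    current_version: u64,
    last_version: u64,
    last_change_unix_secs: u64,
    debounce_secs: u64,
    now_unix_secs: u64,
) -> bool {
    // Only a strictly newer version can trigger regeneration.
    if current_version <= last_version {
        return false;
    }
    // saturating_sub clamps backward clock skew (now < last_change) to zero,
    // so skew never produces a huge "elapsed" value or a panic.
    let elapsed = now_unix_secs.saturating_sub(last_change_unix_secs);
    elapsed >= debounce_secs
}

fn main() {
    // Debounce boundary from the review: 1_050 rejects, 1_060 accepts (debounce = 60).
    assert!(!should_regenerate(2, 1, 1_000, 60, 1_050));
    assert!(should_regenerate(2, 1, 1_000, 60, 1_060));
    // Backward clock skew clamps elapsed to zero and rejects.
    assert!(!should_regenerate(2, 1, 1_000, 60, 900));
    // Zero debounce at the instant of change fires immediately.
    assert!(should_regenerate(2, 1, 1_000, 0, 1_000));
    // Uninitialized timestamp (0) with a version bump fires once debounce
    // has elapsed since the epoch — the "fire now" case discussed above.
    assert!(should_regenerate(2, 1, 0, 60, 1_000));
}
```

Keeping the function pure (all timestamps passed in) is what makes these boundary cases trivially unit-testable without mocking a clock.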
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/agent/channel_dispatch.rs`:
- Around line 758-781: Compute a stable duplicate-detection key from the
original task BEFORE calling crate::secrets::scrub::scrub_worker_task_for_memory
(e.g., hash or normalized string) and pass both that key and the sanitized
display text into StatusBlock::add_worker instead of only the sanitized text;
update calls that create workers (the builtin path around spawn_worker_task and
the OpenCode/resume paths mentioned) to supply the pre-sanitization key, and
change find_duplicate_worker_task to compare against the stored duplicate keys
in StatusBlock rather than the display text so truncated/redacted display text
no longer prevents duplicate detection.
In `@src/agent/channel.rs`:
- Around line 3724-3766: The early-return guard in check_memory_persistence
currently blocks the new trigger system by returning when
config.message_interval == 0; remove that condition so the guard only checks
config.enabled and self.resolved_settings.memory.persistence_enabled() (i.e.,
stop using config.message_interval in the initial return), allowing
memory_persistence_trigger_kind(...) to handle message/time/event thresholds;
update the guard around self.deps.runtime_config.memory_persistence.load()
accordingly and leave the downstream call to memory_persistence_trigger_kind,
the fallback event-density check that uses
self.deps.working_memory.count_events_since, unchanged.
---
Nitpick comments:
In `@src/agent/compactor.rs`:
- Around line 34-43: The builder in build_compaction_summary_agent ignores
CompactionPromptPolicy::tool_server_enabled (returned by
compaction_prompt_policy()), so either wire that flag into the AgentBuilder or
remove the flag and update the test; specifically, fetch the policy via
compaction_prompt_policy(), check policy.tool_server_enabled, and if true attach
the tool server handle to the AgentBuilder (e.g., call the builder method that
sets tool_server_handle or equivalent) before calling .build(), otherwise leave
current toolless build; alternatively, remove tool_server_enabled from
CompactionPromptPolicy and update the unit test to assert the agent shape
directly.
In `@src/agent/cortex.rs`:
- Around line 5062-5107: Add a test covering the case when last_change_unix_secs
== 0 to document behavior when the timestamp is uninitialized: call
should_regenerate_knowledge_synthesis_state with current_version > last_version
and last_change_unix_secs set to 0 (use a debounce_secs value and a
now_unix_secs such that now >= debounce_secs) and assert that regen triggers
appropriately; reference the function
should_regenerate_knowledge_synthesis_state and, optionally, mention
bump_knowledge_synthesis_version in the test name or comment to show this
scenario arises from the atomic default behavior.
In `@src/tasks/store.rs`:
- Around line 1094-1121: The test update_result_returns_applied_snapshot_pair
should also assert that the previous and post-update statuses are identical for
non-status-only updates: add an assertion comparing result.previous_task.status
and result.task.status (e.g., assert_eq!(result.previous_task.status,
result.task.status)) alongside the existing status/title/priority checks to lock
in the invariant used by downstream diff logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3b0c05db-19da-440c-a9ce-897215062cbb
📒 Files selected for processing (8)
- src/agent/channel.rs
- src/agent/channel_dispatch.rs
- src/agent/compactor.rs
- src/agent/cortex.rs
- src/api/channels.rs
- src/secrets/scrub.rs
- src/tasks/store.rs
- src/tools/spawn_worker.rs
🚧 Files skipped from review as they are similar to previous changes (3)
- src/tools/spawn_worker.rs
- src/secrets/scrub.rs
- src/api/channels.rs
This pull request focuses on improving the safety, correctness, and testability of working memory and knowledge synthesis features. It addresses several bug reports and review findings, particularly around sensitive data handling, event triggers, and regeneration logic. Key changes include sanitizing persisted worker tasks, refactoring memory persistence triggers for clarity and testability, and improving the debounce logic for knowledge synthesis regeneration. Comprehensive tests have been added for new and existing logic.
Working Memory Safety and Sanitization:

Memory Persistence Trigger Refactor:

- Extracted trigger selection into a pure helper (`memory_persistence_trigger_kind`) for clarity and reuse, with updated logic to prefer message count, then time, then event density. This is now thoroughly unit tested. [1] [2] [3] [4]

Knowledge Synthesis Regeneration Logic:

Bug Fixes and Reporting:

- Stopped swallowing errors in `inspect_prompt`; failures are now logged and propagated per guidelines.

Compactor and Miscellaneous:
These changes collectively enhance the reliability, security, and maintainability of the system's memory and synthesis features.