
feat: add memcache append-only log tools and fix set_state return bug #218

Closed
sheeki03 wants to merge 8 commits into GitHubSecurityLab:main from sheeki03:feat/memcache-append-log

@sheeki03

Problem

memcache_set_state is destructive — each call overwrites the previous value. During long security audits, the agent repeatedly calls set_state to store findings, and earlier discoveries are lost when later ones overwrite the same key. There is no append-only accumulation primitive.

Separately, SqliteBackend.set_state() returns a literal string 'f"Stored value in memory for `{key}`"' instead of the interpolated f-string, so callers see the raw template instead of the key name.

Depends on: #217 (session failure forensics, which touches the same backend files)

Changes

New MCP tools (memcache.py)

  • memcache_append_log(key, entry): Appends a timestamped entry to an append-only log under _log:{key}. Use instead of set_state when accumulating findings.
  • memcache_get_log(key): Retrieves all entries as a JSON array, ordered by insertion time.

Backend implementations

SQLite (sqlite.py):

  • append_log(): Single INSERT per entry — no read-modify-write, atomic under concurrent writers. Each row stores {"_ts": ..., "data": entry} as JSON.
  • get_log(): Queries all rows for _log:{key} ordered by ID, returns flat list. Does NOT use get_state() (which has a multi-row dict merge bug for this pattern).
  • snapshot_state(): Updated to use get_log() for _log: prefixed keys.
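For reviewers, the SQLite behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in sqlite.py: the class name SqliteLogSketch, the memcache table and its columns, and the use of time.time() for _ts are all assumptions made for the example.

```python
import json
import sqlite3
import time


class SqliteLogSketch:
    """Hypothetical stand-in for the SQLite backend; the schema is assumed."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memcache ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "key TEXT NOT NULL, value TEXT NOT NULL)"
        )

    def append_log(self, key, entry):
        # One INSERT per entry: nothing is read back and rewritten.
        row = json.dumps({"_ts": time.time(), "data": entry})
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO memcache (key, value) VALUES (?, ?)",
                (f"_log:{key}", row),
            )

    def get_log(self, key):
        # Rows come back in insertion order because the id is autoincremented.
        rows = self.conn.execute(
            "SELECT value FROM memcache WHERE key = ? ORDER BY id",
            (f"_log:{key}",),
        ).fetchall()
        return [json.loads(value) for (value,) in rows]


log = SqliteLogSketch()
log.append_log("findings", "SQL injection in /login")
log.append_log("findings", {"file": "auth.py", "line": 10})
print([e["data"] for e in log.get_log("findings")])
```

Because each entry is its own row, get_log() returns a flat list without going through any dict-merging path.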

Dictionary file (dictionary_file.py):

  • append_log(): Appends to an in-memory list via the with_memory decorator (inflate-mutate-deflate). Sequentially correct within a single process; NOT safe under concurrent writers (matches existing backend guarantees).
  • get_log(): Reads the list, returns [] for missing keys.
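The inflate-mutate-deflate pattern can be illustrated with the sketch below. It assumes a JSON file as the backing store and a simplified with_memory decorator; the real decorator in dictionary_file.py may differ, and DictFileLogSketch, _inflate, _deflate, and the timestamped entry shape are hypothetical names and choices for this example.

```python
import functools
import json
import os
import tempfile
import time


def with_memory(method):
    """Hypothetical decorator: load the file, run the method, write it back."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self.memory = self._inflate()
        try:
            return method(self, *args, **kwargs)
        finally:
            self._deflate(self.memory)

    return wrapper


class DictFileLogSketch:
    def __init__(self, path):
        self.path = path
        self.memory = {}

    def _inflate(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as fh:
            return json.load(fh)

    def _deflate(self, memory):
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(memory, fh)

    @with_memory
    def append_log(self, key, entry):
        # Read-modify-write of the whole file: fine sequentially, racy
        # if two processes do this at the same time.
        record = {"_ts": time.time(), "data": entry}
        self.memory.setdefault(f"_log:{key}", []).append(record)

    @with_memory
    def get_log(self, key):
        return self.memory.get(f"_log:{key}", [])


path = os.path.join(tempfile.mkdtemp(), "memcache.json")
backend = DictFileLogSketch(path)
backend.append_log("findings", "first")
backend.append_log("findings", "second")
print([e["data"] for e in backend.get_log("findings")])  # ['first', 'second']
```

The whole file is rewritten on every call, which is why this backend only promises sequential correctness.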

Bug fix (Fix A)

sqlite.py line 32: Changed 'f"Stored value in memory for `{key}`"' to f"Stored value in memory for `{key}`" (actual f-string interpolation).
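The difference is easy to reproduce in isolation. The snippet below assumes the original line had the quoting shown in the description; the variable names are for illustration only.

```python
key = "findings"

# Before: the f prefix sits inside the quotes, so this is an ordinary string
# and {key} is never substituted.
before = 'f"Stored value in memory for `{key}`"'

# After: a real f-string, so {key} is interpolated.
after = f"Stored value in memory for `{key}`"

print(before)
print(after)
```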

Tests (test_memcache_backend.py, 12 new tests)

  • TestAppendLogSqlite: first append creates list, sequential appends accumulate, entries have timestamps, ordering preserved, empty key returns [], 50 sequential appends with no data loss, concurrent threaded appends (5 threads × 20 entries = 100 total)
  • TestAppendLogDictFile: first append, sequential appends, empty key
  • TestSetStateReturnFix: regression test for Fix A
  • TestSetStateUnchanged: set_state still replaces values correctly
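The concurrent case (5 threads × 20 entries) can be approximated with the standalone sketch below. It is not the test from test_memcache_backend.py: the table schema, the one-connection-per-call approach, and the 30 second busy timeout are assumptions chosen to keep the example self-contained.

```python
import json
import os
import sqlite3
import tempfile
import threading

DB_PATH = os.path.join(tempfile.mkdtemp(), "memcache.db")


def connect():
    # A generous timeout lets writers wait for the lock instead of failing.
    return sqlite3.connect(DB_PATH, timeout=30)


def init_db():
    conn = connect()
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS memcache ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, key TEXT, value TEXT)"
        )
    conn.close()


def append_log(key, entry):
    conn = connect()
    try:
        with conn:  # single INSERT, committed on exit
            conn.execute(
                "INSERT INTO memcache (key, value) VALUES (?, ?)",
                (f"_log:{key}", json.dumps({"data": entry})),
            )
    finally:
        conn.close()


def get_log(key):
    conn = connect()
    try:
        rows = conn.execute(
            "SELECT value FROM memcache WHERE key = ? ORDER BY id",
            (f"_log:{key}",),
        ).fetchall()
    finally:
        conn.close()
    return [json.loads(value)["data"] for (value,) in rows]


def worker(thread_id, count):
    for i in range(count):
        append_log("findings", f"t{thread_id}-{i}")


def run_concurrent_appends(threads=5, per_thread=20):
    init_db()
    pool = [
        threading.Thread(target=worker, args=(t, per_thread))
        for t in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return get_log("findings")


entries = run_concurrent_appends()
print(len(entries), len(set(entries)))
```

Entries from different threads interleave, but no entry is lost and each thread's own entries stay in the order it wrote them.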

README documentation

Documents memcache_append_log and memcache_get_log in the Toolboxes section.

Adds _tool_call_counter, _auto_save_interval, _auto_save_dir,
_write_auto_save, and _read_tool_log to run_main. When AUTO_SAVE_DIR
and AUTO_SAVE_INTERVAL are set, tool results are periodically appended
to an NDJSON log file. Disabled by default (interval=0).

Moves write_auto_save() and read_tool_log() from closures inside
run_main() to module-level functions with explicit parameters. Tests
now exercise the real implementation instead of duplicating the logic.

- Add encoding="utf-8" to open() in write_auto_save and read_tool_log
- Catch ValueError on non-numeric AUTO_SAVE_INTERVAL with fallback to 0
- Soften docstring from "crash-safe" to "append-only"

Adds memcache_snapshot and tool_log_snapshot fields to TaskflowSession.
On mark_failed, the runner captures current memcache state via
snapshot_state() and the auto-save tool log for post-mortem inspection.
Adds snapshot_state() to sqlite and dictionary_file backends.

Adds memcache_append_log and memcache_get_log MCP tools for accumulating
findings without overwriting. Each append is a single INSERT (sqlite) or
list append (dict_file), avoiding read-modify-write races.

Also fixes sqlite set_state returning a literal f-string instead of an
interpolated one (line 32).
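For context on the auto-save commits in this list, the NDJSON logging they describe might look roughly like the sketch below. The function names write_auto_save and read_tool_log come from the commit messages, but their parameters, the tool_log.ndjson filename, and the auto_save_interval helper are assumptions made for illustration.

```python
import json
import os
import tempfile


def auto_save_interval(env=os.environ):
    """Parse AUTO_SAVE_INTERVAL, falling back to 0 (disabled) if non-numeric."""
    try:
        return int(env.get("AUTO_SAVE_INTERVAL", "0"))
    except ValueError:
        return 0


def write_auto_save(save_dir, record, filename="tool_log.ndjson"):
    # Append one JSON document per line; earlier lines are never rewritten.
    os.makedirs(save_dir, exist_ok=True)
    with open(os.path.join(save_dir, filename), "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_tool_log(save_dir, filename="tool_log.ndjson"):
    path = os.path.join(save_dir, filename)
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


save_dir = tempfile.mkdtemp()
write_auto_save(save_dir, {"tool": "memcache_append_log", "result": "ok"})
write_auto_save(save_dir, {"tool": "memcache_get_log", "result": "[]"})
print(len(read_tool_log(save_dir)))  # 2
```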