feat: capture memcache snapshot and tool log on session failure #217
Closed
sheeki03 wants to merge 6 commits into GitHubSecurityLab:main from
Adds _tool_call_counter, _auto_save_interval, _auto_save_dir, _write_auto_save, and _read_tool_log to run_main. When AUTO_SAVE_DIR and AUTO_SAVE_INTERVAL are set, tool results are periodically appended to an NDJSON log file. Disabled by default (interval=0).
Moves write_auto_save() and read_tool_log() from closures inside run_main() to module-level functions with explicit parameters. Tests now exercise the real implementation instead of duplicating the logic.
- Add encoding="utf-8" to open() in write_auto_save and read_tool_log
- Catch ValueError on non-numeric AUTO_SAVE_INTERVAL with fallback to 0
- Soften docstring from "crash-safe" to "append-only"
Adds memcache_snapshot and tool_log_snapshot fields to TaskflowSession. On mark_failed, the runner captures current memcache state via snapshot_state() and the auto-save tool log for post-mortem inspection. Adds snapshot_state() to sqlite and dictionary_file backends.
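The auto-save behaviour described in the first three commits above (append-only NDJSON log, UTF-8 encoding, fallback to 0 on a non-numeric interval) might look roughly like the sketch below. The function signatures, the log file name `tool_log.ndjson`, and the `parse_interval` helper are assumptions made for illustration; the PR text does not show them.

```python
import json
import os

# Hypothetical file name; the PR does not state what the log file is called.
LOG_NAME = "tool_log.ndjson"


def parse_interval(raw):
    """Parse an AUTO_SAVE_INTERVAL value, falling back to 0 (disabled)."""
    try:
        return int(raw) if raw else 0
    except ValueError:
        # Non-numeric input disables auto-save instead of raising.
        return 0


def write_auto_save(auto_save_dir, entry):
    """Append one tool result as a single JSON line (append-only NDJSON)."""
    os.makedirs(auto_save_dir, exist_ok=True)
    path = os.path.join(auto_save_dir, LOG_NAME)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def read_tool_log(auto_save_dir):
    """Return all logged entries, or an empty list if no log exists yet."""
    path = os.path.join(auto_save_dir, LOG_NAME)
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as fh:
        # Skip blank lines so a trailing newline does not break parsing.
        return [json.loads(line) for line in fh if line.strip()]
```

Because both helpers take the directory as an explicit parameter, tests can point them at a temporary directory and exercise the same code the runner uses.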
Problem
When mark_failed() is called after a crash or retry exhaustion, only the error string is saved. There is no record of what the agent found (memcache state) or what tools it called (tool log). Post-mortem inspection requires re-running the entire workflow.

Depends on: #216 (auto-save scaffolding)
Changes
Session model (session.py)

Adds two new fields to TaskflowSession:
- memcache_snapshot (dict[str, Any]): full memcache state at failure time
- tool_log_snapshot (list[dict[str, Any]]): auto-save tool log entries at failure time

Both default to empty (backward-compatible with existing session JSON files).
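As a rough illustration of why empty defaults keep old session files loadable, here is a minimal sketch. `SessionSketch` and `load_session` are hypothetical stand-ins, not the actual TaskflowSession model, whose base class and other fields are not shown in this PR.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SessionSketch:
    """Hypothetical stand-in for TaskflowSession; only the relevant fields."""

    error: str = ""
    # New fields default to empty, so older JSON without them still loads.
    memcache_snapshot: dict[str, Any] = field(default_factory=dict)
    tool_log_snapshot: list[dict[str, Any]] = field(default_factory=list)


def load_session(data: dict[str, Any]) -> SessionSketch:
    """Build a session from decoded JSON, tolerating missing snapshot keys."""
    return SessionSketch(
        error=data.get("error", ""),
        memcache_snapshot=data.get("memcache_snapshot", {}),
        tool_log_snapshot=data.get("tool_log_snapshot", []),
    )
```

A session dict written before this change, such as `{"error": "boom"}`, loads with both snapshot fields empty.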
mark_failed() accepts optional memcache_snapshot and tool_log_snapshot parameters.

Backend snapshot methods

Adds snapshot_state() to both memcache backends:
- SqliteBackend: Queries all distinct keys, returns merged values. For keys prefixed with _log:, uses get_log() when available (PR #4 adds it) or falls back to get_state().
- MemcacheDictionaryFileBackend: Inflates from disk, returns a copy.deepcopy() of the in-memory dict. Deep copy prevents callers from accidentally mutating backend state through nested references.

Runner wiring (runner.py)

Both mark_failed call sites (retry exhaustion and must_complete failure) now capture:
- _snapshot_memcache_state(): instantiates the appropriate backend and calls snapshot_state()
- read_tool_log(_auto_save_dir): reads the NDJSON auto-save log

Tests

- TestSessionForensics (3 tests): round-trip with snapshots, backward-compatible without snapshots, old session JSON without new fields loads
- TestSnapshotStateSqlite (2 tests): all keys returned, empty DB
- TestSnapshotStateDictFile (2 tests): deep copy verified (nested mutation doesn't affect backend), empty state
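The deep-copy guarantee that the dictionary-file backend tests check can be sketched as follows. `DictBackendSketch` and its `set_state` method are hypothetical, in-memory stand-ins; the real MemcacheDictionaryFileBackend also inflates its state from disk, which is omitted here.

```python
import copy
from typing import Any


class DictBackendSketch:
    """Hypothetical in-memory stand-in for MemcacheDictionaryFileBackend."""

    def __init__(self) -> None:
        self._state: dict[str, Any] = {}

    def set_state(self, key: str, value: Any) -> None:
        self._state[key] = value

    def snapshot_state(self) -> dict[str, Any]:
        # Deep copy so nested lists and dicts in the snapshot are independent
        # of the backend's own state.
        return copy.deepcopy(self._state)


backend = DictBackendSketch()
backend.set_state("findings", {"paths": ["a.py"]})

snapshot = backend.snapshot_state()
snapshot["findings"]["paths"].append("b.py")  # mutates the copy only
```

A shallow copy (`dict(self._state)`) would share the nested `paths` list with the backend, so the append above would leak into backend state; that is the failure mode the nested-mutation test guards against.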