feat: add auto-save scaffolding for periodic tool-result logging #216

Closed
sheeki03 wants to merge 5 commits into GitHubSecurityLab:main from sheeki03:feat/runner-tool-autosave

Conversation

@sheeki03

Problem

When a long-running taskflow crashes or is interrupted, all intermediate tool results are lost. There is no way to inspect what the agent found before the failure. This makes post-mortem debugging and forensic recovery difficult, especially for security audit workflows that may run for 45+ minutes.

Changes

Auto-save infrastructure (runner.py)

Adds opt-in periodic tool-result logging via two environment variables:

  • AUTO_SAVE_DIR: Directory where logs are written (empty = disabled)
  • AUTO_SAVE_INTERVAL: Write every N tool calls (0 = disabled)

When both are set, every Nth tool completion appends an NDJSON entry to {AUTO_SAVE_DIR}/auto_save_tool_log.ndjson containing:

  • turn: tool call counter
  • tool: tool name
  • result_preview: first 2000 chars of the result
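
A minimal sketch of how such an append could look, for readers who have not opened the diff. The signature `write_auto_save(auto_save_dir, turn, tool_name, result)`, the boolean return value, and the constant `MAX_PREVIEW_CHARS` are assumptions made for illustration; only the file name, the three keys, the 2000-character limit, and the suppressed `OSError` come from this description.

```python
import json
import os

AUTO_SAVE_LOG_NAME = "auto_save_tool_log.ndjson"
MAX_PREVIEW_CHARS = 2000  # assumed constant name; the limit itself is from the PR text


def write_auto_save(auto_save_dir, turn, tool_name, result):
    """Append one NDJSON entry. Hypothetical signature, not copied from the diff."""
    if not auto_save_dir:
        return False  # an empty directory means auto-save is disabled
    entry = {
        "turn": turn,
        "tool": tool_name,
        "result_preview": str(result)[:MAX_PREVIEW_CHARS],
    }
    try:
        os.makedirs(auto_save_dir, exist_ok=True)
        save_path = os.path.join(auto_save_dir, AUTO_SAVE_LOG_NAME)
        # Mode "a" appends; json.dumps escapes newlines, so each entry stays on one line.
        with open(save_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
        return True
    except OSError:
        return False  # write failures are suppressed so the taskflow keeps running
```

Because `json.dumps` escapes embedded newlines, a multi-line tool result still occupies exactly one line of the log.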

Design decisions:

  • NDJSON format: Each entry is a single line, and the file is append-only: no read-modify-write, and safe under concurrent tool completions. If the process dies mid-write, only the last line is truncated; all prior lines remain valid JSON.
  • Module-level functions: write_auto_save() and read_tool_log() are extracted as importable, testable functions with explicit parameters. The on_tool_end_hook closure delegates to them.
  • Zero overhead when disabled: AUTO_SAVE_INTERVAL=0 (default) short-circuits before the modulo check. Invalid (non-numeric) interval values are caught and fall back to 0 with a warning.
  • Explicit UTF-8 encoding: All file I/O uses encoding="utf-8" for cross-platform consistency.
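
The interval handling described above might be sketched as follows. The helper names `parse_auto_save_interval` and `should_auto_save` are hypothetical, and clamping negative values to 0 is an added assumption; the PR text only states that non-numeric values fall back to 0 with a warning and that an interval of 0 short-circuits before the modulo check.

```python
import logging


def parse_auto_save_interval(raw):
    """Parse AUTO_SAVE_INTERVAL; return 0 (disabled) on missing or invalid input."""
    if not raw:
        return 0
    try:
        interval = int(raw)
    except ValueError:
        logging.warning("Invalid AUTO_SAVE_INTERVAL %r; auto-save disabled", raw)
        return 0
    return max(interval, 0)  # assumption: negative values are treated as disabled


def should_auto_save(tool_call_counter, interval):
    """True on every Nth tool call; `and` short-circuits before the modulo when interval is 0."""
    return interval > 0 and tool_call_counter % interval == 0
```

In a hook, the value would typically be read once at startup, for example `parse_auto_save_interval(os.environ.get("AUTO_SAVE_INTERVAL"))`, so the per-call cost when disabled is a single integer comparison.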

README documentation

Documents AUTO_SAVE_DIR and AUTO_SAVE_INTERVAL in the Session Recovery section.

Tests (test_runner.py, 7 new tests)

All tests call the real write_auto_save and read_tool_log functions:

  • Write-then-read roundtrip
  • NDJSON entry format validation (turn, tool, result_preview keys)
  • Result truncation to 2000 chars
  • Write failure suppression (mocked OSError)
  • Corrupt trailing line recovery
  • Empty directory returns []
  • Disabled when dir is empty string
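
To illustrate the corrupt-trailing-line case, here is a sketch of a tolerant reader and one test in the style the list describes. The body of `read_tool_log` and the test name are assumptions, not the code under review; the sketch skips any line that fails to parse, which covers a truncated final line.

```python
import json
import os
import tempfile

AUTO_SAVE_LOG_NAME = "auto_save_tool_log.ndjson"


def read_tool_log(auto_save_dir):
    """Return all parseable entries; hypothetical implementation for illustration."""
    if not auto_save_dir:
        return []
    path = os.path.join(auto_save_dir, AUTO_SAVE_LOG_NAME)
    entries = []
    try:
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    entries.append(json.loads(line))
                except json.JSONDecodeError:
                    continue  # skip a line truncated by a crash mid-write
    except OSError:
        return []  # a missing or unreadable log yields an empty list
    return entries


def test_corrupt_trailing_line_is_skipped():
    good = {"turn": 1, "tool": "grep", "result_preview": "ok"}
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, AUTO_SAVE_LOG_NAME), "w", encoding="utf-8") as f:
            f.write(json.dumps(good) + "\n")
            f.write('{"turn": 2, "tool": "gr')  # simulated truncated write
        assert read_tool_log(tmp) == [good]
```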

Adds _tool_call_counter, _auto_save_interval, _auto_save_dir,
_write_auto_save, and _read_tool_log to run_main. When AUTO_SAVE_DIR
and AUTO_SAVE_INTERVAL are set, tool results are periodically appended
to an NDJSON log file. Disabled by default (interval=0).

Moves write_auto_save() and read_tool_log() from closures inside
run_main() to module-level functions with explicit parameters. Tests
now exercise the real implementation instead of duplicating the logic.

- Add encoding="utf-8" to open() in write_auto_save and read_tool_log
- Catch ValueError on non-numeric AUTO_SAVE_INTERVAL with fallback to 0
- Soften docstring from "crash-safe" to "append-only"

Comment on lines +69 to +70
os.makedirs(auto_save_dir, exist_ok=True)
save_path = os.path.join(auto_save_dir, AUTO_SAVE_LOG_NAME)
Collaborator

We have utilities for generating files and directories for logging in path_utils.py.


2 participants