Commit f5f6da6
refactor(gatekeeper): async eval runner with structured classes and stats tracking
- Make check_run_script async (acompletion) and add check_run_script_with_stats
  returning GatekeeperStats (tokens, cost, latency) alongside results
- Introduce EvalSuite and FileEval classes in run-eval.py, replacing global
  variables and eliminating duplicate single-file/multi-file code paths
- Load all test cases up front so progress output shows global numbering
  (e.g. [3/121]) as individual evals complete
- Add StatsAggregator dataclass for cleaner inference statistics reporting
- Add GatekeeperException with optional stats for timeout, output limit,
  and parse failures
- Add gatekeeper.cost config option for custom per-token cost accounting
- Switch summary/stats tables from plain text to rich.Table
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 9ffa5e3 commit f5f6da6
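
The pattern described in the commit message (an async check that returns stats alongside its result, an exception that can still carry stats, an aggregator, and global progress numbering) might look roughly like the sketch below. This is not the code from the commit: only the class names GatekeeperStats, GatekeeperException and StatsAggregator come from the message. The field names, the fake_check stand-in for the model call, and run_all are assumptions made for illustration.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GatekeeperStats:
    """Per-call stats. Field names are assumed, not taken from the commit."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cost: float = 0.0
    latency: float = 0.0


class GatekeeperException(Exception):
    """Failure that may still carry the stats gathered before it occurred."""

    def __init__(self, message: str, stats: Optional[GatekeeperStats] = None):
        super().__init__(message)
        self.stats = stats


@dataclass
class StatsAggregator:
    """Running totals across all evals, including failed ones."""
    calls: int = 0
    total_tokens: int = 0
    total_cost: float = 0.0
    total_latency: float = 0.0

    def add(self, stats: GatekeeperStats) -> None:
        self.calls += 1
        self.total_tokens += stats.prompt_tokens + stats.completion_tokens
        self.total_cost += stats.cost
        self.total_latency += stats.latency


async def fake_check(script: str) -> Tuple[str, GatekeeperStats]:
    """Hypothetical stand-in for the real async model call."""
    start = time.perf_counter()
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    stats = GatekeeperStats(prompt_tokens=10, completion_tokens=5, cost=0.001)
    stats.latency = time.perf_counter() - start
    if "forever" in script:
        # Simulated timeout: the stats travel with the exception.
        raise GatekeeperException("timeout", stats=stats)
    return "ok", stats


async def run_all(scripts: List[str]) -> Tuple[List[str], StatsAggregator]:
    """Run every case concurrently and number results as they complete."""
    agg = StatsAggregator()
    total = len(scripts)  # known up front, so numbering is global
    done = 0
    lines: List[str] = []

    async def run_one(script: str) -> None:
        nonlocal done
        try:
            verdict, stats = await fake_check(script)
        except GatekeeperException as exc:
            verdict, stats = f"error: {exc}", exc.stats
        if stats is not None:
            agg.add(stats)
        done += 1
        lines.append(f"[{done}/{total}] {verdict}")

    await asyncio.gather(*(run_one(s) for s in scripts))
    return lines, agg
```

Calling asyncio.run(run_all(["ls", "sleep forever", "uname"])) returns one progress line per case and an aggregator that also counts the failed call, since the exception carried its stats.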
6 files changed
Lines changed: 517 additions & 218 deletions
File tree
- eval/gatekeeper
- src/linux_mcp_server
  - gatekeeper
  - tools
- tests
  - gatekeeper