Commit f5f6da6

refactor(gatekeeper): async eval runner with structured classes and stats tracking

- Make check_run_script async (acompletion) and add check_run_script_with_stats returning GatekeeperStats (tokens, cost, latency) alongside results
- Introduce EvalSuite and FileEval classes in run-eval.py, replacing global variables and eliminating duplicate single-file/multi-file code paths
- Load all test cases up front so progress output shows global numbering (e.g. [3/121]) as individual evals complete
- Add StatsAggregator dataclass for cleaner inference statistics reporting
- Add GatekeeperException with optional stats for timeout, output limit, and parse failures
- Add gatekeeper.cost config option for custom per-token cost accounting
- Switch summary/stats tables from plain text to rich.Table

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
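The diff itself is not reproduced here, so the following is only a rough sketch of how the pieces named in the message could fit together, not the code from this commit. The field names on GatekeeperStats and StatsAggregator, the fake_check stand-in (used in place of the real check_run_script_with_stats so that no model is called), and the run_all helper are all assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class GatekeeperStats:
    # Assumed fields: the commit message only names tokens, cost, and latency.
    input_tokens: int = 0
    output_tokens: int = 0
    cost: float = 0.0
    latency_s: float = 0.0


class GatekeeperException(Exception):
    """Error that may carry the stats gathered before the failure."""

    def __init__(self, message, stats=None):
        super().__init__(message)
        self.stats = stats


@dataclass
class StatsAggregator:
    # Assumed layout: running totals over every recorded GatekeeperStats.
    count: int = 0
    total_tokens: int = 0
    total_cost: float = 0.0
    total_latency_s: float = 0.0

    def add(self, stats):
        self.count += 1
        self.total_tokens += stats.input_tokens + stats.output_tokens
        self.total_cost += stats.cost
        self.total_latency_s += stats.latency_s

    @property
    def mean_latency_s(self):
        return self.total_latency_s / self.count if self.count else 0.0


async def fake_check(case):
    """Hypothetical stand-in for check_run_script_with_stats."""
    await asyncio.sleep(0)
    if case == "bad":
        partial = GatekeeperStats(10, 0, 0.001, 0.5)
        raise GatekeeperException("parse failure", partial)
    return "OK", GatekeeperStats(100, 20, 0.01, 1.0)


async def run_all(cases):
    """Run every case concurrently and number results as they complete."""
    agg = StatsAggregator()
    lines = []
    total = len(cases)  # known up front, so numbering is global
    tasks = [asyncio.create_task(fake_check(c)) for c in cases]
    for done, fut in enumerate(asyncio.as_completed(tasks), start=1):
        try:
            result, stats = await fut
        except GatekeeperException as exc:
            result, stats = f"ERROR: {exc}", exc.stats
        if stats is not None:
            agg.add(stats)  # failed runs still count toward totals
        lines.append(f"[{done}/{total}] {result}")
    return agg, lines


if __name__ == "__main__":
    aggregate, progress = asyncio.run(run_all(["a", "bad", "b"]))
    print("\n".join(progress))
    print(aggregate)
```

Attaching stats to the exception is what lets a timed-out or unparseable run still contribute its token and cost figures to the aggregate, which appears to be the motivation for the optional stats argument.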

1 parent 9ffa5e3, commit f5f6da6

6 files changed

eval/gatekeeper
- run-eval.py
src/linux_mcp_server
- config.py
- gatekeeper
  - check_run_script.py
- tools
  - run_script.py
tests
- gatekeeper
  - test_check_run_script.py
- test_config.py
