Commit c4daefe
Regenerate FROZEN_EVAL_SCORED.jsonl with shipped V43
The score and tier fields were precomputed from an older model in the
v0.0.1 release. Re-scored every row with the currently shipped ONNX
(models/l5e/model.int8.onnx) so scripts/bench_oss.py compares the
correct numbers.
On the same eval slice (791 attacks / 132 benigns):
recall@0.95 83.94% (was 33.63% with stale scores)
FPR@0.95 10.61% (was 25.00%)
1 parent d1bb1fc commit c4daefe
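For reference, a minimal sketch of how recall and FPR at the 0.95 threshold could be recomputed from the regenerated JSONL. The `label` field name, its `"attack"` value, and the rule "flag when score >= threshold" are assumptions for illustration; the commit only confirms that rows carry score and tier fields, and this is not necessarily how scripts/bench_oss.py computes them.

```python
import json

THRESHOLD = 0.95  # operating point quoted in the commit message


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def recall_and_fpr(rows, threshold=THRESHOLD):
    """Return (recall, fpr) for rows with hypothetical keys 'label' and 'score'.

    Assumption: a row is flagged as an attack when score >= threshold,
    and any label other than "attack" counts as benign.
    """
    tp = fn = fp = tn = 0
    for row in rows:
        flagged = row["score"] >= threshold
        if row["label"] == "attack":
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr
```

Usage would look like `recall_and_fpr(load_jsonl("FROZEN_EVAL_SCORED.jsonl"))`. As a sanity check on the arithmetic, the reported percentages are consistent with 664 of 791 attacks and 14 of 132 benigns being flagged; those counts are inferred from the percentages, not stated in the commit.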
1 file changed
Lines changed: 923 additions & 923 deletions