6 files changed, +515 -18 lines changed
@@ -133,13 +133,17 @@ expect:
       contains: missing auth check
       severity: error
       category: security
-      rule_id: sec.auth.guard
+      rule_id: sec.auth.guard # label for per-rule precision/recall
+      require_rule_id: false # set true to require model-emitted RULE id
   must_not_find:
     - contains: style
   min_total: 1
   max_total: 8
 ```
 
+`diffscope eval` now reports per-rule precision/recall/F1 (micro and macro), and includes top rule-level TP/FP/FN counts in CLI and JSON output.
+Starter fixtures live in `eval/fixtures/repo_regressions`.
+
 ### Smart Review (Enhanced Analysis)
 ```bash
 # Get professional-grade analysis with confidence scoring

+# Eval Fixtures
+
+Starter fixture set for `diffscope eval`.
+
+- `repo_regressions/` contains regression-style diffs based on realistic mistakes in this codebase.
+- Each fixture can include `rule_id` as a label for rule-level precision/recall metrics.
+- Set `require_rule_id: true` on a pattern if the rule id must be emitted by the model for a match.
+
+Run:
+
+```bash
+diffscope eval --fixtures eval/fixtures --output eval-report.json
+```
+
+Notes:
+- Fixtures call the configured model and API provider; they are not deterministic unit tests.
+- Treat this set as a baseline and tighten `must_find`/`must_not_find` thresholds over time.
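
For readers unfamiliar with the micro/macro distinction in the rule-level metrics above, here is a minimal Python sketch of how such scores could be derived from per-rule TP/FP/FN counts. It is not taken from the diffscope source: `RuleCounts`, `prf`, `micro_average`, and `macro_average` are hypothetical names, the counts are made up, and the choice to report 0.0 when a denominator is zero is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RuleCounts:
    """Hypothetical per-rule tally; not diffscope's actual data structure."""
    tp: int = 0  # expected findings that were matched
    fp: int = 0  # reported findings that matched no expectation
    fn: int = 0  # expected findings that were missed


def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1); assumes 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def micro_average(counts: dict[str, RuleCounts]) -> tuple[float, float, float]:
    """Pool TP/FP/FN across all rules, then score once."""
    tp = sum(c.tp for c in counts.values())
    fp = sum(c.fp for c in counts.values())
    fn = sum(c.fn for c in counts.values())
    return prf(tp, fp, fn)


def macro_average(counts: dict[str, RuleCounts]) -> tuple[float, float, float]:
    """Score each rule separately, then take the unweighted mean of each metric."""
    if not counts:
        return (0.0, 0.0, 0.0)
    scores = [prf(c.tp, c.fp, c.fn) for c in counts.values()]
    n = len(scores)
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )


if __name__ == "__main__":
    # Made-up counts, keyed by rule ids that appear in the fixtures.
    counts = {
        "sec.shell.injection": RuleCounts(tp=3, fp=1, fn=0),
        "reliability.unwrap_panic": RuleCounts(tp=1, fp=0, fn=1),
    }
    print("micro:", micro_average(counts))
    print("macro:", macro_average(counts))
```

Micro averaging weights every finding equally, so rules with many findings dominate. Macro averaging weights every rule equally, which makes weak performance on a rarely triggered rule more visible.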

+name: repo regression - missing RawComment rule_id initializer
+repo_path: ../../..
+diff_file: ./raw_comment_missing_field.patch
+expect:
+  must_find:
+    - file: src/main.rs
+      contains: rule_id
+      rule_id: compile.rawcomment.rule_id
+  min_total: 1
+  max_total: 8

+name: repo regression - shell injection in debug helper
+repo_path: ../../..
+diff_file: ./shell_injection.patch
+expect:
+  must_find:
+    - file: src/main.rs
+      contains: injection
+      rule_id: sec.shell.injection
+  must_not_find:
+    - contains: style
+  min_total: 1
+  max_total: 12

+name: repo regression - unwrap panic in confidence parser
+repo_path: ../../..
+diff_file: ./smart_confidence_unwrap.patch
+expect:
+  must_find:
+    - file: src/main.rs
+      contains: unwrap
+      rule_id: reliability.unwrap_panic
+  must_not_find:
+    - contains: style
+  min_total: 1
+  max_total: 10
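
To make the fixture keys concrete, the following Python sketch shows one way an `expect` block like those above could be checked against a list of review findings. It is an illustration under stated assumptions, not diffscope's implementation: `Finding`, `Pattern`, `matches`, and `check` are hypothetical names, `contains` is assumed to be a case-insensitive substring match, and `rule_id` is assumed to affect matching only when `require_rule_id` is true.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """Hypothetical shape of one review comment produced by the model."""
    file: str
    content: str
    rule_id: Optional[str] = None


@dataclass
class Pattern:
    """Mirrors the fixture keys: file, contains, rule_id, require_rule_id."""
    contains: str
    file: Optional[str] = None
    rule_id: Optional[str] = None
    require_rule_id: bool = False


def matches(pattern: Pattern, finding: Finding) -> bool:
    """Return True if the finding satisfies the pattern (assumed semantics)."""
    if pattern.file is not None and pattern.file != finding.file:
        return False
    # Assumption: substring match, ignoring case.
    if pattern.contains.lower() not in finding.content.lower():
        return False
    # Assumption: rule_id is only a metrics label unless require_rule_id is set.
    if pattern.require_rule_id and pattern.rule_id != finding.rule_id:
        return False
    return True


def check(
    findings: list[Finding],
    must_find: list[Pattern],
    must_not_find: list[Pattern],
    min_total: int,
    max_total: int,
) -> list[str]:
    """Return a list of failure messages; an empty list means the fixture passed."""
    failures = []
    for p in must_find:
        if not any(matches(p, f) for f in findings):
            failures.append(f"missing expected finding: {p.contains}")
    for p in must_not_find:
        if any(matches(p, f) for f in findings):
            failures.append(f"unexpected finding: {p.contains}")
    if not min_total <= len(findings) <= max_total:
        failures.append(
            f"total findings {len(findings)} outside [{min_total}, {max_total}]"
        )
    return failures
```

Under these assumptions, a finding in `src/main.rs` that mentions "injection" but carries no rule id would satisfy the shell-injection fixture as written, and would stop matching if that pattern set `require_rule_id: true`.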