Commit 26da36f

Drop fmops from benchmarks: predicts positive for everything
1 parent ec90d21 commit 26da36f

3 files changed

Lines changed: 4 additions & 15 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -86,7 +86,7 @@ own products.
 - **From-scratch.** No teacher weights from any vendor classifier are
 redistributed.
 - **Benchmarked against public datasets** for direct comparison with OSS
-baselines (ProtectAI v2, deepset, fmops, Meta Prompt-Guard-2). Held-out
+baselines (ProtectAI v2, deepset, Meta Prompt-Guard, Meta Prompt-Guard-2). Held-out
 evaluation; false positives reported alongside recall.
 - **MIT-licensed weights.** Use in production, paid or free.
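The README line changed here pairs false positives with recall, and the dropped fmops row shows why that pairing matters. Below is a minimal sketch, using invented toy labels rather than the benchmark data, of how a classifier that flags every input reaches 100% recall and 100% FPR at the same time. The helper `recall_and_fpr` is hypothetical and not part of the repository.

```python
def recall_and_fpr(y_true, y_pred):
    """Return (recall, false-positive rate) for binary labels, 1 = injection."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr


# Toy labels, invented for illustration: 3 attacks, 4 benign prompts.
y_true = [1, 1, 1, 0, 0, 0, 0]

# A model that predicts the positive class for every input.
always_positive = [1] * len(y_true)

recall, fpr = recall_and_fpr(y_true, always_positive)
print(f"recall={recall:.2%} fpr={fpr:.2%}")
```

Every attack is caught only because every benign prompt is blocked too, so the recall figure says nothing about skill.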

docs/BENCHMARKS.md

Lines changed: 3 additions & 8 deletions
@@ -67,14 +67,12 @@ bottom-tier, blank = mid.
 | **promptpurify** | **83.94% ↑** | **10.61% ↑** | **87.10% ↑** | **12.88% ↑** |
 | ProtectAI v2 | 40.71% ↓ | 43.18% ↓ | 40.71% ↓ | 43.18% ↓ |
 | deepset | 97.22% ↑ | 59.85% ↓ | 97.22% ↑ | 59.85% ↓ |
-| fmops | 100.00% ↑ | 100.00% ↓ | 100.00% ↑ | 100.00% ↓ |
 | Meta Prompt-Guard | 67.00% | 88.64% ↓ | 67.00% | 88.64% ↓ |
 | Meta Prompt-Guard-2 | 12.77% ↓ | 1.52% ↑ | 12.77% ↓ | 1.52% ↑ |

-`promptpurify` is the only row with ↑ on every column. `fmops` "wins"
-recall by predicting positive for every input — its FPR ↓ shows it's
-mis-calibrated, not skilled. `Meta Prompt-Guard-2` flips the trade:
-nearly-zero FPR at the cost of catching ~1 in 8 attacks.
+`promptpurify` is the only row with ↑ on every column.
+`Meta Prompt-Guard-2` flips the trade: nearly-zero FPR at the cost of
+catching ~1 in 8 attacks.

 How to read this:

@@ -85,9 +83,6 @@ How to read this:
 on this slice. `deepset` reaches higher recall but at ~6x the FPR
 (60% of benigns blocked); for most production traffic that's worse,
 not better.
-- `fmops` predicts the positive class for every input on this slice.
-Treat the row as evidence the model is mis-calibrated for this
-distribution, not as a real recall claim.
 - `Meta Prompt-Guard` is a 3-class model; we score it as
 `P(INJECTION) + P(JAILBREAK)` (see `scripts/bench_oss.py`).
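The scoring rule quoted in this hunk, `P(INJECTION) + P(JAILBREAK)`, can be sketched as follows. This is an illustration, not the code in `scripts/bench_oss.py`: the helper names and the label order `BENIGN, INJECTION, JAILBREAK` are assumptions, and real code should read the label mapping from the model's own config instead of hard-coding it.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attack_score(logits, labels=("BENIGN", "INJECTION", "JAILBREAK")):
    """Collapse a 3-class output into one binary 'attack' probability.

    The label order is an assumption made for this sketch.
    """
    probs = dict(zip(labels, softmax(logits)))
    return probs["INJECTION"] + probs["JAILBREAK"]


# Hypothetical logits: the two attack classes split the probability mass.
score = attack_score([0.0, 1.0, 1.0])
flagged = score >= 0.5  # 0.5 mirrors default_threshold in the diff
print(round(score, 3), flagged)
```

Summing the two attack classes keeps the comparison with the binary baselines fair: a prompt counts as caught whichever attack label the model prefers.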

scripts/bench_oss.py

Lines changed: 0 additions & 6 deletions
@@ -69,12 +69,6 @@ class ModelSpec:
         injection_label="INJECTION",
         default_threshold=0.5,
     ),
-    ModelSpec(
-        name="fmops",
-        hf_id="fmops/distilbert-prompt-injection",
-        injection_label="INJECTION",
-        default_threshold=0.5,
-    ),
     ModelSpec(
         name="Meta Prompt-Guard",
         hf_id="meta-llama/Prompt-Guard-86M",
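For context, here is a rough sketch of what a `ModelSpec` entry and an automated check for the failure that motivated this commit might look like. Only the four field names come from the diff. The dataclass definition, the `is_degenerate` helper and the sample scores are hypothetical, and the real class in `scripts/bench_oss.py` may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    # Field names mirror the keyword arguments visible in the diff;
    # the types and the default value are assumptions.
    name: str
    hf_id: str
    injection_label: str
    default_threshold: float = 0.5


def is_degenerate(scores, threshold):
    """True when every score falls on the same side of the threshold,
    i.e. the model predicts a single class for the whole slice."""
    flags = [s >= threshold for s in scores]
    return bool(flags) and (all(flags) or not any(flags))


spec = ModelSpec(
    name="fmops",
    hf_id="fmops/distilbert-prompt-injection",
    injection_label="INJECTION",
)

# Invented scores standing in for a model that flags everything.
sample_scores = [0.97, 0.99, 0.91, 0.88]
print(spec.name, is_degenerate(sample_scores, spec.default_threshold))
```

A check like this could flag a constant-output baseline before its row reaches the results table, instead of requiring a manual removal afterwards.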
