Commit 09b2124
scripts(dflash): switch default bench target to Q8_0 + --target flag (#65)
Per Markus 2026-06-04: DFlash quality measurement should use a Q8_0
target rather than Q4_K_M, since Q4_K_M introduces enough target-side
quantization noise to confound DFlash's own accept-rate signal. Q8_0
fits in 38 GB total, well within titan A100 80 GB.
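The weight-only part of that VRAM estimate can be sketched as below. This is an illustration under stated assumptions, not the script's own comment block: the parameter count is read off the "31B" in the model name, the Q4_K_M bits-per-weight figure is a rough average, and `weight_gb` is a hypothetical helper. The 38 GB total quoted above presumably includes more than the target weights, which this sketch does not model.

```python
# Approximate bits per weight. Q8_0 stores 32 int8 weights plus one fp16
# scale per block (34 bytes / 32 weights = 8.5 bits); BF16 is 16 bits.
# The Q4_K_M value is an assumed rough average, since it mixes block types.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5, "BF16": 16.0}

N_PARAMS = 31e9  # assumption: taken from the "31B" in the model name


def weight_gb(n_params: float, fmt: str) -> float:
    """Return the approximate weight footprint in decimal GB."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


for fmt in BITS_PER_WEIGHT:
    print(f"{fmt}: ~{weight_gb(N_PARAMS, fmt):.1f} GB of weights")
```

Under these assumptions the Q8_0 weights alone come to roughly 33 GB, which leaves headroom on an 80 GB card for whatever else the benchmark loads.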
* Default `TARGET` is now `gemma-4-31B-it-Q8_0.gguf`. Override via
`--target PATH` or `DFLASH_BENCH_TARGET` env var.
* Also added `DFLASH_BENCH_DRAFTER_DIR` env var for consistency.
* Comment block documents VRAM math for Q4_K_M / Q8_0 / BF16 targets
so future runs can pick the right card.

1 parent b0daec5 commit 09b2124
1 file changed
Lines changed: 14 additions & 5 deletions
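The override order described in the commit message could look like the sketch below. It assumes the flag wins over the environment variable, which in turn wins over the default; the commit message does not state that order, and `resolve_target` is a hypothetical helper rather than code from the script.

```python
import argparse

# Default named in the commit message.
DEFAULT_TARGET = "gemma-4-31B-it-Q8_0.gguf"


def resolve_target(argv, env):
    """Pick the bench target: --target, then DFLASH_BENCH_TARGET, then default.

    The precedence is an assumption made for this sketch.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--target", default=None)
    # parse_known_args ignores any other flags the real script may accept.
    args, _unknown = parser.parse_known_args(argv)
    if args.target:
        return args.target
    return env.get("DFLASH_BENCH_TARGET") or DEFAULT_TARGET


# Example with a made-up path: the flag overrides the environment variable.
print(resolve_target(["--target", "/models/other.gguf"],
                     {"DFLASH_BENCH_TARGET": "/models/env.gguf"}))
```

Passing the argument list and environment in explicitly keeps the function easy to test; a real entry point would pass `sys.argv[1:]` and `os.environ`.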
Diff body not preserved; only the hunk line numbers survive:

| Original file lines | New file lines | Change |
|---|---|---|
| 9-10 | 9-13 | 2 lines replaced by 5 |
| 13 | 16-21 | 1 line replaced by 6 |
| 28-29 | 36-37 | 2 lines replaced by 2 |
| (none) | 47 | 1 line added |