Commit b979eed
MFU: use padded_vocab_size for mfu_padded_pct LM-head FLOPs
For configs with padded_vocab_size set (ESM-2: 33→64 for FP8/tensor-core
friendliness), the LM-head matmul physically runs at padded width and the
logits are sliced back afterward. Count the padded width in the hardware-view
metric (mfu_padded_pct, tflops_per_gpu_padded) while continuing to count raw
vocab_size in the useful-work metric (mfu_pct, tflops_per_gpu). For configs
without padded_vocab_size (llama3, og2, codonfm) the two values collapse and
nothing changes.
Addresses review feedback from @trvachov on PR #1548.
Signed-off-by: Gagan Kaushik <gkaushik@nvidia.com>
1 parent 423eab7 commit b979eed
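The two views described in the commit message can be sketched as follows. This is a minimal illustration, not the code from this commit: the `ModelConfig` class, the function names, and the `6 * tokens * hidden * vocab` training-FLOPs estimate (forward matmul plus a backward pass costed at twice the forward) are assumptions made for the example, and the hidden sizes and token count are illustrative values only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ModelConfig:
    """Hypothetical stand-in for a model config; only the fields used here."""

    hidden_size: int
    vocab_size: int
    padded_vocab_size: Optional[int] = None


def lm_head_flops(num_tokens: int, hidden_size: int, vocab_width: int) -> int:
    """Estimated training FLOPs for the LM-head matmul.

    Assumption: 2 * tokens * hidden * vocab for the forward pass and twice
    that for the backward pass, giving 6 * tokens * hidden * vocab in total.
    """
    return 6 * num_tokens * hidden_size * vocab_width


def lm_head_flops_views(config: ModelConfig, num_tokens: int) -> Tuple[int, int]:
    """Return (useful, padded) LM-head FLOPs.

    useful: counts the raw vocab_size (useful-work view, e.g. mfu_pct).
    padded: counts padded_vocab_size when set (hardware view, e.g.
            mfu_padded_pct); otherwise falls back to vocab_size so the
            two values collapse.
    """
    useful = lm_head_flops(num_tokens, config.hidden_size, config.vocab_size)
    padded_width = config.padded_vocab_size or config.vocab_size
    padded = lm_head_flops(num_tokens, config.hidden_size, padded_width)
    return useful, padded


# ESM-2 style config: vocab padded from 33 to 64 (hidden size is illustrative).
esm2_like = ModelConfig(hidden_size=1280, vocab_size=33, padded_vocab_size=64)
esm2_useful, esm2_padded = lm_head_flops_views(esm2_like, num_tokens=1024)

# Config without padded_vocab_size: both views are identical.
unpadded = ModelConfig(hidden_size=4096, vocab_size=128256)
unpadded_useful, unpadded_padded = lm_head_flops_views(unpadded, num_tokens=1024)

print(esm2_useful, esm2_padded)
print(unpadded_useful, unpadded_padded)
```

With padding from 33 to 64, the hardware-view LM-head count is 64/33 times the useful-work count. Any gap between the two MFU figures comes from this term only, since the sketch leaves every other FLOPs term untouched.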
4 files changed
Lines changed: 40 additions & 8 deletions
File tree
- bionemo-recipes/recipes
- codonfm_native_te
- esm2_native_te
- llama3_native_te
- opengenome2_llama_native_te
| Hunk | Original lines | New lines | Change |
|---|---|---|---|
| 1 | 66-72 | 66-72 | 1 line replaced (line 69) |
| 2 | 81-86 | 81-90 | 4 lines added (new 84-87) |
| 3 | 192-202 | 196-210 | 4 lines added (new 199, 205-207) |
| 4 | 348-354 | 356-362 | 1 line replaced (old 351, new 359) |
| Hunk | Original lines | New lines | Change |
|---|---|---|---|
| 1 | 65-71 | 65-71 | 1 line replaced (line 68) |
| 2 | 80-85 | 80-89 | 4 lines added (new 83-86) |
| 3 | 195-205 | 199-213 | 4 lines added (new 202, 208-210) |
| 4 | 357-363 | 365-371 | 1 line replaced (old 360, new 368) |
| Hunk | Original lines | New lines | Change |
|---|---|---|---|
| 1 | 63-69 | 63-69 | 1 line replaced (line 66) |
| 2 | 78-83 | 78-87 | 4 lines added (new 81-84) |
| 3 | 201-211 | 205-219 | 4 lines added (new 208, 214-216) |
| 4 | 384-390 | 392-398 | 1 line replaced (old 387, new 395) |
Lines changed: 10 additions & 2 deletions
| Hunk | Original lines | New lines | Change |
|---|---|---|---|
| 1 | 71-77 | 71-77 | 1 line replaced (line 74) |
| 2 | 86-91 | 86-95 | 4 lines added (new 89-92) |
| 3 | 197-207 | 201-215 | 4 lines added (new 204, 210-212) |
| 4 | 373-379 | 381-387 | 1 line replaced (old 376, new 384) |