Commit 984d50c
feat(burn): wire ndarray hpc::vml SIMD into float_exp/log/sqrt/abs
First augmentation of the burn backend with our crate::simd F32x16 path.
For contiguous f32 tensors, these operations now route through
ndarray::hpc::vml which uses crate::simd::F32x16 (AVX-512/AVX2 via
LazyLock dispatch). Non-f32 or non-contiguous tensors fall through
to the original scalar mapv_into path.
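
The dispatch described above can be sketched roughly as follows. This is not the code from `ndarray::hpc::vml` or `crate::simd`; the names `UnaryKernel`, `sqrt_scalar`, `sqrt_chunked16`, `select_kernel`, and `vssqrt_sketch` are hypothetical, and the "wide" kernel is a plain 16-lane chunked loop standing in for real F32x16 intrinsics. It only illustrates how a `LazyLock` can pick a kernel once, on first use, based on runtime CPU feature detection.

```rust
use std::sync::LazyLock;

/// Signature shared by every kernel variant (hypothetical, not the real vml API).
type UnaryKernel = fn(&[f32], &mut [f32]);

/// Portable scalar fallback.
fn sqrt_scalar(src: &[f32], dst: &mut [f32]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d = s.sqrt();
    }
}

/// Stand-in for a wide-vector kernel: 16 lanes per step, then the scalar tail.
fn sqrt_chunked16(src: &[f32], dst: &mut [f32]) {
    assert_eq!(src.len(), dst.len());
    let mut s_chunks = src.chunks_exact(16);
    let mut d_chunks = dst.chunks_exact_mut(16);
    for (s, d) in (&mut s_chunks).zip(&mut d_chunks) {
        for i in 0..16 {
            d[i] = s[i].sqrt();
        }
    }
    // Handle the elements that did not fill a whole 16-lane chunk.
    for (d, s) in d_chunks.into_remainder().iter_mut().zip(s_chunks.remainder()) {
        *d = s.sqrt();
    }
}

/// Chooses a kernel from the CPU features available at runtime.
fn select_kernel() -> UnaryKernel {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx512f") || is_x86_feature_detected!("avx2") {
            return sqrt_chunked16;
        }
    }
    sqrt_scalar
}

/// Initialized on first access; later calls reuse the stored function pointer.
static SQRT_KERNEL: LazyLock<UnaryKernel> = LazyLock::new(select_kernel);

pub fn vssqrt_sketch(src: &[f32]) -> Vec<f32> {
    let mut out = vec![0.0; src.len()];
    (*SQRT_KERNEL)(src, &mut out);
    out
}
```

Storing a function pointer keeps feature detection off the per-call path: detection runs once, and every later call is a single indirect call.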
float_exp → ndarray::hpc::vml::vsexp (F32x16 polynomial approx)
float_log → ndarray::hpc::vml::vsln (F32x16 polynomial approx)
float_sqrt → ndarray::hpc::vml::vssqrt (F32x16 hardware sqrt)
float_abs → ndarray::hpc::vml::vsabs (F32x16 bitmask)
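
To illustrate the "bitmask" approach listed for float_abs, here is a minimal scalar sketch. It assumes the vector path clears the IEEE 754 sign bit of every lane; `abs_bitmask` is a hypothetical name, and the real `vsabs` would apply the same mask across 16 lanes at a time rather than element by element.

```rust
/// Clears the IEEE 754 sign bit of each element (scalar stand-in for the vector mask).
pub fn abs_bitmask(src: &[f32]) -> Vec<f32> {
    // All bits set except the top (sign) bit.
    const SIGN_CLEAR: u32 = 0x7FFF_FFFF;
    src.iter()
        .map(|x| f32::from_bits(x.to_bits() & SIGN_CLEAR))
        .collect()
}
```

Because the mask touches only the sign bit, the operation is branch-free and maps `-0.0` to `+0.0`.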
try_vml_unary() helper:
- Checks tensor is F32 variant + contiguous layout
- Extracts &[f32] slice (zero-copy read)
- Calls VML function → Vec<f32> output
- Wraps into NdArrayTensor::F32(Owned)
- Falls through to scalar on non-f32/non-contiguous
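
The helper steps above might look roughly like the sketch below. It does not use the real burn or ndarray types: `Tensor` is a hypothetical stand-in for `NdArrayTensor`, the `contiguous` flag stands in for a layout check, and `exp_kernel` stands in for a VML function whose exact signature the commit message does not show (it only says the call yields a `Vec<f32>`).

```rust
/// Hypothetical stand-in for the backend's tensor enum.
pub enum Tensor {
    F32 { data: Vec<f32>, contiguous: bool },
    F64(Vec<f64>),
}

/// Fast path: Ok(new tensor) for contiguous f32 input,
/// Err(original tensor) so the caller can fall back to the scalar path.
pub fn try_vml_unary(t: Tensor, kernel: fn(&[f32]) -> Vec<f32>) -> Result<Tensor, Tensor> {
    match t {
        Tensor::F32 { ref data, contiguous: true } => {
            // Borrow the buffer as a slice (no copy on the read side).
            let out = kernel(data.as_slice());
            Ok(Tensor::F32 { data: out, contiguous: true })
        }
        other => Err(other),
    }
}

/// Placeholder for a vectorized kernel such as vsexp.
fn exp_kernel(src: &[f32]) -> Vec<f32> {
    src.iter().map(|x| x.exp()).collect()
}

/// Caller: try the fast path first, otherwise map element by element.
pub fn float_exp(t: Tensor) -> Tensor {
    match try_vml_unary(t, exp_kernel) {
        Ok(fast) => fast,
        Err(Tensor::F32 { data, contiguous }) => Tensor::F32 {
            data: data.into_iter().map(f32::exp).collect(),
            contiguous,
        },
        Err(Tensor::F64(data)) => Tensor::F64(data.into_iter().map(f64::exp).collect()),
    }
}
```

Returning the original tensor in the `Err` case lets the fallback reuse it without cloning, which is one way to keep the scalar path unchanged.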
30 tests passing. Zero regressions.
https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o71
Parent: 129a959
1 file changed
Lines changed: 50 additions & 0 deletions
Added lines (new-file numbering): 35-62, 477-483, 490-494, 542-546, 553-557.