Commit 8d3f6bc
feat(burn): fused SIMD sigmoid via hpc::activations::sigmoid_f32
Override ActivationOps::sigmoid with fused F32x16 SIMD path.
Default burn sigmoid: 6 separate ops (neg, exp, add, log, neg, exp)
Our sigmoid: one fused pass computing 1/(1+exp(-x)) via an F32x16 polynomial
For contiguous f32: use hpc::activations::sigmoid_f32 (F32x16 SIMD)
For non-f32 or non-contiguous: decomposed via Backend float ops
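
The dispatch rule in the two lines above can be sketched roughly as follows. This is a minimal illustration only: `DType`, `SigmoidPath`, and `choose_sigmoid_path` are hypothetical names introduced here, not burn's or hpc's actual types, and the real override presumably inspects the tensor itself rather than taking flags.

```rust
/// Hypothetical element-type tag (an assumption, not burn's real dtype enum).
#[allow(dead_code)]
enum DType {
    F32,
    F64,
}

/// Which implementation the sigmoid override would pick.
#[derive(Debug, PartialEq)]
enum SigmoidPath {
    /// Contiguous f32 data: the fused F32x16 SIMD kernel.
    FusedSimd,
    /// Anything else: fall back to the decomposed backend float ops.
    Decomposed,
}

/// Pick the fused path only when both conditions hold:
/// the data is f32 AND it is laid out contiguously.
fn choose_sigmoid_path(dtype: &DType, contiguous: bool) -> SigmoidPath {
    match (dtype, contiguous) {
        (DType::F32, true) => SigmoidPath::FusedSimd,
        _ => SigmoidPath::Decomposed,
    }
}
```

The fallback arm keeps non-f32 and strided inputs correct, since a SIMD kernel that walks a flat `&[f32]` slice cannot be applied to them directly.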
The fused path eliminates 5 intermediate tensor allocations and
does the full sigmoid in a single pass over the data.
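
To make the allocation count concrete, here is a scalar sketch contrasting the two strategies. It is an assumption-laden illustration: it uses plain `Vec<f32>` and `f32::exp` instead of tensors and the F32x16 polynomial, and the function names are hypothetical. The decomposed version follows the op order listed in the message (neg, exp, add, log, neg, exp), which amounts to exp(-ln(1 + exp(-x))).

```rust
/// Decomposed sigmoid: six elementwise passes, each producing a new buffer.
/// The five buffers `a`..`e` are the intermediates; the sixth pass
/// produces the output.
fn sigmoid_decomposed(x: &[f32]) -> Vec<f32> {
    let a: Vec<f32> = x.iter().map(|&v| -v).collect(); // neg
    let b: Vec<f32> = a.iter().map(|&v| v.exp()).collect(); // exp
    let c: Vec<f32> = b.iter().map(|&v| v + 1.0).collect(); // add
    let d: Vec<f32> = c.iter().map(|&v| v.ln()).collect(); // log
    let e: Vec<f32> = d.iter().map(|&v| -v).collect(); // neg
    e.iter().map(|&v| v.exp()).collect() // exp
}

/// Fused sigmoid: a single pass over the input and a single output buffer.
/// Scalar stand-in for the SIMD kernel; it computes 1 / (1 + exp(-x)).
fn sigmoid_fused(x: &[f32]) -> Vec<f32> {
    x.iter().map(|&v| 1.0 / (1.0 + (-v).exp())).collect()
}
```

Both functions agree up to floating-point rounding; the fused one reads the input once and writes the output once, where the decomposed one makes six passes over memory.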
30 tests passing. Zero regressions.
https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o71
Parent: 984d50c
1 file changed
Lines changed: 28 additions & 1 deletion
Diff (line contents not preserved): line 2 replaced; new lines 18-44 inserted after line 17.