Add scatter/gather kernels for MoE pipeline with fused weighted gather #2467
| Job | Run time |
|---|---|
| | 6m 42s |
| | 3m 7s |
| | 11s |
| | 2m 17s |
| | 57s |
| | 16s |
| | 3m 7s |
| | 14s |
| | 3m 6s |
| | 3m 6s |
| | 3m 7s |
| | 2m 51s |
| | 4m 19s |
| | 2m 9s |
| | 3m 28s |
| | 4m 4s |
| | 4m 30s |
| | 3m 33s |
| | 4m 30s |
| | 3m 41s |
| | 3m 35s |
| | 4m 35s |
| | 4m 8s |
| | 4m 46s |
| | 2m 9s |
| | 1m 50s |
| | 4m 35s |
| | 3m 30s |
| | 7m 19s |
| | 5m 54s |
| | 4m 1s |
| | 2m 43s |
| | 3m 18s |
| | 2m 15s |
| | 5m 35s |
| | 2m 38s |
| | 2m 38s |
| | 3m 38s |
| | 6m 7s |
| | 5m 15s |
| | 5m 48s |
| | 6m 33s |
| | 3m 56s |
| | 5m 46s |
| | 3m 26s |
| | 0s |
| | 0s |
| | 0s |
| | 0s |
| Total | 2h 45m 13s |