Commit a742d4b
Enable fused softmax/sigmoid + topk path for 1024 experts
Measurements show that the fused path performs better when the number of experts is 1024.
1 token + 1024 experts: average uplift ~3%
64 tokens + 1024 experts: average uplift ~6%
128 tokens + 1024 experts: average uplift ~7%
256 tokens + 1024 experts: average uplift ~45%
Current MoE models do not yet use as many as 1024 experts. However, when customers benchmark at 1024 experts, this optimization yields better results.
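
For readers unfamiliar with the routing step being fused, the following is a minimal Python sketch of the general idea, not the kernel changed in this commit. The function names `softmax_topk_reference` and `softmax_topk_fused` are hypothetical. The sketch assumes the usual observation that softmax (like sigmoid) is monotonic, so the top-k experts can be picked from the raw logits and only the winners need to be normalized, which avoids materializing the full probability vector.

```python
import heapq
import math


def softmax_topk_reference(logits, k):
    """Unfused path: compute the full softmax, then select the top-k experts."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]     # intermediate vector over all experts
    idx = heapq.nlargest(k, range(len(probs)), key=probs.__getitem__)
    return [(i, probs[i]) for i in idx]


def softmax_topk_fused(logits, k):
    """Fused sketch: rank on raw logits and normalize only the k winners.

    Assumption: softmax preserves the ordering of the logits, so ranking
    logits and ranking probabilities select the same experts.
    """
    m = max(logits)
    total = sum(math.exp(x - m) for x in logits)   # denominator only, no stored vector
    idx = heapq.nlargest(k, range(len(logits)), key=logits.__getitem__)
    return [(i, math.exp(logits[i] - m) / total) for i in idx]


if __name__ == "__main__":
    # 1024 synthetic, distinct router logits for a single token (illustrative data only)
    logits = [((i * 37) % 1024) / 64.0 for i in range(1024)]
    for expert, weight in softmax_topk_fused(logits, 8):
        print(expert, weight)
```

Both functions return `(expert_index, weight)` pairs in descending weight order. On an accelerator the benefit of fusing comes from doing this in one kernel launch without writing the intermediate probabilities to memory, which this CPU sketch only mimics.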
Signed-off-by: LiJianyu <jianyu.li@intel.com>
1 parent 3c03f84 commit a742d4b
1 file changed
Lines changed: 4 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 738 | 738 | |
| 739 | 739 | |
| 740 | 740 | |
| | 741 | + |
| | 742 | + |
| | 743 | + |
| | 744 | + |
| 741 | 745 | |
| 742 | 746 | |
| 743 | 747 | |