1 parent becc4b7 commit 5f50680
1 file changed
training/deepspeed_finetune_demo/README.md
@@ -121,8 +121,6 @@ AutoEP config goes inside the DeepSpeed JSON under `expert_parallel`:
 | `route_scale` | Router output scaling factor (should match `routed_scaling_factor` in model config) |
 | `load_balance_coeff` | Auxiliary load-balancing loss coefficient (`null` to disable) |

-Note: `route_scale` and expert group settings can be auto-filled from the HF model config if using DeepSpeed branch `gma/autoep-muon-fixes`.
-
 # Benchmarking

 To run benchmark, run:
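
Because this commit drops the note about auto-filling, `route_scale` presumably has to be kept in sync with the model config by hand. The sketch below shows one way to check that, assuming the DeepSpeed JSON and the HF model config are both available as plain dictionaries. Only `expert_parallel`, `route_scale`, `load_balance_coeff`, and `routed_scaling_factor` come from the README table; the helper name `check_route_scale` and the value `2.5` are hypothetical and only illustrate the idea.

```python
import json
import math


def check_route_scale(ds_config: dict, model_config: dict) -> bool:
    """Return True when `route_scale` agrees with `routed_scaling_factor`.

    Hypothetical helper: not part of DeepSpeed or the demo repository.
    """
    ep = ds_config.get("expert_parallel", {})
    route_scale = ep.get("route_scale")
    expected = model_config.get("routed_scaling_factor")
    if route_scale is None or expected is None:
        # One side is missing, so there is nothing to compare.
        return False
    return math.isclose(float(route_scale), float(expected))


# Illustrative values only; real configs carry many more keys.
ds_config = json.loads(
    '{"expert_parallel": {"route_scale": 2.5, "load_balance_coeff": null}}'
)
model_config = {"routed_scaling_factor": 2.5}

if not check_route_scale(ds_config, model_config):
    raise ValueError("route_scale does not match routed_scaling_factor")
```

JSON `null` loads as Python `None`, so `load_balance_coeff` in this example corresponds to the "disabled" setting described in the table.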