Commit 1ce26cf
refactor(pt): fully refactor of HybridMuon optimizer (#5275)
1. refactor name-based routing
2. add slice mode for HybridMuon opt
3. add Magma-lite damping for Muon path
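
To make items 1 and 2 concrete, here is a minimal sketch of how name-based routing with a mode switch could look. It is not taken from the HybridMuon source: the function `route`, the "names ending in `bias` go to Adam" rule, and the exact meaning given to each mode (`slice`, `2d`, `flat`) are assumptions made purely for illustration. Shapes are plain tuples so the sketch runs without PyTorch.

```python
from math import prod

# Hypothetical mode names, borrowed from the release notes; their exact
# semantics in HybridMuon are NOT confirmed by this commit message.
MODES = ("slice", "2d", "flat")


def route(name, shape, mode="slice"):
    """Pick an update path for one named parameter (illustrative only).

    Returns (path, view), where path is "muon" or "adam" and view is the
    assumed (num_matrices, rows, cols) layout handed to the Muon path,
    or None when the parameter falls back to Adam.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    # Assumption: biases and 0-D/1-D tensors are never treated as matrices.
    if name.endswith("bias") or len(shape) < 2:
        return "adam", None

    # A plain 2D weight is a single matrix in every mode.
    if len(shape) == 2:
        return "muon", (1, shape[0], shape[1])

    # Tensors of rank > 2: the mode decides how (or whether) to view them.
    if mode == "2d":
        # Assumption: only exactly-2D tensors take the Muon path.
        return "adam", None
    if mode == "flat":
        # Assumption: collapse all leading dims into the row dimension.
        return "muon", (1, prod(shape[:-1]), shape[-1])
    # mode == "slice"
    # Assumption: every trailing 2D slice is handled as its own matrix.
    return "muon", (prod(shape[:-2]), shape[-2], shape[-1])
```

Under these assumptions, `route("emb.weight", (4, 128, 64), "slice")` yields four 128x64 matrices for the Muon path, while the same tensor in `2d` mode falls back to Adam. Routing on parameter names requires the optimizer to receive `(name, parameter)` pairs, which is consistent with the note that it now accepts named parameters.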
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * HybridMuon gains routing modes (slice, 2d, flat), name-aware routing
    for biases/Adam variants, and a magma_muon option for Magma-lite
    damping. Optimizer now accepts named parameters; deprecated 2D-only
    options removed.
* **Documentation**
  * Updated optimizer docs to describe new routing modes, magma_muon and
    flash_muon options, and adjusted lr_adjust default.
* **Tests**
  * Expanded tests for routing modes, Magma damping, and state
    compatibility; some legacy tests consolidated.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

1 parent 9ed2d48 commit 1ce26cf
4 files changed
Lines changed: 889 additions & 347 deletions
File tree
- deepmd
- pt
- optimizer
- train
- utils
- source/tests/pt