refactor: unify learning rate schedulers with array API (#5154)
- Refactor BaseLR in dpmodel to use array_api_compat for a backend-agnostic implementation
- Consolidate learning rate logic from the TF/PT/PD backends into a unified dpmodel layer
- Use array API operations (xp.where, xp.clip, etc.) for JIT compatibility across backends (see the sketch after this list)
- Add warmup support (warmup_steps, warmup_ratio, warmup_start_factor)
during refactoring
- Add stop_ratio parameter as an alternative to stop_lr for flexible configuration
- Implement mutual exclusion validation for stop_lr/stop_ratio and
warmup_steps/warmup_ratio
- Update all backends to use unified BaseLR implementation
- Add comprehensive consistency tests across
NumPy/PyTorch/JAX/array_api_strict backends
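
To make the intent concrete, below is a minimal sketch of what a backend-agnostic scheduler built on `array_api_compat` can look like. The class and parameter names here are illustrative only and do not mirror the exact deepmd-kit API; the point is that the schedule is written once against the array API and stays traceable under JIT.

```python
# Illustrative sketch only; names are hypothetical, not the deepmd-kit API.
# The scheduler is written once with array API calls so the same code runs
# under NumPy, PyTorch, JAX, or array_api_strict.
import array_api_compat


class ExpLRSketch:
    def __init__(self, start_lr, stop_lr, decay_steps, stop_steps,
                 warmup_steps=0, warmup_start_factor=0.0):
        self.start_lr = start_lr
        self.stop_lr = stop_lr
        self.decay_steps = decay_steps
        self.warmup_steps = warmup_steps
        self.warmup_start_factor = warmup_start_factor
        # pick the decay rate so the LR reaches stop_lr at stop_steps
        self.decay_rate = (stop_lr / start_lr) ** (decay_steps / max(stop_steps, 1))

    def value(self, step):
        # `step` may be a NumPy, PyTorch, JAX, or array_api_strict array;
        # resolve the namespace once, then use only array API operations.
        xp = array_api_compat.array_namespace(step)
        step = xp.astype(step, xp.float64)
        # exponential decay after warmup, never dropping below stop_lr
        decayed = self.start_lr * self.decay_rate ** (
            (step - self.warmup_steps) / self.decay_steps
        )
        decayed = xp.clip(decayed, self.stop_lr, self.start_lr)
        # linear warmup from warmup_start_factor * start_lr up to start_lr
        warmup = self.start_lr * (
            self.warmup_start_factor
            + (1.0 - self.warmup_start_factor) * step / max(self.warmup_steps, 1)
        )
        # xp.where instead of a Python branch keeps the computation JIT-traceable
        return xp.where(step < self.warmup_steps, warmup, decayed)
```

Because the warmup/decay switch is expressed with `xp.where` rather than Python control flow, the same `value` call should trace cleanly under `jax.jit` or `torch.compile` as well as evaluate eagerly with NumPy.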
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
  * Added comprehensive warmup support for learning rate schedules with configurable warmup steps, ratios, and start factors.
  * Enhanced learning rate scheduling with unified configuration across TensorFlow, PyTorch, and Paddle backends.
  * Introduced flexible stop learning rate configuration using either absolute values or ratios.
* **Improvements**
  * Moved warmup configuration from training to learning rate settings for consistency.
  * Added automatic migration of legacy warmup settings for backward compatibility.
  * Expanded cosine annealing schedule support with proper warmup integration.
* **Documentation**
  * Added comprehensive learning rate scheduling documentation with examples.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
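
For context, a training input using the new keys might look like the snippet below. The key names follow the parameters listed in this PR (`stop_ratio`, `warmup_ratio`, `warmup_start_factor`); the values are placeholders, and the exact schema should be taken from the updated documentation.

```json
{
    "learning_rate": {
        "type": "exp",
        "start_lr": 0.001,
        "stop_ratio": 1e-05,
        "decay_steps": 5000,
        "warmup_ratio": 0.01,
        "warmup_start_factor": 0.0
    }
}
```

Note that `stop_ratio` is given here instead of `stop_lr`, and `warmup_ratio` instead of `warmup_steps`, since each pair is mutually exclusive per the validation added in this PR.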
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>