update muon_sharding_optimizer with rebuilding 2d_params. #78814

Merged
sneaxiy merged 1 commit into PaddlePaddle:develop from xxyux:rebuild_2d
Apr 29, 2026

Conversation

@xxyux
Contributor

@xxyux xxyux commented Apr 27, 2026

PR Category

Execute Infrastructure

PR Types

Improvements

Description

This PR makes three changes to refine the Muon sharding optimizer implementation:

  1. Refactor the 2D parameter partition logic:
    Removed the hardcoded attributes dedicated to specific colors (e.g., self._params_2d, self._params_2d_moe, self._rank2params_2d) and replaced them with a single generalized loop (for color_key, params_2d in self._params_2d_by_color.items():), so that every color group goes through the same code path and adding a color group no longer requires a dedicated attribute.
  2. Simplify fused gradient control:
    Removed the dependency on the external environment variable FLAGS_shard_fused_gradient. Gradient fusion is now controlled by the internal attribute self._use_fuse_gradients = self.comm_buffer_size_MB > 0.
  3. Add param_clear and param_set functions for fp8_fixed_training.
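
The first two changes can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the class `ColorGroupedShardingSketch`, the attribute `_rank2params_2d_by_color`, the `world_size` argument, and the greedy size-balancing rule are all assumptions made for illustration. Only `_params_2d_by_color`, the `for color_key, params_2d in ...items()` loop, `comm_buffer_size_MB`, and `_use_fuse_gradients` come from the description above.

```python
class ColorGroupedShardingSketch:
    """Hypothetical stand-in for illustration; not the Paddle optimizer."""

    def __init__(self, params_2d_by_color, world_size, comm_buffer_size_MB):
        # Maps a color key (e.g. "dense", "moe") to a list of (name, numel) pairs.
        self._params_2d_by_color = params_2d_by_color
        self.world_size = world_size
        self.comm_buffer_size_MB = comm_buffer_size_MB
        # As in the description: fusion is derived from the buffer size,
        # not read from the FLAGS_shard_fused_gradient environment variable.
        self._use_fuse_gradients = self.comm_buffer_size_MB > 0
        # Hypothetical name for the per-color rank assignment.
        self._rank2params_2d_by_color = self._partition_all_colors()

    def _partition_all_colors(self):
        rank2params_by_color = {}
        # One generic loop replaces one hardcoded attribute per color.
        for color_key, params_2d in self._params_2d_by_color.items():
            rank2params = {rank: [] for rank in range(self.world_size)}
            loads = [0] * self.world_size
            # Assumed balancing rule: largest parameter first, each one
            # assigned to the rank that currently holds the fewest elements.
            for name, numel in sorted(params_2d, key=lambda p: -p[1]):
                rank = loads.index(min(loads))
                rank2params[rank].append(name)
                loads[rank] += numel
            rank2params_by_color[color_key] = rank2params
        return rank2params_by_color


opt = ColorGroupedShardingSketch(
    {"dense": [("w1", 8), ("w2", 4), ("w3", 4)], "moe": [("e1", 6)]},
    world_size=2,
    comm_buffer_size_MB=256,
)
print(opt._rank2params_2d_by_color["dense"])  # {0: ['w1'], 1: ['w2', 'w3']}
print(opt._use_fuse_gradients)                # True
```

In this sketch, supporting another color only means adding a key to the input dictionary, and passing `comm_buffer_size_MB=0` turns fusion off without touching any environment variable.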

Does this cause a precision change?

@paddle-bot

paddle-bot Bot commented Apr 27, 2026

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the results of the automated CI tests. See the Paddle CI Manual for details.

@codecov-commenter

codecov-commenter commented Apr 27, 2026

Codecov Report

❌ Patch coverage is 75.67568% with 9 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@1561597). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...d/fleet/meta_optimizers/muon_sharding_optimizer.py 74.28% 9 Missing ⚠️

❌ Your patch status has failed because the patch coverage (75.67%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #78814   +/-   ##
==========================================
  Coverage           ?   75.67%           
==========================================
  Files              ?        2           
  Lines              ?       37           
  Branches           ?        0           
==========================================
  Hits               ?       28           
  Misses             ?        9           
  Partials           ?        0           

@umiswing
Member

/re-run all-failed

@xxyux
Contributor Author

xxyux commented Apr 28, 2026

/re-run all-failed

@xxyux
Contributor Author

xxyux commented Apr 28, 2026

/re-run all-failed

GuoxiaWang
GuoxiaWang previously approved these changes Apr 28, 2026
sneaxiy
sneaxiy previously approved these changes Apr 28, 2026
@xxyux xxyux dismissed stale reviews from sneaxiy and GuoxiaWang via 45284e9 April 28, 2026 19:35
@xxyux xxyux force-pushed the rebuild_2d branch 3 times, most recently from 0115493 to 2e0fe37 Compare April 28, 2026 19:44
@xxyux
Contributor Author

xxyux commented Apr 29, 2026

/re-run all-failed

@xxyux
Contributor Author

xxyux commented Apr 29, 2026

/re-run all-failed

@xxyux
Contributor Author

xxyux commented Apr 29, 2026

/re-run all-failed

Collaborator

@sneaxiy sneaxiy left a comment

LGTM for coverage.

@sneaxiy sneaxiy merged commit 3d3826f into PaddlePaddle:develop Apr 29, 2026
245 of 269 checks passed

7 participants