Add deepseekv32 model fix #4507
Conflict resolution:
- aoa_config_base.py: use develop's version entirely
- model_utils.py: keep develop's dtype aoa logic (with is_fleet guard)
- gpt_provider.py: add mscale_all_dim extraction with safe key check
- training_args.py: keep both dsa_indexer_loss_coeff and develop's new fields
- template.py: keep both deepseek_v32 and glm_ocr templates

Adapt deepseek v3.2 to develop's API:
- Rename moe_grouped_gemm -> moe_expert_fusion in provider
- Add multi_latent_attention=True and use_qk_norm=True to config
- Wire up gen_inv_aoa_config from base class
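For readers unfamiliar with the "safe key check" mentioned for gpt_provider.py, the idea can be sketched roughly as below. This is not the code from the PR: the function name `extract_mscale_all_dim` is hypothetical, and the assumption that `mscale_all_dim` sits under a `rope_scaling` dictionary in the model config is ours, not stated in the commit message.

```python
from collections.abc import Mapping
from typing import Any, Optional


def extract_mscale_all_dim(hf_config: Mapping) -> Optional[float]:
    """Hypothetical helper: read mscale_all_dim without raising on missing keys.

    Assumes (not confirmed by the PR) that the value lives under the
    "rope_scaling" entry of a dict-like model config.
    """
    rope_scaling: Any = hf_config.get("rope_scaling")
    # The entry may be absent or explicitly None for models without scaling.
    if not isinstance(rope_scaling, Mapping):
        return None
    # Only read the key when it is really there, so older configs still load.
    if "mscale_all_dim" not in rope_scaling:
        return None
    return float(rope_scaling["mscale_all_dim"])


if __name__ == "__main__":
    print(extract_mscale_all_dim({"rope_scaling": {"mscale_all_dim": 1.0}}))
    print(extract_mscale_all_dim({"rope_scaling": None}))
```

Returning `None` instead of a default leaves it to the caller to decide whether the field should be set on the provider at all.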
Thanks for your contribution!
Codecov Report

❌ Your patch status has failed because the patch coverage (11.57%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@ Coverage Diff @@
## develop #4507 +/- ##
==========================================
Coverage ? 46.39%
==========================================
Files ? 478
Lines ? 90760
Branches ? 0
==========================================
Hits ? 42108
Misses ? 48652
Partials ? 0
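As a sanity check on the numbers above, the project coverage can be recomputed from the reported hits and lines, and the patch gate reduces to a threshold comparison. The sketch below only illustrates that arithmetic; the function names are hypothetical and it does not reproduce how Codecov computes or rounds its figures internally.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of tracked lines (0.0 when nothing is tracked)."""
    if lines == 0:
        return 0.0
    return 100.0 * hits / lines


def patch_gate_passes(patch_coverage: float, target: float) -> bool:
    """Illustrative gate: the patch passes when it meets or exceeds the target."""
    return patch_coverage >= target


if __name__ == "__main__":
    # Figures taken from the report: Hits 42108, Misses 48652, Lines 90760.
    assert 42108 + 48652 == 90760
    print(f"{coverage_percent(42108, 90760):.2f}%")  # 46.39%
    print(patch_gate_passes(11.57, 75.00))  # False
```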
PaddleFormers Log Analysis
Log analysis report
Failed test cases: Root cause analysis: Suggested fixes:
🔄 Automatically updated after each re-run
/re-run all-failed
Before submitting
tests folder. If there are codecov issues, please add test cases first.
PR types
PR changes
Description