Pull requests: NVIDIA-NeMo/Megatron-Bridge

Pull requests list

[training] fix: use int64 for TrainState counters to prevent overflow
#3312 opened Apr 14, 2026 by yaoyu-33 Contributor
1 of 2 tasks
perf(nsys): reduce CPU-side overhead in profiling defaults
#3311 opened Apr 13, 2026 by dingqingy-nv Contributor Draft
3 tasks
[ckpt] feat: support MSC for fsdp_dtensors
Labels: area:ckpt, community-request, ready-to-merge
#3300 opened Apr 13, 2026 by pavelgein Contributor
3 of 5 tasks
chore(beep boop 🤖): Bump uv.lock (main, mcore-dev) (2026-04-13)
Labels: area:build, full-test-suite, needs-review
#3297 opened Apr 13, 2026 by svcnvidia-nemo-ci Contributor
[doc] feat: add MiniMax M2.5 / M2.7 model support
Labels: docs-only, needs-review
#3291 opened Apr 13, 2026 by yaoyu-33 Contributor
4 tasks done
[ckpt] fix: Prevent int32 overflow for high train sample counts.
Labels: area:ckpt, bug, community-request, needs-author
#3290 opened Apr 12, 2026 by BlueCrescent Draft
[model] feat: Add YARN support for mamba_model from MCORE
Labels: area:model, feature, needs-author
#3289 opened Apr 12, 2026 by guihong-nv Draft
3 of 5 tasks
feat(perf): switch llama3 70B GB200/GB300 MXFP8 from 3D parallel to FSDP
Labels: area:perf, needs-author, performance
#3284 opened Apr 12, 2026 by dingqingy-nv Contributor Draft
[test] refactor: move diffusion tests to test_groups directory
Labels: area:diffusion, r0.4.0, ready-to-merge
#3275 opened Apr 10, 2026 by huvunvidia Contributor
2 tasks
docs: expand golden values update strategy in release process guide
Labels: docs-only, needs-review
#3268 opened Apr 10, 2026 by ko3n1g Contributor
2 tasks
chore(beep boop 🤖): Bump uv.lock (main, mcore-dev) (2026-04-10)
Labels: area:build, full-test-suite, needs-review
#3264 opened Apr 10, 2026 by svcnvidia-nemo-ci Contributor
[bridge] feat: Add ERNIE 4.5 text-only MoE and VL bridges for Megatro…
Labels: area:model, community-request
#3263 opened Apr 10, 2026 by bo-ke
cp: [perf] fix: guard cuda_graph_scope validation against None (3249) into r0.4.0
Labels: area:perf, cherry-pick, needs-author, Run CICD
#3262 opened Apr 10, 2026 by svcnvidia-nemo-ci Contributor Draft
Comment out dfm imports
Labels: area:recipe, bug, needs-author, needs-follow-up, needs-review
#3233 opened Apr 9, 2026 by gautham-kollu Contributor
5 tasks
Adding condition for how nccl_chunk_size var is being set as currentl…
Labels: area:perf, bug, docs-only, needs-author, ready-to-merge
#3228 opened Apr 8, 2026 by rsalagame-nvidia Contributor
Adding condition for how nccl_chunk_size var is being set as currentl…
Labels: area:perf, bug, needs-author
#3226 opened Apr 8, 2026 by rsalagame-nvidia Contributor
Expose fully_parallel_save in save_megatron_model
Labels: area:ckpt, community-request, feature, needs-author, needs-follow-up, needs-more-tests, ready-to-merge
#3207 opened Apr 8, 2026 by nic-nvidia
1 of 2 tasks
[misc] feat: Add NVTX ranges for train_step and optimizer_step
Labels: area:misc, feature
#3198 opened Apr 7, 2026 by minitu Draft
5 tasks
fix(mimo): Scale encoder gradients for heterogeneous DP in multimodule finalization
Labels: area:training, bug, needs-review
#3197 opened Apr 7, 2026 by kamran-nvidia Contributor
5 tasks
Enable nemo-ci tests (short runs - perf and non-perf) for Wan + Updating recipes names
Labels: area:diffusion, area:recipe, ci
#3179 opened Apr 6, 2026 by huvunvidia Contributor
5 tasks
ci: Add FLUX/diffusion support to scripts/performance/run_recipe.py
Labels: area:diffusion, ci, needs-review
#3176 opened Apr 6, 2026 by suiyoubi Contributor
5 tasks
Update Qwen3 235B B300 to include VP
Labels: area:perf
#3175 opened Apr 6, 2026 by rhmukundan Contributor Draft