
Pull requests: NVIDIA/Megatron-LM


Pull requests list

RL support for nanov3 sft checkpoint
#3741 opened Mar 6, 2026 by jon-barker
5 tasks
Continue emerging optimizer refactoring
#3737 opened Mar 6, 2026 by skyw Draft
6 tasks
revert of #2658 complexity: low Expert Review
#3736 opened Mar 6, 2026 by dimapihtar
5 tasks
Core 0.16
Add TP2 FSDP test Run functional tests
#3731 opened Mar 6, 2026 by gautham-kollu
5 tasks
Core 0.16
Add parser arguments to RL inference
#3722 opened Mar 5, 2026 by ArEsKay3 Draft
6 tasks
chore: Switch to ruff for linting
#3719 opened Mar 5, 2026 by ahmadki Draft
6 tasks
add warning for cp > 1 and not per token loss dev
#3716 opened Mar 5, 2026 by wplf
Scaling fixes for MuP over Muon optimizer. community-request Expert Review Final Review
#3715 opened Mar 5, 2026 by plugyawn
4 of 6 tasks
Add warning for cp_size > 1 and no per token loss
#3714 opened Mar 5, 2026 by wplf
Extract the changes from Jorge's branch
#3701 opened Mar 4, 2026 by tdene Draft
6 tasks
find optimal number of workers complexity: medium Expert Review
#3699 opened Mar 4, 2026 by dimapihtar
6 tasks
Core 0.16
Fix config.softmax_scale not being considered Final Review
#3698 opened Mar 4, 2026 by janEbert
Core 0.16
fix ddp bug when --overlap-grad-reduce and --num-optim > 1 Final Review
#3693 opened Mar 4, 2026 by wplf