Pull requests: NVIDIA/Megatron-LM
fix(offload): reset activation offload manager after eval as well as …
#3739
opened Mar 6, 2026 by
rapatel
5 tasks
revert of #2658
complexity: low
Expert Review
Apply this label to indicate that your PR is ready for expert review.
[dev] fix(dev): Customized attention mask for TE backend core attention.
#3732
opened Mar 6, 2026 by
yuzhongw-nvidia
Draft
5 tasks
Prefix caching mamba v3 ssm initial states
#3726
opened Mar 5, 2026 by
lmcafee-nvidia
Draft
5 tasks
Enhance optimizer state loading with runtime overrides
community-request
#3720
opened Mar 5, 2026 by
yhgalaxy
6 tasks
rename hybrid-cp to dynamic-cp and fix for rope when enabling THD + Dynamic-CP
#3717
opened Mar 5, 2026 by
xiaoyao0115
Scaling fixes for MuP over Muon optimizer.
community-request
Expert Review
Final Review
PR is in the "final review" stage
#3715
opened Mar 5, 2026 by
plugyawn
4 of 6 tasks
Normalize tool_calls and gate parser tool-calls to tool-enabled requests
complexity: low
#3710
opened Mar 4, 2026 by
i-riyad
6 tasks
find optimal number of workers
complexity: medium
Expert Review
[Main][feat] Support CUDA Graph capture offloading modules
complexity: medium
enhancement
New feature or request
fix ddp bug when --overlap-grad-reduce and --num-distributed-optimi for dev
#3694
opened Mar 4, 2026 by
wplf
6 tasks
fix ddp bug when --overlap-grad-reduce and --num-optim > 1
Final Review
#3693
opened Mar 4, 2026 by
wplf