Fix launcher Slurm mounts in installed MCP mode by ChenhanYu · Pull Request #1811 · NVIDIA/Model-Optimizer

ChenhanYu · 2026-06-24T00:56:03Z

Summary

- Make Slurm ModelOpt source overlay mounts conditional on source-checkout mode.
- Force managed-source MCP launches to reinstall modelopt-launcher from the selected checkout so source refs do not reuse a stale cached package.
- Add regression coverage for installed-mode Slurm mounts and managed-source launcher argv construction.

Root cause

PR #1799 correctly stopped packaging modules/Model-Optimizer/* when modelopt-launcher runs from an installed package. However, build_slurm_executor() still unconditionally added container mounts for code/modules/Model-Optimizer/modelopt and modelopt_recipes. In installed MCP mode those paths do not exist in the remote package, so the container runtime fails before the job script starts.

A second issue appeared during validation: the MCP managed-source path can materialize the right git checkout but still execute a cached modelopt-launcher package with the same version. Adding uv run --reinstall-package modelopt-launcher ensures the selected source ref is what actually runs.
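The mount fix described above can be sketched roughly as follows. This is a minimal illustration, not the code in `tools/launcher/core.py`: the function name `assemble_container_mounts`, its arguments, and every container-side path are hypothetical. Only the idea of gating the `modelopt` and `modelopt_recipes` overlays on `modelopt_src_path` comes from the PR.

```python
from pathlib import PurePosixPath
from typing import Optional


def assemble_container_mounts(
    scratch_dir: str,
    experiment_dir: str,
    modelopt_src_path: Optional[str] = None,
) -> list[str]:
    """Return "host:container" mount specs (hypothetical layout)."""
    # Mounts every job needs, regardless of launch mode.
    mounts = [
        f"{scratch_dir}:/scratchspace",
        f"{experiment_dir}:/experiment",
    ]
    # Overlay mounts only make sense when a source checkout exists.
    # In installed mode the path is None, so nothing is added and the
    # container runtime never sees a mount source that does not exist.
    if modelopt_src_path:
        src = PurePosixPath(modelopt_src_path)
        mounts.append(f"{src / 'modelopt'}:/opt/overlay/modelopt")
        mounts.append(f"{src / 'modelopt_recipes'}:/opt/overlay/modelopt_recipes")
    return mounts


installed = assemble_container_mounts("/scratch", "/exp")
source = assemble_container_mounts("/scratch", "/exp", "/src/Model-Optimizer")
```

Defaulting the parameter to `None` keeps existing callers working while making installed mode the safe default.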

Validation

- `uv run pytest tests/test_bridge.py -q` from `tools/mcp`: 51 passed
- `uv run pytest tests/test_slurm_executor.py tests/test_core.py -q` from `tools/launcher`: 24 passed
- `pre-commit run --files tools/launcher/core.py tools/launcher/tests/test_slurm_executor.py tools/mcp/modelopt_mcp/bridge.py tools/mcp/tests/test_bridge.py`: passed
- Live Slurm GPU smoke validated with patched launcher path; `nvidia-smi` ran successfully and the smoke script completed.

Summary by CodeRabbit

Release Notes

Refactor
- Restructured container mount assembly for Slurm job execution so that ModelOpt directories are mounted only when an optional source path parameter is set.
- The managed-source launcher command line now passes `--reinstall-package modelopt-launcher` to `uv run`.
- ModelOpt overlay mounts are no longer added unconditionally; scratchspace and experiment-title mounts are always included.
Tests
- Expanded container mount scenario test coverage for installed and source execution modes.
- Tightened test assertions for comprehensive mount behavior verification.

Signed-off-by: Chenhan Yu <chenhany@nvidia.com>

copy-pr-bot · 2026-06-24T00:56:06Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-06-24T00:56:11Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3383ae46-2aa1-4007-a7f3-7c31107da61f

📥 Commits

Reviewing files that changed from the base of the PR and between 1766d55 and 7b773a8.

📒 Files selected for processing (4)

tools/launcher/core.py
tools/launcher/tests/test_slurm_executor.py
tools/mcp/modelopt_mcp/bridge.py
tools/mcp/tests/test_bridge.py

📝 Walkthrough

build_slurm_executor gains an optional modelopt_src_path parameter that controls whether ModelOpt container mounts are added; scratchspace and experiment-title mounts are now always included. run_jobs forwards this parameter. In the MCP bridge, _launcher_argv adds --reinstall-package modelopt-launcher to the managed-source uv run invocation.

Changes

Conditional ModelOpt Mounts in Slurm Executor

| Layer / File(s) | Summary |
| --- | --- |
| `build_slurm_executor` conditional mount logic and forwarding `tools/launcher/core.py` | `build_slurm_executor` signature gains `modelopt_src_path=None`; `container_mounts` assembly unconditionally includes scratchspace and experiment-title mounts and conditionally adds `modelopt`/`modelopt_recipes` overlays when `modelopt_src_path` is set. `run_jobs` forwards `modelopt_src_path` into the call. |
| Slurm executor mount tests `tools/launcher/tests/test_slurm_executor.py` | Old single-mount test replaced with installed-mode (asserts ModelOpt overlays are absent) and source-mode (asserts ModelOpt overlays are present) cases; the `container_mounts=None` test strengthened to check specific scratchspace and experiment-title mounts. |

MCP Bridge Reinstall Flag

| Layer / File(s) | Summary |
| --- | --- |
| `_launcher_argv` and dry-run test `tools/mcp/modelopt_mcp/bridge.py`, `tools/mcp/tests/test_bridge.py` | `_launcher_argv` inserts `--reinstall-package modelopt-launcher` before `--project` in the `uv run` subprocess call; the dry-run test assertion is updated to match the extended argv prefix. |
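As a rough illustration of the argv change, the sketch below builds a `uv run` command line with `--reinstall-package modelopt-launcher` placed before `--project`. The function name `launcher_argv`, its parameters, the installed-mode branch, and the command that follows `--project` are assumptions made for the example; the actual `_launcher_argv` in `tools/mcp/modelopt_mcp/bridge.py` may differ.

```python
from typing import Optional


def launcher_argv(
    launcher_args: list[str],
    managed_source_dir: Optional[str] = None,
) -> list[str]:
    """Build the subprocess argv for a launcher invocation (hypothetical)."""
    if managed_source_dir is None:
        # Installed mode (assumed shape): call the console script directly,
        # with no reinstall and no project override.
        return ["modelopt-launcher", *launcher_args]
    # Managed-source mode: force uv to rebuild modelopt-launcher from the
    # selected checkout instead of reusing a cached build of the same version.
    return [
        "uv", "run",
        "--reinstall-package", "modelopt-launcher",
        "--project", managed_source_dir,
        "modelopt-launcher", *launcher_args,
    ]


managed = launcher_argv(["--dry-run"], "/tmp/checkout")
installed_cmd = launcher_argv(["--dry-run"])
```

The reinstall adds a cost to every managed-source launch; the review in this PR describes that as an intentional tradeoff against running a stale cached package.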

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

NVIDIA/Model-Optimizer#1766: Both PRs modify tools/mcp/modelopt_mcp/bridge.py to change how the uv run subprocess argv is constructed for managed-source launcher invocation.

Suggested labels

cherry-pick-done, cherry-pick-0.45.0

Suggested reviewers

kevalmorabia97

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: making Slurm container mounts conditional based on MCP mode (installed vs. source), which is the primary fix addressed in this PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Security Anti-Patterns | ✅ Passed | No security anti-patterns found. Changes only add safe optional parameters (modelopt_src_path=None), static argv elements for uv reinstall, and conditional mount logic. No torch.load, numpy.load, t... |

ChenhanYu · 2026-06-24T00:57:21Z

/claude review

claude

Claude review passed — no blocking issues found.

Traced the fix end-to-end:

- `launch.py` sets `MODELOPT_SRC_PATH=None` in installed mode and a real path in source mode; this flows through `run_jobs` into `build_slurm_executor`, where the new `if modelopt_src_path:` guard correctly suppresses the `modelopt`/`modelopt_recipes` container overlay mounts that don't exist in the remote installed package, which was the root cause of the pre-job container failure.
- Positional arg order in both `run_jobs` callers matches the updated `build_slurm_executor`/`build_docker_executor` signatures (new `modelopt_src_path` slot before `experiment_title`).
- `build_docker_executor` behavior is unchanged (still defaults to cwd), correct for local/dev Docker.
- `_launcher_argv` adds `--reinstall-package modelopt-launcher` only on the managed-source branch; valid `uv run` option placement, installed launches untouched. The per-launch reinstall cost is the documented intentional tradeoff to avoid running a stale cached package.

Test coverage exercises installed-skip, source-add, and the argv change. Risk is low: the scope is confined to the launcher and MCP infrastructure under `tools/`, with backward-compatible optional args.

LGTM

codecov · 2026-06-24T01:05:57Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.53%. Comparing base (1766d55) to head (7b773a8).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1811      +/-   ##
==========================================
- Coverage   56.84%   56.53%   -0.32%     
==========================================
  Files         510      510              
  Lines       56615    56615              
==========================================
- Hits        32183    32006     -177     
- Misses      24432    24609     +177

| Flag | Coverage Δ | |
| --- | --- | --- |
| regression | `14.72% <ø> (+0.06%)` | ⬆️ |
| unit | `54.65% <ø> (-0.02%)` | ⬇️ |

github-actions · 2026-06-24T20:13:28Z

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-24 20:13 UTC

Fix launcher Slurm mounts in installed mode

7b773a8

Signed-off-by: Chenhan Yu <chenhany@nvidia.com>

ChenhanYu marked this pull request as ready for review June 24, 2026 00:57

ChenhanYu requested a review from a team as a code owner June 24, 2026 00:57

claude Bot approved these changes Jun 24, 2026

coderabbitai Bot approved these changes Jun 24, 2026

kevalmorabia97 approved these changes Jun 24, 2026

ChenhanYu merged commit 7c5741b into main Jun 24, 2026
46 checks passed

ChenhanYu deleted the chenhany/fix-launcher-installed-slurm-mounts branch June 24, 2026 20:12
