Guard standalone termination against zero job id #931
No actionable comments were generated in the recent review. 🎉
Walkthrough
Changes: PID Validation in Kill Command
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 4 passed
68ca9fb to 7130435
@coderabbitai review
✅ Action performed: Review finished.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cloudai/systems/standalone/standalone_system.py`:
- Around line 81-90: The job ID validation using string-equality check `== "0"`
only blocks the exact string and allows variants like "00", "+0", and "-0" to
pass through, which then parse as integer 0 and signal the entire process group
instead of a specific process. Replace the string comparison with explicit
integer parsing and validation: convert job.id to an integer, validate it is a
positive number (greater than 0), and only proceed with the kill command if that
validation passes. This ensures that all zero variants and invalid PID values
are rejected before constructing the cmd string and calling cmd_shell.execute.
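The validation described in the comment above could be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `parse_positive_pid` and `build_kill_command` are hypothetical helper names, and the real change lives in `standalone_system.py` around `cmd_shell.execute`.

```python
from typing import Optional, Union


def parse_positive_pid(job_id: Union[int, str]) -> Optional[int]:
    """Return job_id as a positive int, or None if it is not a safe kill target.

    Hypothetical helper for illustration only.
    """
    try:
        pid = int(str(job_id).strip())
    except ValueError:
        # Non-numeric IDs are never passed to kill.
        return None
    # Rejects 0 and its variants ("00", "+0", "-0") as well as negative values,
    # all of which address process groups rather than a single process.
    return pid if pid > 0 else None


def build_kill_command(job_id: Union[int, str]) -> Optional[str]:
    """Build the kill command from the parsed integer, or None to skip the kill."""
    pid = parse_positive_pid(job_id)
    if pid is None:
        return None
    # Format from the parsed int so the shell only ever sees a normalized PID.
    return f"kill -9 {pid}"
```

Building the command from the parsed integer rather than the raw string means inputs such as `" 42 "` are normalized before they reach the shell.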
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: f5b3c7a8-35f8-4530-bdda-63d1d3531de9
📒 Files selected for processing (2)
src/cloudai/systems/standalone/standalone_system.py
tests/systems/standalone/test_system.py
Signed-off-by: shreyaskommuri <shreyaskommuri@gmail.com>
7130435 to 030ac94
Addressed CodeRabbit feedback by parsing standalone job IDs as integers before constructing the kill command. The guard now skips non-numeric IDs and all non-positive PID variants such as
Summary
`job.id == 0` does not shell out to `kill -9 0`.
Context
Fixes #929.
While using CloudAI as the execution layer for CloudAI Autotune, a wrapper smoke test hit the common standalone sleep dry-run path. CloudAI scheduled dependency termination for job `0`, which produced `kill -9 0`. On POSIX shells, that targets the current process group, so the parent wrapper was killed and exited with code `137` before it could report a normal result.
In standalone dry-run mode, `StandaloneRunner._submit_test()` uses `job_id = 0` because no subprocess is launched. That sentinel should not be passed to the shell as a real kill target.
Test Plan
`python -m pytest tests/systems/standalone/test_system.py -q`: 5 passed
`python -m pytest -q`: 1574 passed, 4 skipped, 463 deselected
`uv run pre-commit run --all-files`
I also attempted the live standalone sleep dry-run on a fresh `origin/main` branch, but that path is currently blocked earlier by the already-filed abstract-method issue #920 / PR #921. This PR keeps the scope to the zero-job-id termination fix from #929.
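The failure mode described under Context can be reproduced safely with a small sketch like the one below. It assumes a POSIX system with `sh` available, and `run_kill_zero_in_isolated_group` is a hypothetical name used only for illustration. The child shell is started in its own session so that `kill -9 0` reaches only that shell and not the caller.

```python
import signal
import subprocess


def run_kill_zero_in_isolated_group() -> int:
    """Run `kill -9 0` in a shell that is alone in its own process group.

    start_new_session=True makes the child call setsid(), so it becomes the
    leader of a fresh process group and the signal cannot reach this process.
    """
    proc = subprocess.run(
        ["sh", "-c", "kill -9 0; echo survived"],
        start_new_session=True,
        capture_output=True,
    )
    return proc.returncode


if __name__ == "__main__":
    rc = run_kill_zero_in_isolated_group()
    # Python reports death by signal as a negative return code; a parent shell
    # would report the same event as 128 + signal number.
    print("python returncode:", rc)
    print("shell-style exit code:", 128 + signal.SIGKILL)
```

Without the separate session, the shell would share the caller's process group and the signal would also kill the caller, which matches the wrapper exiting with code `137` in the report above.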