[AutoTuner] Fix flaky resume check test#3966
Open
harsh-kumar-patwa wants to merge 1 commit intoThe-OpenROAD-Project:masterfrom
Open
[AutoTuner] Fix flaky resume check test#3966harsh-kumar-patwa wants to merge 1 commit intoThe-OpenROAD-Project:masterfrom
harsh-kumar-patwa wants to merge 1 commit intoThe-OpenROAD-Project:masterfrom
Conversation
Contributor
Author
|
@luarss @vvbandeira @maliberty This PR fixes the flaky resume check test that has been disabled since issue #3005. I reviewed the feedback on the draft PR #3070 and used a different approach based on Ray Tune's ExperimentAnalysis to poll for experiment status instead of using a fixed sleep. Would appreciate your review. |
Replace the fixed time.sleep(120) with ExperimentAnalysis-based polling to reliably detect when trials complete before stopping the initial run. This addresses the flakiness reported in issue The-OpenROAD-Project#3005 and the review feedback from draft PR The-OpenROAD-Project#3070. Key changes: - Use Ray Tune ExperimentAnalysis to poll experiment status instead of fixed sleep - Add managed_process context manager for safe subprocess cleanup - Add stop_ray_cluster helper that retries until Ray shuts down cleanly - Re-enable the resume check test in test_autotuner.sh Signed-off-by: Harsh <harshkumar3446@gmail.com> Signed-off-by: Harsh Kumar <harshkumar3446@gmail.com>
bf7779d to
a052483
Compare
Contributor
|
Tests seem to be failing |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
time.sleep(120)in the resume check test with a polling approach using Ray Tune'sExperimentAnalysisto detect when trials completemanaged_processcontext manager for safe subprocess cleanup andstop_ray_clusterhelper that retries until Ray shuts down cleanlytest_autotuner.shCloses #3005
This addresses the review feedback from the draft PR #3070 by @vvbandeira. The main improvements over that draft are:
ExperimentAnalysisto check experiment status instead of a fixed sleep, making the test reliable across different hardwareTest plan
tools.AutoTuner.test.resume_check.ResumeCheck.test_tune_resume)