fix(executor): prevent CPU busy-spin when async jobs are pending #4833

Closed
ashnaaseth2325-oss wants to merge 7 commits into boa-dev:main from ashnaaseth2325-oss:fix/job-executor-no-busy-wait-v2

Conversation

@ashnaaseth2325-oss

Summary

This PR fixes a CPU busy-spin in run_jobs_async affecting both:

  • core/engine/src/job.rs
  • cli/src/main.rs

Previously, the executor used:

future::poll_once(group.next()).await.flatten();
future::yield_now().await;

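When any NativeAsyncJob future was pending (e.g., Atomics.waitAsync), this combination never suspended the thread: poll_once returns immediately whether or not a job has finished, and yield_now reschedules the task right away. The loop therefore re-polled continuously, causing sustained 100% CPU usage until the async job resolved.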
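
The spin described above can be reproduced without Boa. The following is a minimal std-only sketch, not the engine's code: `YieldNow`, `TimerJob`, `poll_once`, `busy_loop` and `spin_demo` are hypothetical stand-ins written for illustration, and the executor is a bare `thread::park` loop. It counts how many times the loop runs while one timer-like job is pending.

```rust
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::{Duration, Instant};

/// Waker that unparks the executor thread and counts every wake-up.
struct CountingWaker {
    thread: Thread,
    wakes: AtomicUsize,
}

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.wakes.fetch_add(1, Ordering::SeqCst);
        self.thread.unpark();
    }
}

/// Stand-in for `yield_now`: pending once, but it wakes itself immediately.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Stand-in for a pending async job: ready once the deadline has passed.
struct TimerJob(Instant);

impl Future for TimerJob {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if Instant::now() >= self.0 { Poll::Ready(()) } else { Poll::Pending }
    }
}

/// Stand-in for `poll_once`: polls a single time and never suspends.
async fn poll_once<F: Future + Unpin>(fut: &mut F) -> Option<F::Output> {
    poll_fn(|cx| match Pin::new(&mut *fut).poll(cx) {
        Poll::Ready(out) => Poll::Ready(Some(out)),
        Poll::Pending => Poll::Ready(None),
    })
    .await
}

/// The loop shape from the PR description; returns how often it iterated.
async fn busy_loop(mut job: TimerJob) -> usize {
    let mut iterations = 0;
    loop {
        iterations += 1;
        if poll_once(&mut job).await.is_some() {
            return iterations;
        }
        YieldNow(false).await;
    }
}

/// Drives `busy_loop` on a parking executor; returns (iterations, wake-ups).
fn spin_demo(wait: Duration) -> (usize, usize) {
    let state = Arc::new(CountingWaker {
        thread: thread::current(),
        wakes: AtomicUsize::new(0),
    });
    let waker = Waker::from(state.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(busy_loop(TimerJob(Instant::now() + wait)));
    loop {
        if let Poll::Ready(iterations) = fut.as_mut().poll(&mut cx) {
            return (iterations, state.wakes.load(Ordering::SeqCst));
        }
        // `YieldNow` already called `unpark`, so this returns at once.
        thread::park();
    }
}
```

Calling `spin_demo(Duration::from_millis(20))` returns one wake-up per unfinished iteration: the executor is woken every time it tries to park, so it never actually sleeps while the job is pending.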


Steps to Reproduce

const sab = new SharedArrayBuffer(4);
const ia  = new Int32Array(sab);
Atomics.waitAsync(ia, 0, 0, 3000);

Running this script via the CLI pegged one CPU core at 100% for the full 3000 ms timeout.


Fix

When no synchronous work is available, the executor now properly blocks on:

group.next().await;

instead of busy-polling and yielding.
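
For contrast, here is a minimal std-only sketch of what parking on a pending future looks like. It is an illustration under assumptions, not Boa's executor: `ParkWaker`, `SleepJob` and `block_on_parking` are hypothetical names, and the job is modelled by a helper thread that sleeps and then wakes the executor.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

/// Waker that records the wake-up and unparks the executor thread.
struct ParkWaker {
    thread: Thread,
    woken: AtomicBool,
}

impl Wake for ParkWaker {
    fn wake(self: Arc<Self>) {
        self.woken.store(true, Ordering::SeqCst);
        self.thread.unpark();
    }
}

/// Hypothetical async job: completes after `delay`, signalled by a helper thread.
struct SleepJob {
    delay: Duration,
    done: Arc<AtomicBool>,
    started: bool,
}

impl SleepJob {
    fn new(delay: Duration) -> Self {
        Self { delay, done: Arc::new(AtomicBool::new(false)), started: false }
    }
}

impl Future for SleepJob {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.done.load(Ordering::SeqCst) {
            return Poll::Ready(());
        }
        if !self.started {
            self.started = true;
            let done = Arc::clone(&self.done);
            // Simplification: the waker is captured only on the first poll.
            let waker = cx.waker().clone();
            let delay = self.delay;
            thread::spawn(move || {
                thread::sleep(delay);
                done.store(true, Ordering::SeqCst);
                waker.wake();
            });
        }
        Poll::Pending
    }
}

/// Polls `fut`, parking the thread between polls; returns (output, poll count).
fn block_on_parking<F: Future>(fut: F) -> (F::Output, usize) {
    let state = Arc::new(ParkWaker {
        thread: thread::current(),
        woken: AtomicBool::new(false),
    });
    let waker = Waker::from(state.clone());
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
        // Sleep until the job wakes us; the flag guards against spurious unparks.
        while !state.woken.swap(false, Ordering::SeqCst) {
            thread::park();
        }
    }
}
```

With `block_on_parking(SleepJob::new(Duration::from_millis(20)))` the future is polled exactly twice, once to start the job and once after the wake-up, and the thread sleeps in between.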


Result

  • CPU usage normalizes during async waits
  • Executor properly parks until a future completes
  • No behavioral or Test262 changes
  • Improved stability for CLI and embedders

@github-actions

github-actions bot commented Mar 3, 2026

Test262 conformance changes

| Test result | main count | PR count | difference |
|---|---|---|---|
| Total | 52,963 | 52,963 | 0 |
| Passed | 49,687 | 49,687 | 0 |
| Ignored | 2,262 | 2,262 | 0 |
| Failed | 1,014 | 1,014 | 0 |
| Panics | 0 | 0 | 0 |
| Conformance | 93.81% | 93.81% | 0.00% |

Tested main commit: b3d820369072123ded8defc927405b99b9012648
Tested PR commit: 25bb1ab6a996a0c61cfe0d3bc8b0f015998ab1d5
Compare commits: b3d8203...25bb1ab

@codecov

codecov bot commented Mar 3, 2026

Codecov Report

❌ Patch coverage is 60.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.19%. Comparing base (6ddc2b4) to head (0bd80cb).
⚠️ Report is 729 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cli/src/main.rs | 0.00% | 4 Missing ⚠️ |
| core/engine/src/job.rs | 81.81% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4833      +/-   ##
==========================================
+ Coverage   47.24%   57.19%   +9.94%     
==========================================
  Files         476      554      +78     
  Lines       46892    60576   +13684     
==========================================
+ Hits        22154    34645   +12491     
- Misses      24738    25931    +1193     

@ashnaaseth2325-oss
Author

Hello @jedel1043,
This PR prevents CPU busy-spinning in the executor when async jobs are pending.
All CI checks are passing and conformance is unchanged.
Please let me know if any refinements are needed.
Thanks!

Comment on lines +752 to +757
if no_sync_work && !group.is_empty() {
    if let Some(Err(err)) = group.next().await {
        self.clear();
        return Err(err);
    }
} else if let Some(Err(err)) = future::poll_once(group.next()).await.flatten() {
Member

This kinda fails if you have a long enough timeout, right? Something like setTimeout(f, 10000) would make it such that the group.next().await path is never taken.

@ashnaaseth2325-oss
Author

Hello @jedel1043,
Thanks for the catch! I’ve updated the logic so we only block when no timeout jobs are actually due to run.
All other CI checks are passing now. It looks like the test262 job hit the 60-minute timeout and was automatically cancelled, but it was progressing normally with no failures before that.
Please let me know if you’d like me to re-run it.
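
The difference between a timeout job that is merely pending and one that is due can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: `QueueState` and both predicates are invented names, and the "naive" rule assumes (from the review comment) that any scheduled timeout counted as synchronous work.

```rust
use std::time::{Duration, Instant};

/// Hypothetical snapshot of an executor's queues (not Boa's real types).
struct QueueState {
    ready_jobs: usize,
    pending_async_jobs: usize,
    /// Deadline of the earliest scheduled timeout job, if any.
    next_timeout: Option<Instant>,
}

impl QueueState {
    /// Assumed original rule: any scheduled timeout counts as sync work,
    /// so a long `setTimeout` keeps the loop on the polling path.
    fn may_block_naive(&self) -> bool {
        self.ready_jobs == 0
            && self.next_timeout.is_none()
            && self.pending_async_jobs > 0
    }

    /// Revised rule: only a timeout whose deadline has passed counts.
    fn may_block_revised(&self, now: Instant) -> bool {
        let timeout_due = matches!(self.next_timeout, Some(deadline) if deadline <= now);
        self.ready_jobs == 0 && !timeout_due && self.pending_async_jobs > 0
    }
}
```

With one async job pending and a timeout scheduled ten seconds out, the naive rule never blocks, while the revised rule blocks until the deadline arrives.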

@jedel1043
Member

Gonna add this to #4891 and add you as a co-author, since it's simpler to not exit on interval jobs.

@ashnaaseth2325-oss
Author

Sounds good, thanks! Glad to see the change integrated there.

@jedel1043
Member

Wait, this is kinda wrong. Since run_jobs_async takes &RefCell<&mut Context> as its argument, any other code that holds the same reference can also schedule jobs. The event loop is not "isolated", so awaiting here would block the whole event loop if the pending async jobs take long enough.
Here I have a repro of what I'm talking about:

#[test]
fn test_async_job_not_blocking_event_loop() {
    let clock = Rc::new(FixedClock::default());
    let context = &mut ContextBuilder::default()
        .clock(clock.clone())
        .build()
        .unwrap();

    run_test_actions_with(
        [TestAction::inspect_context_async(async move |ctx| {
            let executor = ctx.downcast_job_executor::<SimpleJobExecutor>().unwrap();
            let ctx = &RefCell::new(ctx);

            let mut event_loop = pin!(future::poll_once(executor.run_jobs_async(ctx)));

            // There are no jobs in our queue. Push
            // an async job that will consistently yield to the executor.
            ctx.borrow_mut().enqueue_job(
                NativeAsyncJob::new(async |_| {
                    loop {
                        future::yield_now().await;
                    }
                })
                .into(),
            );

            // Then, start the event loop
            assert!(event_loop.as_mut().await.is_none());

            let checker = Rc::new(Cell::new(false));
            {
                let checker = checker.clone();
                // At this point, the event loop should have yielded again to the async executor.
                // Thus, enqueue a generic job that should resolve in the next loop.
                let realm = ctx.borrow().realm().clone();
                ctx.borrow_mut().enqueue_job(
                    GenericJob::new(
                        move |_| {
                            checker.set(true);
                            Ok(JsValue::undefined())
                        },
                        realm,
                    )
                    .into(),
                );
            }

            // Next iteration of the event loop
            assert!(event_loop.as_mut().await.is_none());

            // At this point, our generic job should have been executed.
            assert!(checker.get());
        })],
        context,
    );
}

Since the event loop blocks awaiting the async job, the generic job never gets executed.

github-merge-queue bot pushed a commit that referenced this pull request Mar 6, 2026
After seeing the [avalanche](#4886) [of](#4833) [issues](#4785)
[that](#4749) [our](#4782) executors have, I don't think it's worth
being too smart about when to exit the loop. Folks who wanna have that
behaviour can just implement their own `JobExecutor`.

Thus, this reverts the implementation to the old behaviour of blocking
while timeout and interval jobs are pending. Fortunately,
`run_jobs_async` exists, so we can use it in our CLI to interleave
running jobs with parsing and execution.

And just for good measure, it also adds a `stop` cancellation token to
`SimpleJobExecutor`, in case folks still want to use it but also want
to remotely stop the event loop's execution from another thread.
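
A cancellation token of the kind mentioned here is often just a shared atomic flag. The sketch below shows that general pattern with std only. It does not show the `stop` API of `SimpleJobExecutor`: `StopToken`, `run_until_stopped` and `stop_from_other_thread` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical cancellation token: a cloneable handle to a shared flag.
#[derive(Clone, Default)]
struct StopToken(Arc<AtomicBool>);

impl StopToken {
    /// Requests that the loop observing this token exits.
    fn stop(&self) {
        self.0.store(true, Ordering::Release);
    }

    fn is_stopped(&self) -> bool {
        self.0.load(Ordering::Acquire)
    }
}

/// Stand-in event loop: runs "ticks" until the token is set; returns the tick count.
fn run_until_stopped(token: &StopToken) -> u64 {
    let mut ticks = 0;
    while !token.is_stopped() {
        ticks += 1;
        // Placeholder for running one batch of jobs.
        thread::sleep(Duration::from_millis(1));
    }
    ticks
}

/// Runs the loop on this thread and stops it from a second thread.
fn stop_from_other_thread() -> u64 {
    let token = StopToken::default();
    let remote = token.clone();
    let stopper = thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        remote.stop();
    });
    let ticks = run_until_stopped(&token);
    stopper.join().unwrap();
    ticks
}
```

The loop checks the flag once per iteration, so a stop request takes effect at the next iteration boundary rather than interrupting a job in progress.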
ashnaaseth2325-oss added 7 commits March 7, 2026 19:10
Replace poll_once + yield_now with a conditional await: when no
synchronous jobs are ready, suspend on group.next() until a
NativeAsyncJob future resolves instead of spinning at 100% CPU.

Signed-off-by: ashnaaseth2325-oss <ashnaaseth2325@gmail.com>
@jedel1043
Member

Closing in favour of #4994.

@jedel1043 closed this Mar 13, 2026