fix(executor): prevent CPU busy-spin when async jobs are pending#4833
fix(executor): prevent CPU busy-spin when async jobs are pending#4833ashnaaseth2325-oss wants to merge 7 commits intoboa-dev:mainfrom
Conversation
Test262 conformance changes
Tested main commit: |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4833 +/- ##
==========================================
+ Coverage 47.24% 57.19% +9.94%
==========================================
Files 476 554 +78
Lines 46892 60576 +13684
==========================================
+ Hits 22154 34645 +12491
- Misses 24738 25931 +1193 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hello @jedel1043, |
core/engine/src/job.rs
Outdated
| if no_sync_work && !group.is_empty() { | ||
| if let Some(Err(err)) = group.next().await { | ||
| self.clear(); | ||
| return Err(err); | ||
| } | ||
| } else if let Some(Err(err)) = future::poll_once(group.next()).await.flatten() { |
There was a problem hiding this comment.
This kinda fails if you have a long enough timeout, right? Something like setTimeout(f, 10000) would make it such that the group.next().await path is never taken.
|
Hello @jedel1043 , |
|
Gonna add this to #4891 and add you as a co-author, since it's simpler to not exit on interval jobs. |
|
Sounds good, thanks! Glad to see the change integrated there. |
|
Wait, this is kinda wrong. Since we take #[test]
fn test_async_job_not_blocking_event_loop() {
let clock = Rc::new(FixedClock::default());
let context = &mut ContextBuilder::default()
.clock(clock.clone())
.build()
.unwrap();
run_test_actions_with(
[TestAction::inspect_context_async(async move |ctx| {
let executor = ctx.downcast_job_executor::<SimpleJobExecutor>().unwrap();
let ctx = &RefCell::new(ctx);
let mut event_loop = pin!(future::poll_once(executor.run_jobs_async(ctx)));
// There are no jobs in our queue. Push
// an async job that will consistently yield to the executor.
ctx.borrow_mut().enqueue_job(
NativeAsyncJob::new(async |_| {
loop {
future::yield_now().await;
}
})
.into(),
);
// Then, start the event loop
assert!(event_loop.as_mut().await.is_none());
let checker = Rc::new(Cell::new(false));
{
let checker = checker.clone();
// At this point, the event loop should have yielded again to the async executor.
// Thus, enqueue a generic job that should resolve in the next loop.
let realm = ctx.borrow().realm().clone();
ctx.borrow_mut().enqueue_job(
GenericJob::new(
move |_| {
checker.set(true);
Ok(JsValue::undefined())
},
realm,
)
.into(),
);
}
// Next iteration of the event loop
assert!(event_loop.as_mut().await.is_none());
// At this point, our generic job should have been executed.
assert!(checker.get());
})],
context,
);
}Since the event loop blocks awaiting the async job, the generic job never gets executed. |
After seeing the [avalanche](#4886) [of](#4833) [issues](#4785) [that](#4749) [our](#4782) executors have, I don't think it's worth being too smart about when to exit the loop. Folks who wanna have that behaviour can just implement their own `JobExecutor`. Thus, this reverts the implementation to the old behaviour of blocking when having pending timeout and interval jobs. Fortunately, `run_jobs_async` exists, so we can use it in our CLI to intersperse executing jobs with executing parsing and execution. And just for good measure, it also adds a `stop` cancellation token to `SimpleJobExecutor`, in case folks still want to use it but that also want to remotely stop the event loop's execution from another thread.
Replace poll_once + yield_now with a conditional await: when no synchronous jobs are ready, suspend on group.next() until a NativeAsyncJob future resolves instead of spinning at 100% CPU. Signed-off-by: ashnaaseth2325-oss <ashnaaseth2325@gmail.com>
Signed-off-by: ashnaaseth2325-oss <ashnaaseth2325@gmail.com>
|
Closing in favour of #4994. |
Summary
This PR fixes a CPU busy-spin in
run_jobs_asyncaffecting both:core/engine/src/job.rscli/src/main.rsPreviously, the executor used:
When any
NativeAsyncJobfuture was pending (e.g.,Atomics.waitAsync), this combination never truly suspended the thread. The loop re-polled continuously, causing sustained 100% CPU usage until the async job resolved.Steps to Reproduce
Running via CLI would peg one CPU core at 100% for the full duration.
Fix
When no synchronous work is available, the executor now properly blocks on:
Instead of busy-polling + yielding.
Result