Introduce a CancellationToken for cancelling specific computations
#1007
✅ Deploy Preview for salsa-rs canceled.
CodSpeed Performance Report: merging #1007 will degrade performance by 8.36%.
Looks like allocator noise from the new `Arc` allocation.
Interesting. Is the idea to cancel a single thread rather than all threads? Won't this cancel all threads that are currently blocked on any query executing on the thread being cancelled?
Yes, the thought is that this could allow implementing client side request cancellation for LSP requests for example.
That is the thing I will have to investigate. We unwind with the payload, we don't panic, so I believe this does not actually set the panicking state of the thread. So this might just work (though I doubt it).
Oh, so other threads would reclaim and re-execute the queries then. Interesting.
That would be the ideal I think |
I was wrong, unwinding does set the panicking flag after all (which probably makes more sense ...) hmm |
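The observation above can be checked with the standard library alone. The following is a minimal sketch, not Salsa code: `Payload` and `Probe` are hypothetical names. It unwinds with `std::panic::resume_unwind` (which skips the panic hook) and records what `std::thread::panicking()` reports while a destructor runs during that unwind. This flag is what, for example, `Mutex` poisoning keys off.

```rust
use std::panic::{catch_unwind, resume_unwind};
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Hypothetical payload type standing in for a cancellation marker.
struct Payload;

// Written by `Probe::drop` with the value of `thread::panicking()`.
static SAW_PANICKING: AtomicBool = AtomicBool::new(false);

struct Probe;

impl Drop for Probe {
    fn drop(&mut self) {
        SAW_PANICKING.store(thread::panicking(), Ordering::SeqCst);
    }
}

/// Unwinds with a custom payload and returns whether the destructor
/// observed the thread's panicking flag as set.
fn unwind_sets_panicking_flag() -> bool {
    let result = catch_unwind(|| {
        let _probe = Probe;
        // Starts unwinding without invoking the panic hook.
        resume_unwind(Box::new(Payload));
    });
    // The payload arrives intact at the catch site.
    let payload = result.unwrap_err();
    assert!(payload.is::<Payload>());
    SAW_PANICKING.load(Ordering::SeqCst)
}

fn main() {
    println!("panicking during unwind: {}", unwind_sets_panicking_flag());
}
```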
Would you mind explaining the approach in the PR summary? I'm mainly interested in how it works if other threads are blocked on a query that gets cancelled (including if they participate in a cycle).
    - name: Test with Miri
      run: cargo miri nextest run --no-fail-fast --tests
      env:
        MIRIFLAGS: -Zmiri-disable-isolation -Zmiri-retag-fields
This is a default now
Pull request overview
This PR introduces a CancellationToken that allows cancelling specific database computations instead of the entire runtime. The token is stored on ZalsaLocal and enables fine-grained cancellation control, with special handling during cycle fixpoint iteration where local cancellation is disabled to ensure cycles complete successfully.
Key changes:
- Added `CancellationToken` type with atomic state management for local cancellation
- Differentiated `Cancelled::Local` from `Cancelled::PendingWrite` to distinguish cancellation types
- Modified blocking behavior so threads blocked on locally-cancelled threads recompute instead of propagating cancellation
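To make the first bullet concrete, here is a minimal sketch of a cloneable token backed by shared atomic state. It is an illustration only: the bit layout and the method names `should_unwind` and `set_disabled` are assumptions, not the PR's actual API (only `uncancel` and the token being `Clone` appear in the discussion). The "disabled" bit models the overview's note that local cancellation is turned off during cycle fixpoint iteration.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Arc;

// Hypothetical bit layout; the PR's actual representation may differ.
const CANCELLED: u8 = 0b01;
const DISABLED: u8 = 0b10;

/// A cloneable handle; all clones share one atomic state.
#[derive(Clone, Default)]
pub struct CancellationToken {
    state: Arc<AtomicU8>,
}

impl CancellationToken {
    /// Requests cancellation; may be called from any thread.
    pub fn cancel(&self) {
        self.state.fetch_or(CANCELLED, Ordering::Release);
    }

    /// True only if cancellation was requested and is not currently disabled.
    pub fn should_unwind(&self) -> bool {
        self.state.load(Ordering::Acquire) == CANCELLED
    }

    /// Toggles the "disabled" bit, e.g. around a fixpoint iteration.
    pub fn set_disabled(&self, disabled: bool) {
        if disabled {
            self.state.fetch_or(DISABLED, Ordering::Release);
        } else {
            self.state.fetch_and(!DISABLED, Ordering::Release);
        }
    }

    /// Clears a pending request, e.g. when the computation's scope ends.
    pub fn uncancel(&self) {
        self.state.fetch_and(!CANCELLED, Ordering::Release);
    }
}

fn main() {
    let token = CancellationToken::default();
    let remote = token.clone(); // handed to whoever may cancel the request
    remote.cancel();
    assert!(token.should_unwind());
}
```

In this sketch a request that arrives while the token is disabled is not lost: it becomes visible again once the disabled bit is cleared.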
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/zalsa_local.rs` | Implements `CancellationToken` with atomic state management and integrates it into `ZalsaLocal` |
| `src/zalsa.rs` | Updates cancellation checks to handle both local and pending-write cancellation |
| `src/database.rs` | Adds public `cancellation_token()` API method |
| `src/cancelled.rs` | Adds `Cancelled::Local` variant for local cancellation |
| `src/attach.rs` | Resets cancellation state when exiting database TLS |
| `src/storage.rs` | Ensures proper cleanup of `zalsa_local` when converting to handle |
| `src/runtime.rs` | Adds `WaitResult::Cancelled`, renames `revision_canceled` to `revision_cancelled` |
| `src/function/sync.rs` | Reorders `ClaimResult` variants, adds `zalsa_local` parameter, handles cancelled release |
| `src/function/fetch.rs` | Returns `None` on local cancellation to trigger recomputation |
| `src/function/memo.rs` | Converts local cancellation to panic in fixpoint, fixes comment typo |
| `src/function/execute.rs` | Disables local cancellation during fixpoint iteration |
| `src/function/maybe_changed_after.rs` | Updates blocking to discard return value |
| `src/interned.rs` | Adds `#[allow(dead_code)]` for feature-gated field |
| `tests/cancellation_token.rs` | Basic cancellation token test |
| `tests/parallel/cancellation_token_*.rs` | Comprehensive parallel cancellation tests |
| `tests/dataflow.rs` | Refactors `Box<[usize]>` to `BTreeSet<usize>` |
| `tests/interned-revisions.rs` | Adjusts iteration counts for miri/non-miri |
| `.github/workflows/test.yml` | Removes `-Zmiri-retag-fields` flag |
| `tests/parallel/main.rs` | Adds new test module declarations |
MichaReiser
left a comment
It probably makes sense to have this in Salsa but is an implementation outside Salsa considerably more work?
src/function/memo.rs
Outdated
    // We cannot handle local cancellations in fixpoints
    // so we treat it as a general cancellation / panic.
    //
    // We shouldn't hit this though as we disable local cancellation
    // in cycles.
block_on_heads isn't used by fixpoint anymore. It's only used by the FallbackImmediate cycle recovery strategy
Is FallbackImmediate even necessary anymore? iirc you've removed the requirement of fixpoint having to converge right?
Whether it's still necessary depends on r-a. We aren't using it in ty
> iirc you've removed the requirement of fixpoint having to converge right?
No, fixpoint queries still have to converge. I only changed how we run nested queries.
Ah, I mixed it up with #991.
    if let Some(attached) = self.state {
        if let Some(prev) = attached.database.replace(self.prev) {
            // SAFETY: `prev` is a valid pointer to a database.
            unsafe { prev.as_ref().zalsa_local().uncancel() };
What if `prev` was cancelled too? That is possible because `CancellationToken` is `Clone`.
Should we instead simply consider that db as "dead"? Meaning, you have to call `Clone` to get a new db afterwards. (What you do is probably the opposite: you clone the db before passing it to a request handler, and the db is later dropped.)
If prev was cancelled too then the query that initially attached it will reset the state once it exits its scope. We always put the databases back that we replaced in attached when the scope ends due to this.
Okay, I think I got confused by the variable being named prev. I assumed it's DbGuard.prev when it is the database stored in attached.database (which isn't prev but the current db up to when we started dropping).
I think there's still a risk that we uncancel if we have something like a -> b -> a and a gets cancelled. I understand that we call a.uncancel() when we drop the guard for a -> b -> a (the last db).
Should we make this a method on `attached` instead, saying that it resets the database state (e.g. `attached.exit`)? This also avoids duplicating the same logic.
Right, I believe this should be fine though. This could basically cause cancellations to be .. well cancelled. But only if you interleave a bunch of different databases (which ideally we wouldn't even allow imo, but is kind of a necessary evil given the lack of speculative execution).
I am not sure how we can make this work as an implementation outside of salsa, given it needs to special case around fixpoint executions. And to me this feels very much like something salsa should offer natively anyways.
Curious to see how you use this in r-a.
Basically like rust-lang/rust-analyzer#21380. Right now rust-analyzer will run LSP requests to completion even if they were cancelled (unless a user types/changes an input of course) |
Introduces a `CancellationToken` that can be used to cancel a specific database computation, as opposed to cancelling the whole runtime. This is tracked by storing a second cancellation state on `ZalsaLocal` itself, separate from the runtime cancel flag.

When a thread gets cancelled, it unwinds as usual but with a different payload than pending-write cancellation. Threads blocked on such a cancelled thread do not propagate the cancellation; instead they run the computation they are blocked on themselves.

The cancellation state of the database gets reset when we exit the database TLS.
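The payload distinction described above can be sketched with the standard library only. This is a simplified model under stated assumptions, not Salsa's implementation: the `Cancelled` enum here is a local stand-in with the two variant names from the PR, `run_query` and `wait_then_fetch` are hypothetical helpers, and `JoinHandle::join` stands in for Salsa's internal blocking on a query claimed by another thread.

```rust
use std::panic::resume_unwind;
use std::thread;

// Local stand-in for the two cancellation kinds named in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Cancelled {
    Local,
    PendingWrite,
}

/// Hypothetical query: unwinds with a `Cancelled` payload if asked to,
/// otherwise returns its value.
fn run_query(cancel_with: Option<Cancelled>) -> u32 {
    if let Some(reason) = cancel_with {
        resume_unwind(Box::new(reason));
    }
    42
}

/// Models a thread blocked on a query that another thread is executing.
fn wait_then_fetch(owner_fate: Option<Cancelled>) -> Result<u32, Cancelled> {
    let owner = thread::spawn(move || run_query(owner_fate));
    match owner.join() {
        Ok(value) => Ok(value),
        Err(payload) => match payload.downcast::<Cancelled>() {
            // The owner was cancelled locally: run the computation ourselves.
            Ok(reason) if *reason == Cancelled::Local => Ok(run_query(None)),
            // Any other cancellation still propagates to the waiter.
            Ok(reason) => Err(*reason),
            // Unknown payloads are genuine panics; keep unwinding.
            Err(other) => resume_unwind(other),
        },
    }
}

fn main() {
    println!("{:?}", wait_then_fetch(Some(Cancelled::Local)));
    println!("{:?}", wait_then_fetch(Some(Cancelled::PendingWrite)));
}
```

The design point this illustrates is that a local cancellation is a property of one request, so it should not poison the result for other threads that still want it.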