
Avoid leaking sitrep rows during deletion #10143

Merged
smklein merged 14 commits into main from sitrep-orphan-killing-machine on Apr 2, 2026

Conversation

Collaborator

@smklein smklein commented Mar 24, 2026

Fixes #10131

In an attempt to avoid using transactions during the INSERT pathway (see: #10133)
this PR instead tries to alter how DELETE works for sitreps.

Rather than relying only on a two-step GC process (listing orphaned sitreps,
then deleting them, even while concurrent INSERT operations could be underway),
this PR also enables the deletion of "rows within a sitrep, for which the sitrep
metadata record no longer exists".

This allows us to delete rows that were previously leaked, and is also now the
primary mechanism by which rows are removed: we delete the high-level
sitrep first, then remove all these newly-orphaned rows. This is still performed
within a transaction on the DELETE pathway to avoid reading torn sitreps
during the loading phase (as originally appeared in #9594).
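
As a rough illustration of the ordering described above (metadata first, then the newly orphaned rows), here is a minimal in-memory sketch. It is not the PR's implementation: `ToyDb`, `gc_orphans`, and the `fm_example` table name are hypothetical, and the orphan test is simplified to "not in history", without the parent check the real query applies.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy stand-in for the sitrep tables (hypothetical, not the real schema).
#[derive(Default)]
struct ToyDb {
    /// IDs present in the `fm_sitrep` metadata table.
    sitreps: BTreeSet<u32>,
    /// IDs recorded in `fm_sitrep_history`.
    history: BTreeSet<u32>,
    /// Child table name -> rows, each tagged with its sitrep ID.
    children: BTreeMap<&'static str, Vec<(u32, &'static str)>>,
}

/// Returns (metadata rows deleted, child rows deleted).
fn gc_orphans(db: &mut ToyDb) -> (usize, usize) {
    // Step 1: delete metadata rows for sitreps that are not in history.
    let before = db.sitreps.len();
    let history = &db.history;
    db.sitreps.retain(|id| history.contains(id));
    let sitreps_deleted = before - db.sitreps.len();

    // Step 2: delete child rows whose metadata row no longer exists. This
    // covers rows orphaned by step 1 and rows leaked by an earlier insert.
    let sitreps = &db.sitreps;
    let mut rows_deleted = 0;
    for rows in db.children.values_mut() {
        let before = rows.len();
        rows.retain(|(sitrep_id, _)| sitreps.contains(sitrep_id));
        rows_deleted += before - rows.len();
    }
    (sitreps_deleted, rows_deleted)
}
```

In the PR as described, both steps run inside one transaction; the sketch only shows the order of the deletes.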

.transaction(&conn, |conn| {
    async move {
        // Step 1: Delete orphaned fm_sitrep metadata rows.
        let mut sitreps_deleted = 0usize;
Collaborator Author

So, the structure here is basically:

  1. Delete sitrep metadata rows (make orphaned rows)
  2. Clean up orphaned rows (which maybe we just made, or maybe got created by an errant INSERT operation that bailed out)

We could make this transaction smaller by "limiting how many sitrep metadata rows we delete at once".

For example, we could:

  • "Only create up to 64 orphans in a txn"
  • ... then clean up all their data, within this transaction (to avoid torn reads)
  • ... then, COMMIT the transaction, and do another batch, until we stop cleaning stuff up / hit some arbitrary limit / etc
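
The batching idea above could look roughly like the following sketch. It is an assumption-laden toy, not code from the PR: `gc_in_batches` and `MAX_ORPHANS_PER_TXN` are hypothetical names, the tables are in-memory sets, and each loop iteration stands in for one committed transaction.

```rust
use std::collections::BTreeSet;

/// Hypothetical per-transaction limit, taken from the "64 orphans" example.
const MAX_ORPHANS_PER_TXN: usize = 64;

/// Deletes out-of-history sitrep IDs in bounded batches and returns how many
/// batches were needed.
fn gc_in_batches(sitreps: &mut BTreeSet<u32>, history: &BTreeSet<u32>) -> usize {
    let mut batches = 0;
    loop {
        // "BEGIN": pick at most MAX_ORPHANS_PER_TXN orphaned metadata rows.
        let batch: Vec<u32> = sitreps
            .iter()
            .copied()
            .filter(|id| !history.contains(id))
            .take(MAX_ORPHANS_PER_TXN)
            .collect();
        if batch.is_empty() {
            return batches;
        }
        // Delete the metadata rows. The real code would also clean up the
        // child rows for this batch here, before committing.
        for id in &batch {
            sitreps.remove(id);
        }
        // "COMMIT", then go around again for the next batch.
        batches += 1;
    }
}
```

With 140 orphans this takes three batches (64, 64, 12), so no single transaction touches more than 64 sitreps.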

Member

This is maybe nitpicky, but this comment was initially a little bit confusing to me, because we are kind of using "orphaned" to refer to two different things: both sitreps that are orphaned due to not being in fm_sitrep_history, and sitrep child rows that are orphaned due to the sitrep metadata row being deleted. Also, later on you refer to "creating orphans", but that actually means deleting a metadata row, rather than actually creating new records. Anyway I just wanted to write that down for clarity.

Member

And, addressing the actual idea you proposed here, I agree that paginating the whole transaction might be a good idea to limit the duration of the transaction...

Collaborator Author

I can try to update the comment here to make the orphan terminology more clear. I was kinda using "orphaned" to refer to the top-level sitrep, and "deeply orphaned" to refer to child tables of sitreps which no longer exist.

Adding a TODO for the pagination because it will have subtleties that maybe should be reviewed independently of this PR.

Member

hawkw commented Mar 24, 2026

> In an attempt to avoid using transactions during the INSERT pathway (see: #10133) this PR instead tries to alter how DELETE works for sitreps.

Thank you for your support in this matter :D

Member

@hawkw hawkw left a comment

this code is kinda wild, i'm impressed --- thank you for humoring me

) -> Result<GcOrphansResult, Error> {
    let conn = self.pool_connection_authorized(opctx).await?;

    // TODO(eliza): there should probably be an authz object for the fm sitrep?
Member

lol

Collaborator Author

🙃

@hawkw hawkw added the fault-management Everything related to the fault-management initiative (RFD480 and others) label Mar 25, 2026
@smklein smklein marked this pull request as ready for review March 25, 2026 23:38
@hawkw hawkw added this to the 20 milestone Mar 26, 2026
Member

@hawkw hawkw left a comment

This is great, thank you for adding the macro and stuff! I had a bunch of small notes on the comments etc, but it looks great overall!

// Ensure columns stay aligned even if a child table name is long.
let width = child_tables
    .keys()
    .map(|name| "orphaned rows deleted:".len() + name.len() + 1)
Member

turbo nit: maybe worth a comment explaining why there's two spaces in orphaned rows deleted? i very briefly thought it was a typo before i realized what this was doing...

Collaborator Author

Done
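
For readers unfamiliar with the alignment trick in the snippet under review, here is a small standalone sketch of the same idea: compute the widest label once, then pad every label to that width so the numbers share a column. `render`, the label text, and the value of `NUM_WIDTH` are assumptions for illustration, not the omdb code.

```rust
use std::collections::BTreeMap;

/// Width of the right-aligned number column (value assumed for the sketch).
const NUM_WIDTH: usize = 8;

/// Renders one line per child table with labels padded to a common width.
fn render(child_tables: &BTreeMap<&str, u64>) -> Vec<String> {
    // The widest label decides the padding for every line.
    let width = child_tables
        .keys()
        .map(|name| name.len() + " rows deleted:".len())
        .max()
        .unwrap_or(0);
    child_tables
        .iter()
        .map(|(name, rows)| {
            let label = format!("{name} rows deleted:");
            // `width$` and `NUM_WIDTH$` read the width from a named value.
            format!("{label:<width$}{rows:>NUM_WIDTH$}")
        })
        .collect()
}
```

Every returned line has the same length, whatever the table names are.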

stats.rows_deleted,
);
println!(" {:<width$}{:>NUM_WIDTH$}", " batches:", stats.batches,);
}
Member

not important, but it occurs to me that maybe if child_tables is empty, we should complain in some way? that would indicate a bug...

Collaborator Author

Added a warning

Comment on lines +57 to +59
/// metadata. To add a new child table to the sitrep GC, just add a line
/// here — the GC loop, omdb display, and completeness test all adapt
/// automatically.
Member

nitpick: technically, you don't add a line here, you do that in the place where we invoke the macro...

Member

also, maybe it's worth briefly describing the syntax so that the person adding a new table is aware of the default column name for the sitrep ID and how it can be overridden (though in practice I think i would complain in a review if someone added a new sitrep child table where it was named anything other than that...)

Collaborator Author

updated the docs, added an example of the syntax
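
The macro itself is not shown in this conversation, so the following is only a guess at the general shape being discussed: one line per child table at the invocation site, with a default sitrep ID column that can be overridden. The macro name, the `id_column` syntax, the variants, and the table and column names are all hypothetical.

```rust
/// Hypothetical sketch: declares an enum of child tables. Each entry gives a
/// table name and, optionally, an override for the sitrep ID column.
macro_rules! sitrep_child_tables {
    ($( $variant:ident => $table:literal $( , id_column = $col:literal )? ; )*) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        enum SitrepChildTable {
            $( $variant ),*
        }

        impl SitrepChildTable {
            /// Every declared table, so loops need no hand-kept list.
            const ALL: &'static [SitrepChildTable] =
                &[ $( SitrepChildTable::$variant ),* ];

            fn table_name(self) -> &'static str {
                match self {
                    $( SitrepChildTable::$variant => $table ),*
                }
            }

            /// The override if one was given, else the assumed default.
            fn sitrep_id_column(self) -> &'static str {
                match self {
                    $( SitrepChildTable::$variant => [ $( $col, )? "sitrep_id" ][0] ),*
                }
            }
        }
    };
}

// Adding a table means adding one line here, at the invocation site.
sitrep_child_tables! {
    ExampleA => "fm_example_a";
    ExampleB => "fm_example_b", id_column = "other_sitrep_id";
}
```

A GC loop or a completeness test can then iterate over `SitrepChildTable::ALL` instead of repeating the table list.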

/// Builds a DELETE query that removes orphaned `fm_sitrep` metadata
/// rows in a single paginated batch (sitreps not in history whose
/// parent is not current). Returns (rows_deleted, next_marker).
Member

turbo nit: perhaps this should have a line saying which test output file contains the text of the query, like you added in #10061?

Collaborator Author

Updated

/// 1. Finds a batch of distinct `sitrep_id` values in the child table
/// 2. Deletes rows whose `sitrep_id` has no corresponding `fm_sitrep`
/// 3. Returns (rows_deleted, next_marker) where next_marker is NULL
///    when there are no more pages
Member

nit: similarly, perhaps a "the SQL generated by this query is in ..." comment?

Collaborator Author

Done
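
The three steps in the doc comment above can be modeled in memory as follows. This is a sketch of the marker-based paging contract only: the real code builds a SQL query, whereas `delete_orphans_page` is a hypothetical function over a plain `Vec`, with `None` standing in for a NULL marker.

```rust
use std::collections::BTreeSet;

/// One page of child-table GC. `rows` holds the `sitrep_id` of each child
/// row. Returns (rows_deleted, next_marker).
fn delete_orphans_page(
    rows: &mut Vec<u32>,
    live_sitreps: &BTreeSet<u32>,
    marker: Option<u32>,
    batch_size: usize,
) -> (usize, Option<u32>) {
    // 1. Find a batch of distinct sitrep IDs after the marker, in order.
    let distinct: BTreeSet<u32> = rows
        .iter()
        .copied()
        .filter(|id| marker.map_or(true, |m| *id > m))
        .collect();
    let batch: Vec<u32> = distinct.into_iter().take(batch_size).collect();

    // 2. Delete rows in this batch whose sitrep has no metadata row.
    let before = rows.len();
    rows.retain(|id| !(batch.contains(id) && !live_sitreps.contains(id)));
    let rows_deleted = before - rows.len();

    // 3. A full batch means there may be more pages; a short one means done.
    let next_marker = if batch.len() == batch_size {
        batch.last().copied()
    } else {
        None
    };
    (rows_deleted, next_marker)
}
```

A caller keeps passing the returned marker back in until it gets `None`.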

Comment on lines +50 to +55
Ok(result) => {
status.orphaned_sitreps_deleted = result.sitreps_deleted;
status.sitrep_metadata_batches = result.sitrep_metadata_batches;
status.batch_size = result.batch_size;
status.child_tables = result
.child_tables
Member

this is a nitpick, but I'd kind of like it if we exhaustively destructured result here, like:

Suggested change
- Ok(result) => {
-     status.orphaned_sitreps_deleted = result.sitreps_deleted;
-     status.sitrep_metadata_batches = result.sitrep_metadata_batches;
-     status.batch_size = result.batch_size;
-     status.child_tables = result
-         .child_tables
+ Ok(GcOrphansResult { sitreps_deleted, sitrep_metadata_batches, batch_size, child_tables }) => {
+     status.orphaned_sitreps_deleted = sitreps_deleted;
+     status.sitrep_metadata_batches = sitrep_metadata_batches;
+     status.batch_size = batch_size;
+     status.child_tables = child_tables

so that we get a compiler error if someone adds additional fields to GcOrphansResult that we're not handling here?

Collaborator Author

Good catch, done
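
A standalone sketch of why the exhaustive pattern helps: a struct pattern without `..` must name every field, so adding a field later becomes a compile error at this site. The field types and the `Status` struct below are simplified assumptions, not the real definitions.

```rust
/// Simplified stand-in for the real result type; field types are assumed.
struct GcOrphansResult {
    sitreps_deleted: usize,
    sitrep_metadata_batches: usize,
    batch_size: u32,
    child_tables: Vec<(String, usize)>,
}

/// Simplified stand-in for the background task status.
#[derive(Default)]
struct Status {
    orphaned_sitreps_deleted: usize,
    sitrep_metadata_batches: usize,
    batch_size: u32,
    child_tables: Vec<(String, usize)>,
}

fn record(status: &mut Status, result: GcOrphansResult) {
    // No `..` in this pattern: if `GcOrphansResult` gains a field, this line
    // stops compiling until the new field is handled here.
    let GcOrphansResult {
        sitreps_deleted,
        sitrep_metadata_batches,
        batch_size,
        child_tables,
    } = result;
    status.orphaned_sitreps_deleted = sitreps_deleted;
    status.sitrep_metadata_batches = sitrep_metadata_batches;
    status.batch_size = batch_size;
    status.child_tables = child_tables;
}
```

With field access (`result.sitreps_deleted` and so on), a newly added field would be silently ignored instead.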

Comment on lines +2603 to +2607
// ---------------------------------------------------------------
// SitrepChildTableCounts: mirrors BlueprintTableCounts from
// deployment.rs to ensure every fm_* child table is covered by
// `SitrepChildTable`.
// ---------------------------------------------------------------
Member

nit, take it or leave it: i would like it if this comment described what this thing actually does, in addition to saying "it's like that thing in this other file, and it's used for this purpose" --- it counts the number of rows in each sitrep child table, and the list of sitrep child tables is determined using the SitrepChildTable enum above.

Collaborator Author

Updated the comment

Comment on lines +2723 to +2727
"found fm_* child table(s) not covered by \
SitrepChildTable: {}\n\n\
If you added a new fm_* child table, add a variant \
to SitrepChildTable and update the orphan GC code \
in fm_sitrep_gc_orphans.",
Member

turbo nitpick: i might add

Suggested change
- "found fm_* child table(s) not covered by \
- SitrepChildTable: {}\n\n\
- If you added a new fm_* child table, add a variant \
- to SitrepChildTable and update the orphan GC code \
- in fm_sitrep_gc_orphans.",
+ "found fm_* child table(s) not covered by \
+ `SitrepChildTable`: {}\n\n\
+ If you added a new fm_* child table, add a variant \
+ to `SitrepChildTable` and update the orphan GC code \
+ in `fm_sitrep_gc_orphans`.",

Member

also, maybe we should say something about what to do if there's a false positive (i.e. if you add a new fm_foo table that should not be covered by orphan GC) --- something like "consider either dropping the fm_ prefix or adding your table name to tables_ignored"?

Collaborator Author

Added back-ticks + advice
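
The completeness check being discussed can be sketched as a pure function: take every table name, keep the `fm_`-prefixed ones, and report any that are neither covered nor explicitly ignored. `uncovered_tables` and the table names in the usage note are hypothetical; `tables_ignored` echoes the name mentioned in the review, but how the real test gathers table names is not shown here.

```rust
/// Returns the `fm_*` tables that are neither covered by the GC nor listed
/// in `tables_ignored`. A non-empty result should fail the test.
fn uncovered_tables<'a>(
    all_tables: &[&'a str],
    covered: &[&str],
    tables_ignored: &[&str],
) -> Vec<&'a str> {
    all_tables
        .iter()
        .copied()
        // Only tables with the fault-management prefix are candidates.
        .filter(|t| t.starts_with("fm_"))
        // Drop anything the GC handles or that is deliberately excluded.
        .filter(|t| !covered.iter().any(|c| c == t))
        .filter(|t| !tables_ignored.iter().any(|i| i == t))
        .collect()
}
```

For example, with tables `fm_sitrep`, `fm_example_a`, `fm_example_b`, and `sled`, covering `fm_example_a` and ignoring `fm_sitrep` leaves `fm_example_b` as the only table reported.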

Collaborator Author

@smklein smklein left a comment

Addressed everything but the pagination - agree it's worth doing, but given that it miiiight increase some complexity I'm punting a bit (leaving as a TODO)

Member

@hawkw hawkw left a comment

looking great, thank you!!

@smklein smklein merged commit cfd87ec into main Apr 2, 2026
16 checks passed
@smklein smklein deleted the sitrep-orphan-killing-machine branch April 2, 2026 15:59
Member

hawkw commented Apr 2, 2026

yay!

smklein added a commit that referenced this pull request Apr 14, 2026
Follow-up to #10143

Adds pagination during sitrep garbage collection.

While I was there, I realized that we actually don't need transactions
on the delete pathway anymore.

- The delete operation (within the garbage collection pass) deletes
  sitreps by deleting out-of-history `fm_sitrep` rows, which immediately
  orphans all other sub-tables within the sitrep.
- All the "concurrent reads" of sitreps go through
  `fm_sitrep_read_on_conn`, which reads metadata from `fm_sitrep`
  **last**. If this sitrep can be read: it has not been deleted yet. If
  this sitrep cannot be read: it has been deleted, and prior reads can be
  discarded.
- All the "concurrent inserts" of sitreps are contingent on the parent
  sitrep not being stale. So: because sitrep insert writes to the
  `fm_sitrep` table first, the child rows are "not orphaned" (won't be
  GC-ed). This protection lasts for the duration of `fm_sitrep_insert`, OR
  until the parent sitrep is marked stale - at which point insert should
  fail anyway.
All this is to say: the "read" and "insert" pathways function fine if a
`fm_sitrep` row is deleted non-atomically before subsequent child rows.
Therefore: no transaction is necessary here.
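
The "read metadata last" argument in the commit message above can be illustrated with a small in-memory sketch. `ReadDb` and `read_sitrep` are hypothetical stand-ins, not `fm_sitrep_read_on_conn`; the point is only the ordering of the reads.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for the database (hypothetical, not the real schema).
#[derive(Default)]
struct ReadDb {
    /// `fm_sitrep` metadata, keyed by sitrep ID.
    sitreps: BTreeMap<u32, &'static str>,
    /// A single child table: (sitrep_id, row payload).
    child_rows: Vec<(u32, &'static str)>,
}

/// Reads child rows first and the metadata row last. If the metadata row is
/// gone, the sitrep was deleted and the earlier reads are discarded.
fn read_sitrep(db: &ReadDb, id: u32) -> Option<(&'static str, Vec<&'static str>)> {
    let children: Vec<&'static str> = db
        .child_rows
        .iter()
        .filter(|(sitrep_id, _)| *sitrep_id == id)
        .map(|(_, row)| *row)
        .collect();
    // The metadata read comes last: `?` drops `children` if it is missing.
    let metadata = db.sitreps.get(&id).copied()?;
    Some((metadata, children))
}
```

Because GC deletes the metadata row before any child rows, a read that still finds the metadata at the end saw a complete set of child rows; a read that does not find it returns nothing, even if some child rows are still waiting to be swept.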

Development

Successfully merging this pull request may close these issues.

sitrep concurrent insert/delete can leak rows

2 participants