Skip to content

dns update during blueprint execution uses wrong version number for precondition#10252

Open
davepacheco wants to merge 2 commits intomainfrom
dap/dns-execution-fix
Open

dns update during blueprint execution uses wrong version number for precondition#10252
davepacheco wants to merge 2 commits intomainfrom
dap/dns-execution-fix

Conversation

@davepacheco
Copy link
Copy Markdown
Collaborator

Fixes #10251.

@davepacheco davepacheco self-assigned this Apr 9, 2026
info!(
log,
"attempting to update from generation {} to generation {}",
dns_config_current.generation,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the from log here be blueprint_generation?

// the blueprint was generated. But we have to check later when we try to
// update the database anyway. And we're not wasting much effort allowing
// this proceed for now. This way, we have only one code path for this and
// we know it's being hit when we exercise this condition.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment still accurate? Specifically the "we're not wasting much effort allowing this to proceed" bit - it looks like we could know at this point that we're already guaranteed to fail (by comparing the blueprint DNS version against what we just loaded from the DB), right? By not checking this, we have to do what looks like a nontrivial amount of work including a bunch of db queries (list all silos, list all dns zones, then try an insert that we know is doomed)?

.blueprint_target_get_current_full(&opctx)
.await
.expect("failed to read current target blueprint");
// Override the CockroachDB settings so that we don't try to set them.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really here or there, but - why do we need to do this?

// above.
let ip_pool_range_rows =
fetch_all_service_ip_pool_ranges(&datastore, &opctx).await;
let external_ip_policy = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're testing internal DNS, I think we could trim the boilerplate quite a bit here by picking a zone type that doesn't need as much configuration (e.g., add another crucible pantry or oximeter instead of Nexus).

Copy link
Copy Markdown
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the things john said here, but also, the one line that actually changes the generation supplied LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reconfigurator DNS execution may overwrite fresh data with stale data

3 participants