[api] Add endpoint that tells you the current system version #10250

Open
david-crespo wants to merge 2 commits into main from system-version-api

@david-crespo david-crespo commented Apr 8, 2026

I am mulling over docs versioning, and while the plan there hasn't been fully ironed out (RFD 458 needs to be finished), an uncontroversial part of it is that an end user should be able to ask the system for its current software version. There are several versions of this problem discussed in #8292 (comment) — the one I am aiming for here is (as @davepacheco put it here): "the release whose Nexus is running right now".

As discussed in RFD 588 ("Nexus handoff during upgrade"), we only ever serve API traffic from one version of Nexus at a time. During an update, we upgrade some sleds to the new version while old Nexus instances keep serving API traffic, and then we cut over to the new instances all at once. All this is to say: I think the releng system version hard-coded into Nexus is a good answer to "the release whose Nexus is running right now". Using a hard-coded value has the advantage that it requires no DB call, and therefore the endpoint can be served without any authentication.
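
To make the hard-coded approach concrete, here is a minimal sketch, not the code in this PR. `RELEASE_VERSION`, `SystemVersion`, and `system_version_view` are hypothetical names, and the version value is a placeholder; the real handler would be registered as an HTTP endpoint in Nexus. The point is only that the response is derived from a compile-time constant, so no database handle or authenticated context appears anywhere in the signature.

```rust
/// Hypothetical stand-in for the release constant that this PR moves into
/// omicron-common. The value is a placeholder, not a real release number.
pub const RELEASE_VERSION: &str = "0.0.0";

/// Hypothetical response body for the version endpoint.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SystemVersion {
    pub version: String,
}

/// Builds the response from the compile-time constant only. Nothing here
/// takes a database connection or an authenticated context, which is why
/// such an endpoint could be served to unauthenticated callers.
pub fn system_version_view() -> SystemVersion {
    SystemVersion { version: RELEASE_VERSION.to_string() }
}
```

Because the value is baked into the binary, every request answered by a given Nexus instance reports the same version, which matches the "one version of Nexus serving API traffic at a time" property described above.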

As I was working on this, I got a little wrapped around the axle thinking about whether/how to also include the git hash like we do for the TUF repo version strings (see code below). While I do think that would be useful to some end users, who might enjoy being able to go to GitHub and see the exact code that's running, this is not related to my immediate problem and we can solve it separately in a followup. In the meantime this endpoint is quite useful as-is.

    let commit = Command::new(&args.git_bin)
        .args(["rev-parse", "HEAD"])
        .ensure_stdout(&logger)
        .await?
        .trim()
        .to_owned();
    let mut version = BASE_VERSION.clone();
    // Differentiate between CI and local builds. We use `0.word` as the
    // prerelease field because it comes before `alpha`.
    version.pre =
        if std::env::var_os("CI").is_some() { "0.ci" } else { "0.local" }
            .parse()?;
    // Set the build metadata to the current commit hash.
    let mut build = String::with_capacity(14);
    build.push_str("git");
    build.extend(commit.chars().take(11));
    version.build = build.parse()?;
    let version_str = version.to_string();
    info!(logger, "version: {}", version_str);
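
For readers without the releng code at hand, the string that snippet produces has the shape `<base>-0.ci+git<first 11 characters of the commit hash>` (or `0.local` outside CI). The following is a std-only sketch of that composition, plus the SemVer 2.0.0 pre-release precedence rule that makes `0.ci` sort before `alpha`: numeric identifiers always rank below alphanumeric ones. `tuf_version_string` and `cmp_prerelease` are hypothetical helpers written for illustration, and they assume well-formed, non-empty pre-release strings; the code above relies on a parsed version type rather than string formatting.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: compose `<base>-0.ci+git<hash prefix>` the way the
/// releng snippet does, using plain strings instead of a parsed version type.
pub fn tuf_version_string(base: &str, is_ci: bool, commit: &str) -> String {
    let pre = if is_ci { "0.ci" } else { "0.local" };
    // "git" followed by the first 11 characters of the commit hash.
    let mut build = String::with_capacity(14);
    build.push_str("git");
    build.extend(commit.chars().take(11));
    format!("{}-{}+{}", base, pre, build)
}

/// Compare two pre-release strings per SemVer 2.0.0, item 11: walk the
/// dot-separated identifiers left to right; numeric identifiers compare
/// numerically and rank below alphanumeric ones; if every shared identifier
/// is equal, the side with more identifiers ranks higher.
/// Assumes both inputs are valid, non-empty pre-release strings.
pub fn cmp_prerelease(a: &str, b: &str) -> Ordering {
    let mut left = a.split('.');
    let mut right = b.split('.');
    loop {
        match (left.next(), right.next()) {
            (None, None) => return Ordering::Equal,
            (None, Some(_)) => return Ordering::Less,
            (Some(_), None) => return Ordering::Greater,
            (Some(x), Some(y)) => {
                let ord = match (x.parse::<u64>(), y.parse::<u64>()) {
                    (Ok(m), Ok(n)) => m.cmp(&n),
                    (Ok(_), Err(_)) => Ordering::Less,
                    (Err(_), Ok(_)) => Ordering::Greater,
                    (Err(_), Err(_)) => x.cmp(y),
                };
                if ord != Ordering::Equal {
                    return ord;
                }
            }
        }
    }
}
```

With these helpers, `cmp_prerelease("0.ci", "alpha")` is `Ordering::Less` because the first identifier `0` is numeric while `alpha` is not, which is the property the comment in the snippet depends on.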

Debatable choices

  • How to get the release version
    • I just moved the constant from releng to omicron-common but maybe that feels bad given the purpose of the constant?
  • /v1/system/version is kind of a weird path for an endpoint that is accessible without auth — all the rest of the /v1/system/ endpoints are meant for operators.
    • Could do /v1/version or /v1/system-version?
    • Could even roll it into /v1/ping, though I would never think to look there, so I don't like this idea.

/// Fetch system version
///
/// Returns the current version of the Oxide software running on this
/// system.

Contributor Author

Probably want to word this more precisely.

Contributor

I think the term "Oxide software" could mean many things. For the purpose of this endpoint do you want the API version specifically?

During an update several components will be on different versions during a period of time. This endpoint may confuse our customers during this period thinking all components are running on said version, when in reality they might not be. Because of this, I would not call it "system version".

Contributor Author

I mostly mean API version. The most literal thing is "API and control plane version" — my understanding is that these are the same thing because when we do handoff from old Nexus to new Nexus, we are switching to the new API and to the new control plane at the same time.

I don't want users to have to see or think about what we normally call the API version — 2026032500.0.0. We use it internally because we have many API versions between releases, but for the end user, API versions correspond 1-1 with release numbers like v19. I want to key the versioned docs off the release number and turn 2026032500.0.0 into more of an implementation detail. This might mean we should add the release version to the OpenAPI schema metadata.

So I agree that "system version" is not a good name for this endpoint at all. "API version" is better. "Control plane version" is pretty direct, but I am worried about the user being able to infer everything they should from that. It's probably a lot easier to infer from "API version" that it doesn't just mean the API shape but also the machinery powering the API. So "API version" is the best option to me so far.

Contributor

+1 on "API version". I don't love "control plane version", because during an update, "the control plane" is on a mix of versions. But because of the way Nexus handoff is structured, there's always exactly one "API version".

Contributor

I agree, "API version" is the most correct term.

@karencfv karencfv left a comment

Thanks for getting the ball rolling! I think we've wanted this endpoint since before I joined the company? 😄

I left a comment, but I think we definitely want @davepacheco to take a look before merging.
