[api] Add endpoint that tells you the current system version #10250
david-crespo wants to merge 2 commits into main
/// Fetch system version
///
/// Returns the current version of the Oxide software running on this
/// system.
Probably want to word this more precisely.
I think the term "Oxide software" could mean many things. For the purpose of this endpoint do you want the API version specifically?
During an update, several components will be on different versions for a period of time. This endpoint may mislead customers during that period into thinking all components are running the reported version when in reality they might not be. Because of this, I would not call it "system version".
I mostly mean API version. The most literal thing is "API and control plane version" — my understanding is that these are the same thing because when we do handoff from old Nexus to new Nexus, we are switching to the new API and to the new control plane at the same time.
I don't want users to have to see or think about what we normally call the API version — 2026032500.0.0. We use it internally because we have many API versions between releases, but for the end user, API versions correspond 1-1 with release numbers like v19. I want to key the versioned docs off the release number and turn 2026032500.0.0 into more of an implementation detail. This might mean we should add the release version to the OpenAPI schema metadata.
So I agree that "system version" is not a good name for this endpoint at all. "API version" is better. "Control plane version" is pretty direct, but I am worried about the user being able to infer everything they should from that. It's probably a lot easier to infer from "API version" that it doesn't just mean the API shape but also the machinery powering the API. So "API version" is the best option to me so far.
+1 on "API version". I don't love "control plane version", because during an update, "the control plane" is on a mix of versions. But because of the way Nexus handoff is structured, there's always exactly one "API version".
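To make the distinction concrete, here is a toy model of the point above, not the actual handoff implementation. `NexusInstance`, its fields, and `api_release` are hypothetical names invented for this sketch, and release numbers are simplified to integers. Instances may be on mixed releases, yet only one release serves API traffic at a time.

```rust
/// Toy model of one Nexus instance during an update (all names hypothetical).
#[derive(Debug, Clone, Copy)]
struct NexusInstance {
    /// Release this instance was built from, e.g. 19.
    release: u32,
    /// Whether this instance currently serves external API traffic.
    serving_api: bool,
}

/// Returns the single release that is serving API traffic.
///
/// Returns `None` if the one-version-at-a-time invariant does not hold:
/// either no instance is serving, or the serving instances disagree.
fn api_release(instances: &[NexusInstance]) -> Option<u32> {
    let mut serving = instances
        .iter()
        .filter(|n| n.serving_api)
        .map(|n| n.release);
    // The first serving instance defines the candidate answer.
    let first = serving.next()?;
    // Every other serving instance must agree with it.
    if serving.all(|r| r == first) {
        Some(first)
    } else {
        None
    }
}
```

In this model, a deployment with old instances serving and new instances idle reports the old release, and flipping the `serving_api` flags at handoff switches the answer in one step.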
I agree, "API version" is the most correct term
Thanks for getting the ball rolling! I think we've wanted this endpoint since before I joined the company? 😄
I left a comment, but I think we definitely want @davepacheco to take a look before merging.
I am mulling over docs versioning, and while the plan there hasn't been fully ironed out (RFD 458 needs to be finished), an uncontroversial part of it is that an end user should be able to ask the system for its current software version. There are several versions of this problem discussed in #8292 (comment) — the one I am aiming for here is (as @davepacheco put it here): "the release whose Nexus is running right now".
As discussed in RFD 588 Nexus handoff during upgrade, we are only ever serving API traffic on one version of Nexus at a time. During an update, we upgrade some sleds to the new version while old Nexus instances keep serving API traffic, and then we cut over to the new instances all at once. All this is to say: I think the releng system version hard-coded into Nexus is a good answer to "the release whose Nexus is running right now". Using a hard-coded value has the advantage that it requires no DB call and therefore the endpoint can be served without any authentication.
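A minimal sketch of why the hard-coded value makes this cheap, using hypothetical names (`RELEASE_VERSION`, `SystemVersion`, `system_version`) rather than the ones in this PR, and an illustrative value of "19.0.0". Because the version is compiled into the binary, the function that builds the response takes no database handle and no authenticated context.

```rust
/// Hypothetical stand-in for the release version compiled into Nexus.
/// The value "19.0.0" is illustrative only.
const RELEASE_VERSION: &str = "19.0.0";

/// Hypothetical response body for the version endpoint.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SystemVersion {
    version: String,
}

/// Builds the response from the compiled-in constant alone.
/// Note the signature: no database connection and no auth context.
fn system_version() -> SystemVersion {
    SystemVersion {
        version: RELEASE_VERSION.to_string(),
    }
}
```

A real handler would wrap this in whatever the HTTP framework requires; the point of the sketch is only that nothing in the function body can fail or block on I/O.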
As I was working on this, I got a little wrapped around the axle thinking about whether/how to also include the git hash like we do for the TUF repo version strings (see code below). While I do think that would be useful to some end users, who might enjoy being able to go to GitHub and see the exact code that's running, this is not related to my immediate problem and we can solve it separately in a followup. In the meantime this endpoint is quite useful as-is.
omicron/dev-tools/releng/src/main.rs
Lines 245 to 264 in b2b1e39
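For the possible followup, one option would be to carry the commit as semver build metadata (the part after `+`). The sketch below is an assumption about what that could look like, not the format releng produces: `version_with_commit` and the `git` prefix are hypothetical, and "19.0.0" is an illustrative release version.

```rust
/// Appends a git commit hash to a release version as semver build metadata.
///
/// The `git` prefix and overall layout are assumptions for illustration.
/// Returns `None` if `commit` is empty or is not a hexadecimal string.
fn version_with_commit(release: &str, commit: &str) -> Option<String> {
    // Git hashes are hexadecimal, and hex digits are also valid characters
    // in a semver build identifier, so one check covers both requirements.
    let valid = !commit.is_empty() && commit.chars().all(|c| c.is_ascii_hexdigit());
    if !valid {
        return None;
    }
    Some(format!("{}+git{}", release, commit))
}
```

Build metadata is ignored when semver precedence is compared, so a client that sorts or compares versions would treat two builds of the same release as equal.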
Debatable choices
- `releng` to `omicron-common` but maybe that feels bad given the purpose of the constant?
- `/v1/system/version` is kind of a weird path for an endpoint that is accessible without auth; all the rest of the `/v1/system/` endpoints are meant for operators.
  - `/v1/version` or `/v1/system-version`?
  - `/v1/ping`, though I would never think to look there, so I don't like this idea