Skip to content

[inventory] Add svcs enabled not online to DB#10195

Open
karencfv wants to merge 17 commits intooxidecomputer:mainfrom
karencfv:add-svcs-to-db
Open

[inventory] Add svcs enabled not online to DB#10195
karencfv wants to merge 17 commits intooxidecomputer:mainfrom
karencfv:add-svcs-to-db

Conversation

@karencfv
Copy link
Copy Markdown
Contributor

@karencfv karencfv commented Mar 31, 2026

Follow up to #9977
Closes: #9516

Manual testing with no data from services

$ ./target/debug/omdb db inventory collections show latest --db-url "postgresql://root@[::1]:57066/omicron?sslmode=disable"
<...>
SMF SERVICES STATUS
    no data on SMF services has been collected
<...>

Happy system

$ ./target/debug/omdb db inventory collections show latest --db-url "postgresql://root@[::1]:57066/omicron?sslmode=disable"
<...>
SMF SERVICES STATUS
    0 SMF services enabled but not online at 2026-04-01T18:39:24.057Z
<...>

With failed services

$ ./target/debug/omdb db inventory collections show latest --db-url "postgresql://root@[::1]:57066/omicron?sslmode=disable"
<...>
SMF SERVICES STATUS
    3 SMF services enabled but not online at 2026-04-01T17:54:42.261Z
        FMRI                                ZONE       STATE       
        svc:/site/fake-service:default      global     maintenance 
        svc:/site/fake-service3:default     global     offline     
        svc:/site/fake-service4:default     global     degraded    
<...>

With a command error

$ ./target/debug/omdb db inventory collections show latest --db-url "postgresql://root@[::1]:57066/omicron?sslmode=disable"
<...>
SMF SERVICES STATUS
    failed to retrieve SMF services: command failure: Command [/usr/bin/svcs -Za -FAKE -o state,fmri,zone] executed and failed with status: exit status: 2  stdout:   stderr: /usr/bin/svcs: illegal option -- F
    Usage: svcs [-aHpv] [-o col[,col ... ]] [-R restarter] [-sS col] [-Z | -z zone ]
                [<service> ...]
           svcs -d | -D [-Hpv] [-o col[,col ... ]] [-sS col] [-Z | -z zone ]
                [<service> ...]
           svcs [-l | -L] [-Z | -z zone] <service> ...
           svcs -x [-v] [-Z | -z zone] [<service> ...]
           svcs -?
<...>

@karencfv karencfv marked this pull request as ready for review April 1, 2026 18:40
id UUID NOT NULL,
fmri TEXT NOT NULL,
zone TEXT NOT NULL,
state omicron.public.inv_svc_state_enum NOT NULL,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This maybe gets at a design issue with the types on the Rust side, but - what would it mean to have a row in a table named ..._not_online_service with the state online?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see where you're getting at 🤔 . What about using an enum on the DB side (inv_svc_not_online_state ?) that does not include the online state?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not including the online state is the right call, if the intent of this is to store "not online" statuses. But we probably need to remove it on the Rust side too, right? For a couple reasons: the same question applies there (what would it mean for a SvcsEnabledNotOnline to contain a service in the state SvcState::Online), and what would you do at runtime if you had that and needed to store it in the db?

That does make the type names awkward; it probably needs to be something like SvcState becomes NotOnlineSvcState and Svc becomes NotOnlineSvc? I don't love that but it does seem like we probably don't want the generic-sounding Svc and SvcState types to not be able to represent online services.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

@karencfv karencfv Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end I decided to keep the SvcState enum for the initial parse() method. It seemed just wrong not have a complete enum of all of the possible states as part of the svcs module. I did add additional enums SvcEnabledNotOnlineState and InvSvcEnabledNotOnlineState for the "enabled not online" filtering functionality.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think that makes perfect sense. 👍

Copy link
Copy Markdown
Contributor Author

@karencfv karencfv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! I think I've addressed all of your comments @jgallagher

id UUID NOT NULL,
fmri TEXT NOT NULL,
zone TEXT NOT NULL,
state omicron.public.inv_svc_state_enum NOT NULL,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see where you're getting at 🤔 . What about using an enum on the DB side (inv_svc_not_online_state ?) that does not include the online state?

@jgallagher
Copy link
Copy Markdown
Contributor

Changes look good - just one nit with the error string conversion, and then the somewhat-painful "what do we do about online services" question in #10195 (comment).

@karencfv karencfv requested a review from jgallagher April 10, 2026 03:21
"offline" => SvcEnabledNotOnlineState::Offline,
"degraded" => SvcEnabledNotOnlineState::Degraded,
"maintenance" => SvcEnabledNotOnlineState::Maintenance,
_ => SvcEnabledNotOnlineState::Unknown,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right - I could call SvcEnabledNotOnlineState::from("sled 23") and get back SvcEnabledNotOnlineState::Unknown, which I guess it kind of is an unknown state but it doesn't seem like we should infallibly convert any string at all into a service state. I'm also not sure we should have an Unknown state variant. It looks like the only use of this is converting from SvcState::Unknown. I think SvcState::Unknown has the same problems: we shouldn't have a blanket From<&str> impl for it, and I'm not sure we actually need an Unknown variant. The only use of that I'm seeing is inside SvcsResult::parse(), where if we get Unknown, we skip the state and don't include it in the result at all.

Could we do this instead:

  • Remove both From<&str> impls
  • Remove both ::Unknown variants
  • Put a private parse_svc_state(&str) -> Option<SvcState> inside illumos-utils/src/svcs.rs, and use None where we're currently using SvcState::Unknown to skip invalid states

?

let mut svcs_enabled_not_online_errors = Vec::new();
for sled_agent in &collection.sled_agents {
match &sled_agent.smf_services_enabled_not_online {
SvcsEnabledNotOnlineResult::SvcsEnabledNotOnline(svcs) => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit - the comments on lines 233 and 244 are the same and maybe not quite right. I wonder if it would be more future-proof to destructure svcs here:

                    let SvcsEnabledNotOnline {
                        services,
                        errors,
                        time_of_status,
                    } = svcs;

and then we wouldn't really need the comments as much - we'd be forced to use all the fields to avoid warnings?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[inventory] Add svcs enabled not online types to DB

2 participants