hp_proliant_da_cntlr: skip phantom controllers, make 'other' state configurable#928

Closed
Bastian-Kuhn wants to merge 1 commit into Checkmk:master from Bastian-Kuhn:fix-hp-proliant-da-cntlr-phantom-and-other-state

Bastian-Kuhn commented Jul 3, 2026

Contributor

What

Two fixes for hp_proliant_da_cntlr (HPE ProLiant RAID controllers), both triggered by ProLiant Gen11 / iLO 6:

  1. Skip phantom controllers. The controller table (cpqDaCntlrTable) contains a placeholder row — typically index 0 — whose condition/role/board‑status/board‑condition cells are all 0 (a value the vendor MIB does not define). parse_hp_proliant_da_cntlr already parses such rows to None, but discovery_hp_proliant_da_cntlr iterated over all section keys and still discovered the phantom item, whose service was then permanently UNKNOWN ("Controller not found in SNMP data"). Discovery now skips None entries.

  2. Configurable 'other' state. Gen11 / iLO 6 firmware reports the board condition as 'other' for perfectly healthy controllers, which maps to WARN and pins the service on WARN forever. A new check-parameter ruleset "HPE ProLiant RAID controller" lets you remap the monitoring state for the 'other' value of the controller condition, board condition and board status independently.
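
The discovery change in item 1 can be sketched roughly as follows. This is a standalone illustration, not the diff from this PR: the `Section` type, the `discover_controllers` name and the row keys are hypothetical stand-ins, and the real plug-in yields Checkmk service objects rather than plain strings.

```python
from typing import Iterator, Mapping, Optional

# Hypothetical stand-in for the parsed section: item name -> parsed row,
# or None for a placeholder row whose status cells were all 0.
Section = Mapping[str, Optional[Mapping[str, str]]]


def discover_controllers(section: Section) -> Iterator[str]:
    """Yield one item per real controller, skipping rows that parsed to None."""
    for item, row in section.items():
        if row is None:
            continue  # phantom placeholder row: nothing to monitor
        yield item


# Illustrative data only; the keys and values are assumptions.
section = {
    "0": None,  # phantom row, already parsed to None
    "3": {"cond": "ok", "board_cond": "other"},
}

discovered = list(discover_controllers(section))
print(discovered)  # ['3']
```

Iterating over the keys alone, as the old discovery did, would also yield item "0", which is how the permanently UNKNOWN service came about.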

Compatibility

Defaults stay WARN, i.e. identical to today's behaviour until a rule is configured. Existing phantom services disappear on the next service discovery.
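
A minimal sketch of how WARN defaults keep today's behaviour until a rule overrides them. The parameter keys, the `state_for` helper and the base state mapping are assumptions made for illustration; the actual ruleset in this PR may name and structure them differently.

```python
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3

# Hypothetical parameter keys; every 'other' value defaults to WARN.
DEFAULT_PARAMS = {
    "cond_other": WARN,
    "board_cond_other": WARN,
    "board_status_other": WARN,
}

# Assumed mapping for the remaining condition values.
_BASE_STATES = {"ok": OK, "degraded": WARN, "failed": CRIT}


def effective_params(rule: dict) -> dict:
    """Merge a configured rule over the defaults (the rule wins)."""
    return {**DEFAULT_PARAMS, **rule}


def state_for(value: str, other_state: int) -> int:
    """Map a condition value to a state; only 'other' is configurable."""
    if value == "other":
        return other_state
    return _BASE_STATES.get(value, UNKNOWN)


no_rule = effective_params({})
remapped = effective_params({"board_cond_other": OK})

print(state_for("other", no_rule["board_cond_other"]))   # 1 (WARN, as today)
print(state_for("other", remapped["board_cond_other"]))  # 0 (OK after remap)
print(state_for("other", remapped["cond_other"]))        # 1 (other fields untouched)
```

Because each field has its own key, remapping the board condition leaves the controller condition and board status on their defaults.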

Tests

tests/unit/.../test_hp_proliant_da_cntlr.py updated: the phantom row is no longer discovered, the check functions take params, and a new test covers the 'other' → OK remap.

Draft — opened for upstream review. Let me know if you'd like a werk added.

…nfigurable

On HPE ProLiant Gen11 / iLO 6 the controller table (cpqDaCntlrTable) contains
a phantom placeholder row -- typically index 0 -- whose condition, role, board
status and board condition cells are all '0' (a value the vendor MIB does not
define). parse_hp_proliant_da_cntlr already parsed such rows to None, but the
discovery function iterated over all section keys and still discovered the
phantom item, whose service was then permanently UNKNOWN ('Controller not found
in SNMP data'). Discovery now skips None entries.

In addition, Gen11 / iLO 6 firmware reports the board condition as 'other' for
healthy controllers, which maps to WARN and pins the service on WARN forever.
Add a check parameter ruleset 'HPE ProLiant RAID controller' that remaps the
state for the 'other' value of the controller condition, board condition and
board status independently. Defaults remain WARN, so behaviour is unchanged
until a rule is configured.

Update the unit test accordingly.
Bastian-Kuhn force-pushed the fix-hp-proliant-da-cntlr-phantom-and-other-state branch from 9a49d05 to 255b71d on July 3, 2026 12:38
@Bastian-Kuhn

Contributor Author

Superseded: split into #929 (phantom-controller discovery fix) and #930 (configurable 'other' state ruleset).

Bastian-Kuhn deleted the fix-hp-proliant-da-cntlr-phantom-and-other-state branch on July 3, 2026 12:42
github-actions (bot) locked and limited conversation to collaborators on Jul 3, 2026