[ci-operator]: expand the pod lifecycle metrics to include the state of the machinesets#4938

Open
droslean wants to merge 1 commit into openshift:main from droslean:metrics-99

Conversation

@droslean
Member

/cc @openshift/test-platform

@openshift-ci-robot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests are triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller automatically detects which contexts are required and uses /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci openshift-ci bot requested a review from a team February 11, 2026 13:10
@droslean
Member Author

/hold

@coderabbitai

coderabbitai bot commented Feb 11, 2026

Walkthrough

Adds MachineAutoscaler awareness to metrics: dependency updates, scheme registration, listing autoscalers at init, passing autoscaler data into PodLifecyclePlugin, new types for per-machine-set and workload capacity, and test updates to validate workload capacity aggregation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Dependency & go.mod (go.mod) | Bumped OpenShift module pseudo-versions and added github.com/openshift/cluster-autoscaler-operator dependency. |
| Metrics initialization (pkg/metrics/metrics.go) | Registers machine and autoscaling schemes, lists MachineAutoscaler resources on init (logs on failure), introduces CIWorkloadLabel, and updates NewPodLifecyclePlugin signature to accept autoscaler items. |
| Metrics core logic (pkg/metrics/pods.go) | Adds MachineSetCount and WorkloadNodeCount types; extends PodLifecycleMetricsEvent with WorkloadCapacity; adds autoscalers field to PodLifecyclePlugin; implements getMinMax and getWorkloadCounts; uses CIWorkloadLabel when recording. |
| Tests (pkg/metrics/pods_test.go) | Moves tests to controller-runtime fake client and registers machine API scheme; introduces MachineSet/MachineAutoscaler fixtures; adapts test scaffolding to use object lists and autoscalers; updates expectations to include WorkloadCapacity and adds int32Ptr helper. |
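
For readers who have not opened the diff, the workload capacity aggregation summarized above might look roughly like the sketch below. It uses only the standard library: `machineSet` and `autoscaler` are simplified stand-ins for the Machine API and MachineAutoscaler types, and every field name, as well as the fallback for a MachineSet without an autoscaler, is an assumption made for illustration rather than the PR's actual code.

```go
package main

import "fmt"

// MachineSetCount is a hypothetical per-MachineSet record; the field names
// are assumptions, not the definitions from pkg/metrics/pods.go.
type MachineSetCount struct {
	Name    string
	Current int32
	Min     int32
	Max     int32
}

// WorkloadNodeCount sums the MachineSets that back one CI workload.
type WorkloadNodeCount struct {
	Workload    string
	Current     int32
	Min         int32
	Max         int32
	MachineSets []MachineSetCount
}

// machineSet and autoscaler are simplified stand-ins for the real API types.
type machineSet struct {
	Name     string
	Workload string // value of the CI workload label
	Replicas *int32
}

type autoscaler struct {
	TargetName  string
	MinReplicas int32
	MaxReplicas int32
}

// getMinMax returns the autoscaler bounds for the named MachineSet.
// Assumption: a set without an autoscaler is pinned to its current size.
func getMinMax(name string, current int32, autoscalers []autoscaler) (int32, int32) {
	for _, a := range autoscalers {
		if a.TargetName == name {
			return a.MinReplicas, a.MaxReplicas
		}
	}
	return current, current
}

// getWorkloadCounts aggregates current, minimum and maximum node counts
// over every MachineSet labelled with the given workload.
func getWorkloadCounts(workload string, sets []machineSet, autoscalers []autoscaler) WorkloadNodeCount {
	ret := WorkloadNodeCount{Workload: workload}
	for _, ms := range sets {
		if ms.Workload != workload {
			continue
		}
		var current int32
		if ms.Replicas != nil {
			current = *ms.Replicas
		}
		lo, hi := getMinMax(ms.Name, current, autoscalers)
		ret.MachineSets = append(ret.MachineSets, MachineSetCount{Name: ms.Name, Current: current, Min: lo, Max: hi})
		ret.Current += current
		ret.Min += lo
		ret.Max += hi
	}
	return ret
}

func int32Ptr(i int32) *int32 { return &i }

func main() {
	sets := []machineSet{
		{Name: "builds-a", Workload: "builds", Replicas: int32Ptr(3)},
		{Name: "builds-b", Workload: "builds", Replicas: int32Ptr(2)},
		{Name: "tests-a", Workload: "tests", Replicas: int32Ptr(4)},
	}
	autoscalers := []autoscaler{{TargetName: "builds-a", MinReplicas: 1, MaxReplicas: 10}}
	c := getWorkloadCounts("builds", sets, autoscalers)
	fmt.Printf("%s: current=%d min=%d max=%d\n", c.Workload, c.Current, c.Min, c.Max)
}
```

Whether the real getMinMax falls back to the current replica count, to zero, or to something else when no autoscaler targets a MachineSet is not stated in the walkthrough; the fallback here is only one plausible choice.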

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 11, 2026
@droslean
Member Author

/test e2e

@openshift-ci
Contributor

openshift-ci bot commented Feb 11, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: droslean

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 11, 2026
@droslean
Member Author

/test e2e

@droslean
Member Author

/test e2e

@droslean
Member Author

/retest

@droslean
Member Author

/test e2e

…of the machinesets

Signed-off-by: Nikolaos Moraitis <nmoraiti@redhat.com>
@droslean
Member Author

/test e2e

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
pkg/metrics/pods_test.go (1)

21-23: Consider registering the autoscalingv1beta1 scheme for test consistency.

The test init() only registers machinev1beta1 scheme. While the current tests don't create autoscaler objects via the fake client (they're passed directly), adding the autoscaling scheme registration would improve consistency with production code.

♻️ Suggested improvement
 func init() {
 	machinev1beta1.AddToScheme(scheme.Scheme)
+	autoscalingv1beta1.SchemeBuilder.AddToScheme(scheme.Scheme)
 }
pkg/metrics/pods.go (1)

167-213: Consider scoping MachineSet listing to the machine-api namespace.

The client.List for MachineSets on line 170 fetches cluster-wide. In OpenShift clusters, MachineSets typically reside in openshift-machine-api. Scoping the list could improve performance and avoid unexpected results from MachineSets in other namespaces.

♻️ Suggested improvement
 func (p *PodLifecyclePlugin) getWorkloadCounts(workload string) WorkloadNodeCount {
 	ret := WorkloadNodeCount{Workload: workload}
 	machineSetList := &machinev1beta1.MachineSetList{}
-	if err := p.client.List(p.ctx, machineSetList); err != nil {
+	if err := p.client.List(p.ctx, machineSetList, ctrlruntimeclient.InNamespace(MachineAPINamespace)); err != nil {
 		p.logger.WithError(err).Warn("Failed to list MachineSets")
 		return WorkloadNodeCount{}
 	}
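
To make the concern concrete, here is a small standard-library sketch of how a cluster-wide list can inflate the counts when a MachineSet with the same name exists outside the machine-api namespace. The `machineSet` type and the `listMachineSets` helper are hypothetical stand-ins for the real client call, and the constant value is taken from the namespace named in the comment above.

```go
package main

import "fmt"

// MachineAPINamespace is the namespace the review comment refers to.
const MachineAPINamespace = "openshift-machine-api"

// machineSet is a simplified stand-in for machinev1beta1.MachineSet.
type machineSet struct {
	Namespace string
	Name      string
	Replicas  int32
}

// listMachineSets is a hypothetical helper that mimics a client List call:
// an empty namespace means cluster-wide, otherwise results are scoped.
func listMachineSets(all []machineSet, namespace string) []machineSet {
	var out []machineSet
	for _, ms := range all {
		if namespace == "" || ms.Namespace == namespace {
			out = append(out, ms)
		}
	}
	return out
}

// totalReplicas sums the replica counts of the given MachineSets.
func totalReplicas(sets []machineSet) int32 {
	var total int32
	for _, ms := range sets {
		total += ms.Replicas
	}
	return total
}

func main() {
	all := []machineSet{
		{Namespace: MachineAPINamespace, Name: "ci-builds-a", Replicas: 3},
		{Namespace: "sandbox", Name: "ci-builds-a", Replicas: 7},
	}
	fmt.Println("cluster-wide:", totalReplicas(listMachineSets(all, "")))
	fmt.Println("scoped:", totalReplicas(listMachineSets(all, MachineAPINamespace)))
}
```

In the real plugin the scoping would come from the list option shown in the suggested diff; this sketch only illustrates the effect on the aggregated numbers.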

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 16, 2026

@droslean: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/lint | b6bdef7 | link | true | /test lint |
| ci/prow/e2e | b6bdef7 | link | true | /test e2e |
| ci/prow/images | b6bdef7 | link | true | /test images |
| ci/prow/breaking-changes | b6bdef7 | link | false | /test breaking-changes |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.

2 participants