STOR-2893: add storage BYOK feature tests#30786
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
@Phaow: This pull request references STOR-2893 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@Phaow: This pull request references STOR-2893 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Phaow The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/retitle STOR-2893: add storage BYOK feature tests |
WalkthroughThe pull request adds IBM Cloud provider support to the cluster discovery system, introduces a new end-to-end CSI BYOK (Bring Your Own Key) validation test, and defines platform-specific provisioner configurations for AWS, GCP, Azure, and IBM Cloud with support detection and storage class mappings. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Test Suite
participant Provider as Provider Detection
participant Config as Provisioner Config
participant Driver as ClusterCSIDriver
participant SC as StorageClass
Test->>Provider: Detect cloud platform
Provider-->>Test: Platform identified
Test->>Config: GetBYOKProvisionerNames(platform)
Config-->>Test: List of BYOK provisioners
loop For each provisioner
Test->>Config: GetProvisionerByName(provisioner)
Config-->>Test: Provisioner config or nil
alt Config exists
Test->>Driver: Read ClusterCSIDriver
Driver-->>Test: Driver configuration
Test->>Test: Extract BYOK key ID by provider type
Test->>Config: getPresetStorageClassNamesByProvisioner()
Config-->>Test: Storage class names
loop For each storage class
Test->>SC: Fetch StorageClass
SC-->>Test: StorageClass object
Test->>Test: Verify BYOK parameter matches key ID
end
else
Test->>Test: Skip provisioner
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~23 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Comment |
|
@Phaow: This pull request references STOR-2893 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/payload 4.22 nightly blocking |
|
@Phaow: trigger 14 job(s) of type blocking for the nightly release of OCP 4.22
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/59ff5e70-1967-11f1-8951-ff2be3f06a63-0 |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/extended/storage/csi_byok.go`:
- Around line 64-67: The current test silently skips when
oc.AdminKubeClient().StorageV1().StorageClasses().Get(...) returns an error
(using g.Skip), which masks failures; change the behavior to fail the test
instead — replace the g.Skip(...) call for the StorageClasses().Get error with a
test failure call (e.g., e2e.Failf or the test suite's Failf) and include the
storage class name and error in the failure message; ensure this change is
applied around the retrieval of scName so missing managed storage classes cause
a test failure rather than a skip.
- Around line 48-51: The test currently treats an empty result from
getByokKeyIDFromClusterCSIDriver as "not BYOK" and skips, which hides API/read
failures; change getByokKeyIDFromClusterCSIDriver to return (string, error)
instead of just string, propagate the error to the caller in the test (where
byokKeyID := getByokKeyIDFromClusterCSIDriver(oc, provisioner) is called), and
update the test to fail (e.g., g.Fatal/g.Fatalf) when err != nil while only
calling g.Skip when err == nil and the returned key ID is empty; update all call
sites accordingly so real API errors surface as test failures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 85f78b6f-b37d-4d55-9930-99e31dd09b5b
📒 Files selected for processing (4)
pkg/clioptions/clusterdiscovery/cluster.gopkg/clioptions/clusterdiscovery/provider.gotest/extended/storage/const.gotest/extended/storage/csi_byok.go
|
Scheduling required tests: |
|
/retest |
|
/payload-job periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn-upgrade |
|
@Phaow: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/39cc1de0-1ae4-11f1-97dc-2dc0d95b78ef-0 |
|
@Phaow: This pull request references STOR-2893 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
test/extended/storage/csi_byok.go (1)
64-67:⚠️ Potential issue | 🟠 MajorDon't skip when an expected managed
StorageClassis missing.These names are sourced from static repo config, so a
Get()error here is the regression this test is supposed to catch. Skipping turns a broken managed class into a false green.Suggested change
sc, err := oc.AdminKubeClient().StorageV1().StorageClasses().Get(context.Background(), scName, metav1.GetOptions{}) - if err != nil { - g.Skip(fmt.Sprintf("Storage class %s not found in cluster: %v", scName, err)) - } + o.Expect(err).NotTo(o.HaveOccurred(), + fmt.Sprintf("expected managed storage class %s to exist", scName))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended/storage/csi_byok.go` around lines 64 - 67, The test currently calls g.Skip when oc.AdminKubeClient().StorageV1().StorageClasses().Get(context.Background(), scName, metav1.GetOptions{}) returns an error, which hides regressions; replace the g.Skip(...) call with a hard test failure that surfaces the error (e.g., use t.Fatalf or the framework's Fail/Failf method) so a missing managed StorageClass causes the test to fail and include scName and err in the failure message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/extended/storage/csi_byok.go`:
- Around line 55-58: The test currently skips when presetStorageClassNames is
empty, but since presetStorageClassNames is derived from
provisionerInfo.ManagedStorageClassNames the test should fail instead; replace
the g.Skip(...) call in the block that checks len(presetStorageClassNames) == 0
with a test failure (e.g., g.Fatalf or g.Fail with a clear message) so the test
fails when a BYOK-capable provisioner has no managed storage classes,
referencing provisioner and provisionerInfo.ManagedStorageClassNames in the
message.
---
Duplicate comments:
In `@test/extended/storage/csi_byok.go`:
- Around line 64-67: The test currently calls g.Skip when
oc.AdminKubeClient().StorageV1().StorageClasses().Get(context.Background(),
scName, metav1.GetOptions{}) returns an error, which hides regressions; replace
the g.Skip(...) call with a hard test failure that surfaces the error (e.g., use
t.Fatalf or the framework's Fail/Failf method) so a missing managed StorageClass
causes the test to fail and include scName and err in the failure message.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 50929a1f-5650-4128-8d17-c8a5085dbe44
📒 Files selected for processing (4)
pkg/clioptions/clusterdiscovery/cluster.gopkg/clioptions/clusterdiscovery/provider.gotest/extended/storage/const.gotest/extended/storage/csi_byok.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/clioptions/clusterdiscovery/provider.go
- test/extended/storage/const.go
|
@Phaow: This pull request references STOR-2893 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Signed-off-by: Penghao <pewang@redhat.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/extended/storage/const.go (1)
126-130: Return the actual provisioner entry, not the range copy.On Line 130,
return &preturns a pointer to thefor ... rangecopy, not theProvisionerInfostored inPlatforms. Current callers only read from it, but any future mutation through this API will silently update only the copy.♻️ Proposed fix
func GetProvisionerByName(provisioner string) *ProvisionerInfo { for _, config := range Platforms { - for _, p := range config.Provisioners { - if p.Name == provisioner { - return &p + for i := range config.Provisioners { + if config.Provisioners[i].Name == provisioner { + return &config.Provisioners[i] } } } return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended/storage/const.go` around lines 126 - 130, GetProvisionerByName currently returns &p which is the address of the range loop copy; change iteration to index-based so you return the pointer to the actual slice element inside Platforms. Replace both range loops with index-based loops (e.g. for ci := range Platforms { for pi := range Platforms[ci].Provisioners { if Platforms[ci].Provisioners[pi].Name == provisioner { return &Platforms[ci].Provisioners[pi] } } }) so the returned *ProvisionerInfo points to the real entry in Platforms rather than a temporary copy.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/extended/storage/const.go`:
- Around line 126-130: GetProvisionerByName currently returns &p which is the
address of the range loop copy; change iteration to index-based so you return
the pointer to the actual slice element inside Platforms. Replace both range
loops with index-based loops (e.g. for ci := range Platforms { for pi := range
Platforms[ci].Provisioners { if Platforms[ci].Provisioners[pi].Name ==
provisioner { return &Platforms[ci].Provisioners[pi] } } }) so the returned
*ProvisionerInfo points to the real entry in Platforms rather than a temporary
copy.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3130ea1d-7459-4a9c-9280-816e17d5aa6e
📒 Files selected for processing (4)
pkg/clioptions/clusterdiscovery/cluster.gopkg/clioptions/clusterdiscovery/provider.gotest/extended/storage/const.gotest/extended/storage/csi_byok.go
|
@Phaow: This pull request references STOR-2893 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/extended/storage/csi_byok.go (2)
39-51: Callingg.Skip()inside the loop terminates the entire spec.If the cluster has multiple BYOK-capable provisioners and one isn't BYOK-configured, the test skips without verifying the others. For example, if provisioner A is verified successfully but provisioner B triggers the skip at line 50, the entire test is marked "skipped" rather than "passed with some provisioners skipped."
If partial verification is acceptable, consider using
continuewith logging instead ofg.Skip()to allow other provisioners to be checked:♻️ Alternative approach
byokKeyID := getByokKeyIDFromClusterCSIDriver(oc, provisioner) if len(byokKeyID) == 0 { - g.Skip("Skipped: the cluster is not byok cluster, no key settings in clustercsidriver/" + provisioner) + e2e.Logf("Skipping provisioner %s: no BYOK key settings in ClusterCSIDriver", provisioner) + continue }Alternatively, if all-or-nothing BYOK configuration is expected, this is acceptable as-is.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended/storage/csi_byok.go` around lines 39 - 51, The loop over featureProviderSupportProvisioners currently calls g.Skip() when getByokKeyIDFromClusterCSIDriver(oc, provisioner) returns an empty key, which aborts the entire spec; change this to log a warning and continue to the next provisioner instead (replace the g.Skip(...) call with a log/ginkgo skip-message via t.Logf or g.Log and a continue) so other provisioners are still verified; keep the existing g.Skip usage only if you intentionally require an all-or-nothing BYOK configuration.
43-45: This skip condition should never trigger; consider asserting instead.The
provisionervariable comes fromfeatureProviderSupportProvisioners, which is populated byGetBYOKProvisionerNames(cloudProvider). That function retrieves names from the samePlatformsmap thatGetProvisionerByNamesearches. If a provisioner name is returned by one, it will always be found by the other.If this condition ever triggers, it indicates a configuration inconsistency that should fail the test, not be silently skipped.
♻️ Suggested change
provisionerInfo := GetProvisionerByName(provisioner) - if provisionerInfo == nil { - g.Skip("Provisioner not found in configuration: " + provisioner) - } + o.Expect(provisionerInfo).NotTo(o.BeNil(), + "provisioner %s returned by GetBYOKProvisionerNames but not found in configuration", provisioner)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended/storage/csi_byok.go` around lines 43 - 45, The nil-check on provisionerInfo should assert failure rather than skip because provisioner names come from GetBYOKProvisionerNames and must exist; replace the g.Skip("Provisioner not found...") in the block checking provisionerInfo == nil with a hard test failure (e.g., call g.Fatalf/g.Failf or equivalent) so the test fails loudly on this configuration inconsistency; reference symbols: provisionerInfo, provisioner, featureProviderSupportProvisioners, GetBYOKProvisionerNames, GetProvisionerByName, and g.Skip.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/extended/storage/csi_byok.go`:
- Around line 39-51: The loop over featureProviderSupportProvisioners currently
calls g.Skip() when getByokKeyIDFromClusterCSIDriver(oc, provisioner) returns an
empty key, which aborts the entire spec; change this to log a warning and
continue to the next provisioner instead (replace the g.Skip(...) call with a
log/ginkgo skip-message via t.Logf or g.Log and a continue) so other
provisioners are still verified; keep the existing g.Skip usage only if you
intentionally require an all-or-nothing BYOK configuration.
- Around line 43-45: The nil-check on provisionerInfo should assert failure
rather than skip because provisioner names come from GetBYOKProvisionerNames and
must exist; replace the g.Skip("Provisioner not found...") in the block checking
provisionerInfo == nil with a hard test failure (e.g., call g.Fatalf/g.Failf or
equivalent) so the test fails loudly on this configuration inconsistency;
reference symbols: provisionerInfo, provisioner,
featureProviderSupportProvisioners, GetBYOKProvisionerNames,
GetProvisionerByName, and g.Skip.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: bce46845-11d8-4c03-9c3f-0faa0bf01303
📒 Files selected for processing (4)
pkg/clioptions/clusterdiscovery/cluster.gopkg/clioptions/clusterdiscovery/provider.gotest/extended/storage/consts.gotest/extended/storage/csi_byok.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/clioptions/clusterdiscovery/cluster.go
- pkg/clioptions/clusterdiscovery/provider.go
|
/retest |
|
/test okd-scos-images |
|
Scheduling required tests: |
|
/retest-required |
|
@Phaow: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Risk analysis has seen new tests most likely introduced by this PR. New Test Risks for sha: 5e53e5f
New tests seen in this PR at sha: 5e53e5f
|
|
@Phaow: This pull request references STOR-2893 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Hi @jsafrane @duanwei33 , could you please help take a look when you get a chance? Thank you! ^^ |
|
/assign @jsafrane @duanwei33 |
Why we need this?
Special notes for your reviewer:
aws BYOK:
periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-localzone-byo-subnet-role-sts-kms-mini-perm-f14azure BYOK:
periodic-ci-openshift-openshift-tests-private-release-4.22-multi-nightly-azure-ipi-des-arm-regen-cert-f14/2027321272165208064/artifacts/azure-ipi-des-arm-regen-cert-f14gcp BYOK:
periodic-ci-openshift-openshift-tests-private-release-4.22-multi-nightly-gcp-ipi-disk-encryption-arm-f14-cert-manager-custom-certibmcloud BYOK:
periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-ibmcloud-ipi-private-byo-kms-f28 (with IBMCLOUD_DEFAULT_MACHINE_ENCRYPTION_KEY: "true")Test records
Summary by CodeRabbit
New Features
Tests