Skip to content

OCPBUGS-86299: e2e/ccm-aws-ote: support to dual-stack IPv6 primary#466

Merged
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
mtulio:OCPBUGS-86299
May 27, 2026
Merged

OCPBUGS-86299: e2e/ccm-aws-ote: support to dual-stack IPv6 primary#466
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
mtulio:OCPBUGS-86299

Conversation

@mtulio
Copy link
Copy Markdown
Contributor

@mtulio mtulio commented May 21, 2026

Summary

Upstream cloud-provider-aws load balancer tests do not support dual-stack clusters — they create single-stack NLB services that fail when the cluster network requires dual-stack IPFamilyPolicy. This PR detects dual-stack configuration from the CCM cloud-config and adapts the test suite accordingly.

Changes

  • Exclude upstream LB tests on dual-stack IPv6-primary clusters: Detect dual-stack at startup by reading the NodeIPFamilies key from the cloud-config ConfigMap. When the cluster is dual-stack with IPv6 as primary family, upstream [cloud-provider-aws-e2e] loadbalancer tests are excluded from the spec selector since upstream does not support this configuration yet. IPv4-primary dual-stack and single-stack clusters continue to run the full upstream test suite.

  • Patch NLB services for dual-stack: The downstream createServiceNLB helper now sets IPFamilyPolicy=RequireDualStack on NLB services when dual-stack is detected, so the AWSServiceLBNetworkSecurityGroup tests create services matching the cluster's network configuration.

  • Consolidate cloud-config helpers into common package:

    • GetCloudConfig() retrieves the CCM cloud-config ConfigMap (supports being called from main.go outside Ginkgo context).
    • IsConfigPresentCloudConfig() parses INI-style key=value entries from cloud-config data, with comma-delimited multi-value support.
    • IsNLBSecurityGroupModeManaged() replaces inline regex checks for the NLB managed SG mode.
    • IsDualStack() returns (isDualStack, primaryIPv6, error) by parsing NodeIPFamilies.
    • GetOcClient() / GetKubeClient() extracted from IsFeatureEnabled() for reuse.

Test plan

  • Verify upstream LB tests are excluded on dual-stack IPv6-primary clusters (list-tests should not contain [cloud-provider-aws-e2e] loadbalancer entries)
  • Verify upstream LB tests are included on single-stack and IPv4-primary dual-stack clusters
  • Verify AWSServiceLBNetworkSecurityGroup tests create dual-stack NLB services on dual-stack clusters

🤖 Generated with Claude Code

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels May 21, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@mtulio: This pull request references Jira Issue OCPBUGS-86299, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds shared CCM cloud-config helpers and kube/OpenShift client builders, detects dual-stack at test startup to conditionally filter specs, and refactors AWS NLB tests to use helpers and set Service.IPFamilyPolicy for dual-stack clusters.

Changes

NLB test infrastructure refactoring for cloud-config and dual-stack support

Layer / File(s) Summary
Cloud-config and client helpers
openshift-tests/ccm-aws-tests/e2e/common/helper.go
Adds GetKubeClient(ctx) / GetOcClient(ctx) and GetCloudConfig, IsConfigPresentCloudConfig, IsNLBSecurityGroupModeManaged, and IsDualStack for parsing CCM cloud-config.
Test init and conditional spec selection
openshift-tests/ccm-aws-tests/main.go
Imports common, loads cloud-config at startup, sets package dual-stack flags via common.IsDualStack, and excludes upstream AWS loadbalancer specs when dual-stack is detected.
NLB tests updated to use shared helpers
openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
Imports common, removes local cloud-config constants, refactors NLBSecurityGroupMode test to use common.GetCloudConfig/IsNLBSecurityGroupModeManaged, and updates createServiceNLB to detect dual-stack and set IPFamilyPolicy = RequireDualStack before Service creation.
Dependency promotion
openshift-tests/ccm-aws-tests/go.mod
Promotes github.com/openshift/api from indirect to direct requirement.

Sequence Diagram

sequenceDiagram
  participant main as main (test runner)
  participant common as openshift-tests/ccm-aws-tests/e2e/common
  participant kube as Kubernetes API (ConfigMap/Service)
  participant tests as NLB test code

  main->>common: GetCloudConfig(ctx)
  common->>kube: GET ConfigMap openshift-cloud-controller-manager/cloud-conf
  kube-->>common: ConfigMap
  common->>common: IsDualStack(cm)
  common-->>main: isDualStack bool
  main->>tests: build spec selectors (exclude upstream loadbalancer when isDualStack)
  tests->>common: GetCloudConfig(ctx) / IsNLBSecurityGroupModeManaged(cm)
  tests->>common: IsDualStack(cm)
  tests->>kube: Services(...).Create(...) with IPFamilyPolicy=RequireDualStack (if dual-stack)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Suggested labels

lgtm

Suggested reviewers

  • racheljpg
  • theobarberbany
🚥 Pre-merge checks | ✅ 9 | ❌ 3

❌ Failed checks (3 warnings)

Check name Status Explanation Resolution
Test Structure And Quality ⚠️ Warning Test quality issues: 5 typos ("creatomg" vs "creating") in By() statements; IsDualStack lacks case normalization for IPv4/IPv6, causing silent failures. Fix typos in By() statements (lines 111, 158, 246, 357, 440) and apply strings.ToLower() in IsDualStack switch for case-insensitive IP family matching.
Microshift Test Compatibility ⚠️ Warning Tests reference unavailable openshift-cloud-controller-manager namespace without MicroShift protection mechanisms (apigroup tag, skip label, or runtime checks). Add [apigroup:operator.openshift.io] tag to test name or add exutil.IsMicroShiftCluster() guard since openshift-cloud-controller-manager namespace does not exist on MicroShift.
Single Node Openshift (Sno) Test Compatibility ⚠️ Warning Six new Ginkgo e2e tests in loadbalancer.go lack SNO compatibility guards ([Skipped:SingleReplicaTopology] or runtime topology checks). Add [Skipped:SingleReplicaTopology] label to test names or guard with exutil.IsSingleNode() check if tests require multi-node clusters.
✅ Passed checks (9 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 88.24% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All test names use static constants and strings. No dynamic values, pod names, timestamps, UUIDs, IPs, or node names in any test definitions.
Topology-Aware Scheduling Compatibility ✅ Passed PR modifies only E2E test code (openshift-tests/ccm-aws-tests/), not operator/controller code or deployment manifests. No topology-incompatible scheduling constraints are introduced.
Ote Binary Stdout Contract ✅ Passed All logging in process-level code (main.go) uses either logrus (which defaults to stderr) or framework.Logf (which uses ginkgo.GinkgoWriter). No stdout writes detected.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed No new Ginkgo tests are added. Changes only modify existing tests and add helper functions without IPv4 assumptions or external connectivity.
Title check ✅ Passed The title accurately summarizes the main change: adding dual-stack IPv6 primary support to the CCM AWS e2e tests, which is the core objective of the PR.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 21, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 21, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot
Copy link
Copy Markdown

@mtulio: This pull request references Jira Issue OCPBUGS-86299, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Summary by CodeRabbit

  • Documentation

  • Updated AWS Cloud Controller Manager e2e test guide with prerequisites, build instructions, and test execution examples for standalone and HyperShift clusters.

  • New Features

  • Added dual-stack configuration detection and support for Network Load Balancers.

  • Added test-skipping capability for management-cluster-dependent tests in HyperShift environments.

  • Bug Fixes

  • Improved AWS region resolution with fallback to cluster Infrastructure resource.

  • Fixed regional endpoint handling for AWS client connectivity from management clusters.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@mtulio mtulio changed the title OCPBUGS-86299: e2e/ccm-aws-ote: evaluate DualStack scenarios OCPBUGS-86299: skip upstream LB tests on dual-stack and patch NLB service IPFamilyPolicy May 21, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@mtulio: This pull request references Jira Issue OCPBUGS-86299, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

Upstream cloud-provider-aws load balancer tests do not support dual-stack clusters — they create single-stack NLB services that fail when the cluster network requires dual-stack IPFamilyPolicy. This PR detects dual-stack configuration from the CCM cloud-config and adapts the test suite accordingly.

Changes

  • Exclude upstream LB tests on dual-stack clusters: Detect dual-stack at startup by reading the ipFamilies key from the cloud-config ConfigMap. When the cluster is dual-stack, upstream [cloud-provider-aws-e2e] loadbalancer tests are excluded from the spec selector. When detection fails, upstream LB tests are also excluded (fail-closed) to avoid false positives.

  • Patch NLB services for dual-stack: The downstream createServiceNLB helper now sets IPFamilyPolicy=RequireDualStack on NLB services when dual-stack is detected, so the AWSServiceLBNetworkSecurityGroup tests create services matching the cluster's network configuration.

  • Add IsDualStack helper: New common.IsDualStack() function parses the ipFamilies config key from the cloud-config ConfigMap and returns true when both IPv4 and IPv6 are present.

Dependencies

Blocked by #464 — this PR builds on top of the HyperShift e2e fixes which introduce GetCloudConfig(), IsConfigPresentCloudConfig(), and the topology-aware cloud-config retrieval required by the dual-stack detection.

Test plan

  • Verify upstream LB tests are excluded on dual-stack clusters (list tests should not contain [cloud-provider-aws-e2e] loadbalancer entries)
  • Verify upstream LB tests are included on single-stack clusters
  • Verify AWSServiceLBNetworkSecurityGroup tests create dual-stack NLB services on dual-stack clusters
  • Verify no behavior change on single-stack standalone and HyperShift clusters

🤖 Generated with Claude Code


Summary by CodeRabbit

  • Documentation

  • Updated AWS Cloud Controller Manager e2e test guide with prerequisites, build instructions, and test execution examples for standalone and HyperShift clusters.

  • New Features

  • Added dual-stack configuration detection and support for Network Load Balancers.

  • Added test-skipping capability for management-cluster-dependent tests in HyperShift environments.

  • Bug Fixes

  • Improved AWS region resolution with fallback to cluster Infrastructure resource.

  • Fixed regional endpoint handling for AWS client connectivity from management clusters.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 21, 2026

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 21, 2026

@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/dfe82140-5551-11f1-976e-fc5230192bd3-0

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
openshift-tests/ccm-aws-tests/main.go (1)

50-65: 💤 Low value

Consider using a context with timeout for dual-stack detection.

The detection uses context.TODO() which has no timeout. If the cluster is unreachable or slow, this could block indefinitely during test initialization. A bounded context would make startup behavior more predictable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift-tests/ccm-aws-tests/main.go` around lines 50 - 65, Replace the
unbounded context.TODO() used in the dual-stack detection call chain with a
bounded context: create a context with timeout (e.g.
context.WithTimeout(context.Background(), 30*time.Second)), defer its cancel,
and pass that context into common.GetCloudConfig instead of context.TODO();
ensure error handling/log messages for the LoadConfig -> kclientset.NewForConfig
-> common.GetCloudConfig -> common.IsDualStack sequence still log the error
(including timeout) and that isDualStackCluster and dualStackDetectionReady are
only set when detection completes successfully, and keep the existing
log.Debugf("Dual-stack cluster detected: %v", isDualStackCluster) after
successful detection.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/dev/e2e-ote-ccm-aws.md`:
- Around line 166-168: The quoted test name in the HyperShift run example passed
to the run-test command is missing a closing quote around Managed; update the
test string used in the
./openshift-tests/bin/cloud-controller-manager-aws-tests-ext run-test invocation
(the long bracketed test selector starting with
"[cloud-provider-aws-e2e-openshift] ... with 'Managed value in cloud-config") to
close the single quote around Managed (e.g., "with 'Managed' value in
cloud-config ...") so the quoted test name is well-formed.

In `@openshift-tests/ccm-aws-tests/e2e/common/helper.go`:
- Around line 161-164: The error returned when fetching the ConfigMap via
mgmtClient.CoreV1().ConfigMaps(hcpNamespace).Get(...) drops the underlying err;
update the return to include/wrap the original error (e.g., append ": %w" or
include err with %v) so the failure from Get is preserved and actionable; change
the fmt.Errorf call that currently formats "failed to get HCP cloud-config
ConfigMap %s/%s" to include the err and wrap it using the original err variable
(referencing hcpNamespace and hcpCloudConfigName to locate the site).
- Around line 151-159: The error returns for BuildConfigFromFlags and
NewForConfig discard the underlying error; update the error wrapping in the
blocks where restConfig is built (clientcmd.BuildConfigFromFlags with
mgmtKubeconfig) and where mgmtClient is created (clientset.NewForConfig) to
include the original err (e.g. use fmt.Errorf with %w or include %v) so callers
see the underlying failure details; locate the two error branches around
restConfig and mgmtClient and wrap or append err to the returned fmt.Errorf
messages.

---

Nitpick comments:
In `@openshift-tests/ccm-aws-tests/main.go`:
- Around line 50-65: Replace the unbounded context.TODO() used in the dual-stack
detection call chain with a bounded context: create a context with timeout (e.g.
context.WithTimeout(context.Background(), 30*time.Second)), defer its cancel,
and pass that context into common.GetCloudConfig instead of context.TODO();
ensure error handling/log messages for the LoadConfig -> kclientset.NewForConfig
-> common.GetCloudConfig -> common.IsDualStack sequence still log the error
(including timeout) and that isDualStackCluster and dualStackDetectionReady are
only set when detection completes successfully, and keep the existing
log.Debugf("Dual-stack cluster detected: %v", isDualStackCluster) after
successful detection.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ec60e4a6-62e6-4798-a902-08de766c83c7

📥 Commits

Reviewing files that changed from the base of the PR and between 7f6aa93 and 1d39590.

📒 Files selected for processing (7)
  • docs/dev/e2e-ote-ccm-aws.md
  • docs/dev/ote-ccm-aws.md
  • openshift-tests/ccm-aws-tests/e2e/aws/helper.go
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/go.mod
  • openshift-tests/ccm-aws-tests/main.go
💤 Files with no reviewable changes (1)
  • docs/dev/ote-ccm-aws.md

Comment thread docs/dev/e2e-ote-ccm-aws.md Outdated
Comment thread openshift-tests/ccm-aws-tests/e2e/common/helper.go Outdated
Comment thread openshift-tests/ccm-aws-tests/e2e/common/helper.go Outdated
@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 26, 2026

Fixed the dualstack check from config, collecting data sets from CI again after running successfully locally in a dualstack cluster:

$ oc get cm -n openshift-cloud-controller-manager cloud-conf -o yaml | grep NodeIPFamilies
    NodeIPFamilies                                  = ipv6
    NodeIPFamilies                                  = ipv4

$ run_test(){
while IFS= read -r t; do
     echo "=== Running: $t";
     $BIN run-test "$t" < /dev/null;
 done < <($BIN list tests | grep -v I0526 | jq -r '.[].name' | grep "AWSServiceLBNetworkSecurityGroup"); }
$ run_test | tee -a e2e-ote.log

$ grep -E "(name\"\:|\"result\")" e2e-ote.log
grep: warning: stray \ before :
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have NLBSecurityGroupMode with 'Managed value in cloud-config [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should create NLB service with security group attached [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have security groups attached to default ingress controller NLB [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should update security group rules when service is updated [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should cleanup security groups when service is deleted [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have correct security group rules for service ports [Suite:openshift/conformance/parallel]",
    "result": "passed",

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 26, 2026

@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5d7de3c0-5916-11f1-825a-985e881b83e7-0

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openshift-tests/ccm-aws-tests/e2e/common/helper.go`:
- Around line 263-268: The switch that checks the IP family string (switch v {
case "ipv6": ... case "ipv4": ... }) assigns hasIPv4 and hasIPv6 backwards;
change the case bodies so case "ipv6" sets hasIPv6 = true and case "ipv4" sets
hasIPv4 = true (leave the variable names hasIPv4 and hasIPv6 and the switch on v
unchanged).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c3489038-5d9a-40b5-89d2-5a570b967712

📥 Commits

Reviewing files that changed from the base of the PR and between 1d39590 and 48d1685.

📒 Files selected for processing (4)
  • openshift-tests/ccm-aws-tests/e2e/aws/helper.go
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/main.go

Comment thread openshift-tests/ccm-aws-tests/e2e/common/helper.go Outdated
Comment thread openshift-tests/ccm-aws-tests/main.go Outdated
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 26, 2026

@tthvo: This PR was included in a payload test run from openshift/origin#31217
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/82f96140-593b-11f1-9177-54846f2a0e77-0

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 26, 2026

@tthvo: This PR was included in a payload test run from openshift/origin#31217
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv4-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8afa2d70-593b-11f1-94c3-bb46dcb523ca-0

@tthvo
Copy link
Copy Markdown
Member

tthvo commented May 26, 2026

/cc @sadasu @nrb @tthvo

@openshift-ci openshift-ci Bot requested review from nrb, sadasu and tthvo May 26, 2026 19:46
mtulio added a commit to mtulio/cluster-cloud-controller-manager-operator that referenced this pull request May 26, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
openshift-tests/ccm-aws-tests/main.go (1)

53-65: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Honor the fail-closed path when dual-stack detection is unavailable.

Because dualStackDetectionReady is never consulted after Line 63, any error in Lines 53-60 leaves isDualStackCluster == false, and Line 90 still appends the upstream loadbalancer selector. That means detection failures will run the unsupported upstream LB suite instead of skipping it.

Suggested fix
-	if isDualStackCluster {
-		framework.Logf("Dual-stack environment detected, skipping test name that contains '[cloud-provider-aws-e2e] loadbalancer'")
-	} else {
+	if !dualStackDetectionReady {
+		framework.Logf("Dual-stack detection was not available, skipping test name that contains '[cloud-provider-aws-e2e] loadbalancer'")
+	} else if isDualStackCluster {
+		framework.Logf("Dual-stack environment detected, skipping test name that contains '[cloud-provider-aws-e2e] loadbalancer'")
+	} else {
 		specSelectors = append(specSelectors, extensiontests.NameContains("[cloud-provider-aws-e2e] loadbalancer"))
 	}

Also applies to: 87-90

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift-tests/ccm-aws-tests/main.go` around lines 53 - 65, The dual-stack
detection block (using framework.LoadConfig, kclientset.NewForConfig,
common.GetCloudConfig, common.IsDualStack) currently leaves isDualStackCluster
false on errors but still allows later code to behave as if detection succeeded;
change the logic so that dualStackDetectionReady is only set true when detection
completes successfully and ensure downstream code that appends the upstream
loadbalancer selector checks dualStackDetectionReady before using
isDualStackCluster (i.e., do not append the upstream LB selector unless
dualStackDetectionReady == true and isDualStackCluster == true); apply the same
check in the other occurrence mentioned (the append block around upstream LB
handling) so detection failures skip the unsupported upstream LB suite rather
than defaulting to non-dual-stack behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@openshift-tests/ccm-aws-tests/main.go`:
- Around line 53-65: The dual-stack detection block (using framework.LoadConfig,
kclientset.NewForConfig, common.GetCloudConfig, common.IsDualStack) currently
leaves isDualStackCluster false on errors but still allows later code to behave
as if detection succeeded; change the logic so that dualStackDetectionReady is
only set true when detection completes successfully and ensure downstream code
that appends the upstream loadbalancer selector checks dualStackDetectionReady
before using isDualStackCluster (i.e., do not append the upstream LB selector
unless dualStackDetectionReady == true and isDualStackCluster == true); apply
the same check in the other occurrence mentioned (the append block around
upstream LB handling) so detection failures skip the unsupported upstream LB
suite rather than defaulting to non-dual-stack behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1ebae86b-b6e7-485c-a091-059d5ca08572

📥 Commits

Reviewing files that changed from the base of the PR and between 48d1685 and 3061a53.

📒 Files selected for processing (3)
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/main.go
💤 Files with no reviewable changes (1)
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go

mtulio added a commit to mtulio/cluster-cloud-controller-manager-operator that referenced this pull request May 26, 2026
@mtulio mtulio force-pushed the OCPBUGS-86299 branch 3 times, most recently from 4fe49ef to ad1461f Compare May 26, 2026 21:36
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openshift-tests/ccm-aws-tests/main.go`:
- Around line 53-57: The code ignores errors from common.IsDualStack and thus
can “fail-open”; change the detection flow so that when common.GetCloudConfig or
common.IsDualStack returns an error you set dualStackDetectionReady = false (and
do not set isDualStackCluster), logging the error, and only treat dual-stack as
present if dualStackDetectionReady == true && isDualStackCluster == true; update
the later conditional that currently checks only isDualStackCluster to require
dualStackDetectionReady as well so upstream LB specs are included only when
detection succeeded.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 68360838-876c-4b19-b1fb-c1aeaa10e6fe

📥 Commits

Reviewing files that changed from the base of the PR and between 3061a53 and 4fe49ef.

📒 Files selected for processing (4)
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/go.mod
  • openshift-tests/ccm-aws-tests/main.go

Comment thread openshift-tests/ccm-aws-tests/main.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
openshift-tests/ccm-aws-tests/main.go (1)

53-57: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail-closed dual-stack gating is still bypassed on detection errors.

Line 56 discards common.IsDualStack errors and Line 81 gates only on isDualStackCluster. This can still include upstream LB specs when detection fails.

Suggested fix
-	if cm, err := common.GetCloudConfig(context.TODO(), nil); err != nil {
-		log.Debugf("failed to get cloud-config for dual-stack detection: %v", err)
-	} else {
-		isDualStackCluster, _ = common.IsDualStack(cm)
-		dualStackDetectionReady = true
-		log.Debugf("Dual-stack cluster detected: %v", isDualStackCluster)
-	}
+	if cm, err := common.GetCloudConfig(context.TODO(), nil); err != nil {
+		log.Warnf("failed to get cloud-config for dual-stack detection: %v", err)
+	} else if dual, err := common.IsDualStack(cm); err != nil {
+		log.Warnf("failed to detect dual-stack from cloud-config: %v", err)
+	} else {
+		isDualStackCluster = dual
+		dualStackDetectionReady = true
+		log.Debugf("Dual-stack cluster detected: %v", isDualStackCluster)
+	}
@@
-	if isDualStackCluster {
+	if !dualStackDetectionReady || isDualStackCluster {
 		framework.Logf("Dual-stack environment detected, skipping test name that contains '[cloud-provider-aws-e2e] loadbalancer'")
 	} else {
 		specSelectors = append(specSelectors, extensiontests.NameContains("[cloud-provider-aws-e2e] loadbalancer"))
 	}

Also applies to: 81-85

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift-tests/ccm-aws-tests/main.go` around lines 53 - 57, If cloud-config
or dual-stack detection returns an error, fail-closed: treat the cluster as
dual-stack and mark detection as ready so downstream gating prevents unsafe
upstream LB specs. Concretely, in the block that calls
common.GetCloudConfig(...) and common.IsDualStack(...), do not discard errors
from common.IsDualStack; if IsDualStack returns an error, set isDualStackCluster
= true and dualStackDetectionReady = true (otherwise set isDualStackCluster to
the returned value and dualStackDetectionReady = true only on success). Also
update the later gating logic that uses
isDualStackCluster/dualStackDetectionReady to rely on dualStackDetectionReady to
decide readiness and to treat any detection-error case as dual-stack based on
the variables above.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openshift-tests/ccm-aws-tests/e2e/common/helper.go`:
- Around line 190-198: The IsDualStack detection loop currently matches values
from the slice `values` case-sensitively and can miss "IPv4"/"IPv6" inputs;
before the switch in the loop that sets `hasIPv4` and `hasIPv6`, normalize each
`v` (e.g., strings.ToLower(v) or equivalent) and switch on the normalized value
so the cases "ipv4" and "ipv6" reliably match; update the loop that iterates
`for _, v := range values { ... switch v { ... } }` to switch on the lowercased
value.

---

Duplicate comments:
In `@openshift-tests/ccm-aws-tests/main.go`:
- Around line 53-57: If cloud-config or dual-stack detection returns an error,
fail-closed: treat the cluster as dual-stack and mark detection as ready so
downstream gating prevents unsafe upstream LB specs. Concretely, in the block
that calls common.GetCloudConfig(...) and common.IsDualStack(...), do not
discard errors from common.IsDualStack; if IsDualStack returns an error, set
isDualStackCluster = true and dualStackDetectionReady = true (otherwise set
isDualStackCluster to the returned value and dualStackDetectionReady = true only
on success). Also update the later gating logic that uses
isDualStackCluster/dualStackDetectionReady to rely on dualStackDetectionReady to
decide readiness and to treat any detection-error case as dual-stack based on
the variables above.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5d4bb127-87d1-4e4f-a8bb-afd1f7772f12

📥 Commits

Reviewing files that changed from the base of the PR and between 4fe49ef and ad1461f.

📒 Files selected for processing (4)
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/go.mod
  • openshift-tests/ccm-aws-tests/main.go

Comment thread openshift-tests/ccm-aws-tests/e2e/common/helper.go
@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 26, 2026

/test all

OCPBUGS-86299:

Upstream cloud-provider-aws load balancer tests do not support dual-stack
clusters yet. Detect dual-stack configuration from the cloud-config
(`NodeIPFamilies` key) at startup and exclude upstream LB tests when the
cluster is dual-stack. When detection fails, upstream LB tests are also
excluded to avoid false positives (fail-closed).

For the downstream AWSServiceLBNetworkSecurityGroup tests, patch the
NLB service with IPFamilyPolicy=RequireDualStack when dual-stack is
detected so the service matches the cluster's network configuration.
@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 27, 2026

Overall changes isolated from #464, and PRs can move in parallel - without dependencies.

AFAICT ipFamilyPolicy with value RequiredDualStack when ipFamilies has IPv4 first in the list won't be affected by this change as we are detecting the presence of DualStack config, driven by NodeIPFamilies entries, letting CCM set those defaults, so this patch only care about ipFamilyPolicy preventing being set as SingleStack when IPv6 are first in the list (primary IPv6). Checking this statement with the job:

/test all
/payload-job periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv4-primary-techpreview

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 27, 2026

@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv4-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/b7ea5e90-5972-11f1-9f2b-ec634464a62f-0

@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 27, 2026

Triggering primary IPv6 as well:

/payload-job periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 27, 2026

@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/11d2e120-5973-11f1-81d8-7538de235758-0

@mtulio mtulio marked this pull request as ready for review May 27, 2026 12:29
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 27, 2026
@openshift-ci openshift-ci Bot requested a review from damdo May 27, 2026 12:41
@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 27, 2026

Tests are looking good, CI is reporting:

/verified by @mtulio on CI presubmit and payload/periodic jobs

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 27, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@mtulio: This PR has been marked as verified by @mtulio on CI presubmit and payload/periodic jobs.

Details

In response to this:

Tests are looking good, CI is reporting:

/verified by @mtulio on CI presubmit and payload/periodic jobs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@mtulio mtulio changed the title OCPBUGS-86299: skip upstream LB tests on dual-stack and patch NLB service IPFamilyPolicy OCPBUGS-86299: e2e/ccm-aws-ote: support to dual-stack IPv6 primary May 27, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@mtulio: This pull request references Jira Issue OCPBUGS-86299, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

Upstream cloud-provider-aws load balancer tests do not support dual-stack clusters — they create single-stack NLB services that fail when the cluster network requires dual-stack IPFamilyPolicy. This PR detects dual-stack configuration from the CCM cloud-config and adapts the test suite accordingly.

Changes

  • Exclude upstream LB tests on dual-stack IPv6-primary clusters: Detect dual-stack at startup by reading the NodeIPFamilies key from the cloud-config ConfigMap. When the cluster is dual-stack with IPv6 as primary family, upstream [cloud-provider-aws-e2e] loadbalancer tests are excluded from the spec selector since upstream does not support this configuration yet. IPv4-primary dual-stack and single-stack clusters continue to run the full upstream test suite.

  • Patch NLB services for dual-stack: The downstream createServiceNLB helper now sets IPFamilyPolicy=RequireDualStack on NLB services when dual-stack is detected, so the AWSServiceLBNetworkSecurityGroup tests create services matching the cluster's network configuration.

  • Consolidate cloud-config helpers into common package:

  • GetCloudConfig() retrieves the CCM cloud-config ConfigMap (supports being called from main.go outside Ginkgo context).

  • IsConfigPresentCloudConfig() parses INI-style key=value entries from cloud-config data, with comma-delimited multi-value support.

  • IsNLBSecurityGroupModeManaged() replaces inline regex checks for the NLB managed SG mode.

  • IsDualStack() returns (isDualStack, primaryIPv6, error) by parsing NodeIPFamilies.

  • GetOcClient() / GetKubeClient() extracted from IsFeatureEnabled() for reuse.

Test plan

  • Verify upstream LB tests are excluded on dual-stack IPv6-primary clusters (list-tests should not contain [cloud-provider-aws-e2e] loadbalancer entries)
  • Verify upstream LB tests are included on single-stack and IPv4-primary dual-stack clusters
  • Verify AWSServiceLBNetworkSecurityGroup tests create dual-stack NLB services on dual-stack clusters
  • Verify no behavior change on single-stack standalone and HyperShift clusters

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@nrb
Copy link
Copy Markdown
Contributor

nrb commented May 27, 2026

/lgtm
/approve

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 27, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 27, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mfbonfigli, nrb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 27, 2026
return false, nil
}
var hasIPv4, hasIPv6 bool
// ToDo: review rules and if order matter
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/openshift/installer/blob/main/pkg/asset/manifests/cloudproviderconfig.go#L119-L128 shows how the Installer generates the cloud-config and the order of values determines if it is DualStackIPv6Primary or DualStackIPv4Primary.

But, in this specific context, the upstream tests are skipped if DualStack is enabled and it doesn't matter which of the 2 types it is. So, I am OK with this logic that doesn't bother with the order.

}
if isDualStack {
framework.Logf("Detected DualStack clusters, patching Service setting IPFamilyPolicy to %q", v1.IPFamilyPolicyRequireDualStack)
dualStack := v1.IPFamilyPolicyRequireDualStack
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is where the order /type of dual stack configuration matters.
If ipFamilies contains IPv4 first, then we set IPFamilyPolicy to IPFamilyPolicyPreferDualStack and if IPv6 is specified first, then IPFamilyPolicy is IPFamilyPolicyRequireDualStack.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 27, 2026

@mtulio: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit 0181640 into openshift:main May 27, 2026
13 checks passed
@openshift-ci-robot
Copy link
Copy Markdown

@mtulio: Jira Issue Verification Checks: Jira Issue OCPBUGS-86299
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-86299 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

Details

In response to this:

Summary

Upstream cloud-provider-aws load balancer tests do not support dual-stack clusters — they create single-stack NLB services that fail when the cluster network requires dual-stack IPFamilyPolicy. This PR detects dual-stack configuration from the CCM cloud-config and adapts the test suite accordingly.

Changes

  • Exclude upstream LB tests on dual-stack IPv6-primary clusters: Detect dual-stack at startup by reading the NodeIPFamilies key from the cloud-config ConfigMap. When the cluster is dual-stack with IPv6 as primary family, upstream [cloud-provider-aws-e2e] loadbalancer tests are excluded from the spec selector since upstream does not support this configuration yet. IPv4-primary dual-stack and single-stack clusters continue to run the full upstream test suite.

  • Patch NLB services for dual-stack: The downstream createServiceNLB helper now sets IPFamilyPolicy=RequireDualStack on NLB services when dual-stack is detected, so the AWSServiceLBNetworkSecurityGroup tests create services matching the cluster's network configuration.

  • Consolidate cloud-config helpers into common package:

  • GetCloudConfig() retrieves the CCM cloud-config ConfigMap (supports being called from main.go outside Ginkgo context).

  • IsConfigPresentCloudConfig() parses INI-style key=value entries from cloud-config data, with comma-delimited multi-value support.

  • IsNLBSecurityGroupModeManaged() replaces inline regex checks for the NLB managed SG mode.

  • IsDualStack() returns (isDualStack, primaryIPv6, error) by parsing NodeIPFamilies.

  • GetOcClient() / GetKubeClient() extracted from IsFeatureEnabled() for reuse.

Test plan

  • Verify upstream LB tests are excluded on dual-stack IPv6-primary clusters (list-tests should not contain [cloud-provider-aws-e2e] loadbalancer entries)
  • Verify upstream LB tests are included on single-stack and IPv4-primary dual-stack clusters
  • Verify AWSServiceLBNetworkSecurityGroup tests create dual-stack NLB services on dual-stack clusters

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants