Skip to content

MGMT-24393: Assisted Service IPv6 checks fails comparative#10340

Open
shay23bra wants to merge 2 commits into
openshift:masterfrom
shay23bra:MGMT-24393-Assisted-Service-IPv6-checks-fails-comparative
Open

MGMT-24393: Assisted Service IPv6 checks fails comparative#10340
shay23bra wants to merge 2 commits into
openshift:masterfrom
shay23bra:MGMT-24393-Assisted-Service-IPv6-checks-fails-comparative

Conversation

@shay23bra
Copy link
Copy Markdown
Contributor

@shay23bra shay23bra commented May 18, 2026

Summary

  • Normalize IPv6 addresses to canonical form at all VIP ingestion points (controller, REST API, DHCP) and in VIP comparison functions
  • PostgreSQL's inet column auto-normalizes IPv6 (e.g. 2620:0052:0009:164d:0000:0000:0000:0046 → 2620:52:9:164d::46), but Go string comparison treated them as different, causing infinite reconciliation loops
  • Post-install, this blocked kubeconfig/kubeadmin secret creation entirely

Changes

  • Added NormalizeIP() and ipsEqual() helpers in internal/network/utils.go
  • Updated AreApiVipsIdentical / AreIngressVipsIdentical to use inet-aware comparison
  • Normalized VIPs on ingestion in controller (common.go), REST API (inventory.go), DHCP path, and SetVipsData (cluster.go)
  • Normalized backwards-compat VIP comparisons in validations.go
  • Added unit tests covering expanded, leading-zero, mixed-case, and dual-stack IPv6 scenarios

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested). DB round-trip test - confirmed PostgreSQL normalizes IPv6 on storage regardless of input form — inserting canonical vs expanded produces identical DB values, so this fix has zero impact on stored data or downstream consumers.
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

Summary by CodeRabbit

  • Bug Fixes

    • Improved VIP handling to prevent spurious update triggers caused by different IP address formatting of semantically identical addresses.
    • Enhanced IPv6 address normalization to ensure consistent IP comparison and storage across cluster registration, validation, and DHCP operations.
  • Tests

    • Extended test coverage for IPv6 VIP comparisons and address normalization scenarios.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 18, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented May 18, 2026

@shay23bra: This pull request references MGMT-24393 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Summary

  • Normalize IPv6 addresses to canonical form at all VIP ingestion points (controller, REST API, DHCP) and in VIP comparison functions
  • PostgreSQL's inet column auto-normalizes IPv6 (e.g. 2620:0052:0009:164d:0000:0000:0000:0046 → 2620:52:9:164d::46), but Go string comparison treated them as different, causing infinite reconciliation loops
  • Post-install, this blocked kubeconfig/kubeadmin secret creation entirely

Changes

  • Added NormalizeIP() and ipsEqual() helpers in internal/network/utils.go
  • Updated AreApiVipsIdentical / AreIngressVipsIdentical to use inet-aware comparison
  • Normalized VIPs on ingestion in controller (common.go), REST API (inventory.go), DHCP path, and SetVipsData (cluster.go)
  • Normalized backwards-compat VIP comparisons in validations.go
  • Added unit tests covering expanded, leading-zero, mixed-case, and dual-stack IPv6 scenarios

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested). DB round-trip test - confirmed PostgreSQL normalizes IPv6 on storage regardless of input form — inserting canonical vs expanded produces identical DB values, so this fix has zero impact on stored data or downstream consumers.
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ca33f05f-2a64-42aa-bd10-b7db643bea1d

📥 Commits

Reviewing files that changed from the base of the PR and between 963e8a7 and 5595291.

📒 Files selected for processing (6)
  • internal/bminventory/inventory.go
  • internal/cluster/cluster.go
  • internal/cluster/validations/validations.go
  • internal/controller/controllers/common.go
  • internal/network/utils.go
  • internal/network/utils_test.go

Walkthrough

This PR normalizes API and Ingress VIP IP addresses across the cluster management flow to prevent update detection and comparison issues caused by semantically-identical but differently-formatted IP strings (particularly IPv6 addresses in expanded vs canonical form).

Changes

IP Normalization for VIP Consistency

Layer / File(s) Summary
IP Normalization Functions and Comparisons
internal/network/utils.go
Adds NormalizeIP() function to canonicalize IP strings via netip.ParseAddr, introduces ipsEqual() helper for semantic IP comparison, and updates AreApiVipsIdentical() and AreIngressVipsIdentical() to use ipsEqual() in dual-stack and non-dual-stack list comparisons.
IP Normalization Test Coverage
internal/network/utils_test.go
Extends test coverage with new table-driven cases validating IPv6 canonicalization (expanded vs canonical form, leading zeros, mixed case), IPv4 and invalid input invariance, and dual-stack equivalence of semantically-identical VIPs.
VIP Normalization at Entry Points
internal/controller/controllers/common.go
Adds internal/network import and normalizes API and Ingress VIP IP strings via network.NormalizeIP() in ApiVipsEntriesToArray() and IngressVipsEntriesToArray() helpers before storing in model objects.
VIP Normalization in Cluster Operations
internal/bminventory/inventory.go, internal/cluster/cluster.go, internal/cluster/validations/validations.go
Normalizes API and Ingress VIPs during cluster registration, DHCP response processing, VIP update detection, VIP data updates, and backwards-compatibility validation using normalized IP comparisons.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested labels

size/M

Suggested reviewers

  • bluesort
  • tsorya
🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (11 passed)
Check name Status Explanation
Title check ✅ Passed The title references MGMT-24393 and indicates the fix addresses IPv6 comparison failures, aligning with the changeset which normalizes IPv6 addresses to prevent reconciliation loop issues.
Description check ✅ Passed The PR description is comprehensive and complete. It clearly explains the issue (IPv6 normalization mismatch), root cause (PostgreSQL vs Go comparison), solution (normalize at ingestion points), testing approach, and includes all required template sections with appropriate selections.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All Ginkgo test titles are stable and deterministic. 97 test titles examined contain only static descriptive text with no dynamic content like IP addresses, UUIDs, timestamps, or pod/node names.
Test Structure And Quality ✅ Passed Table-driven Ginkgo tests follow established patterns with single responsibility, proper cleanup (not needed for unit tests), and codebase consistency. Minor: assertions lack custom messages.
Microshift Test Compatibility ✅ Passed No e2e tests added. Test changes in utils_test.go are unit tests only, testing internal functions with mock data, with no cluster or OpenShift API interaction.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No new Ginkgo e2e tests added. Changes are limited to internal utility functions and unit tests in internal/network/utils_test.go. SNO check not applicable.
Topology-Aware Scheduling Compatibility ✅ Passed PR modifies only Go source code for IPv6 normalization. No deployment manifests, operator code, or Kubernetes scheduling constraints are introduced.
Ote Binary Stdout Contract ✅ Passed PR contains only functional changes to IP normalization logic. No process-level code modifications, no stdout writes, no logging configuration. All changes isolated to regular utility functions.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed This PR adds unit tests (not e2e tests) to internal/network/utils_test.go for helper functions like NormalizeIP and VIP comparison. The custom check targets e2e tests only, so it is not applicable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 18, 2026
@openshift-ci openshift-ci Bot requested review from linoyaslan and yoavsc0302 May 18, 2026 14:01
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented May 18, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shay23bra

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 18, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 85.71429% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.32%. Comparing base (963e8a7) to head (5595291).
⚠️ Report is 6 commits behind head on master.

Files with missing lines Patch % Lines
internal/cluster/validations/validations.go 0.00% 2 Missing ⚠️
internal/network/utils.go 86.66% 1 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #10340   +/-   ##
=======================================
  Coverage   44.32%   44.32%           
=======================================
  Files         417      417           
  Lines       72762    72779   +17     
=======================================
+ Hits        32253    32262    +9     
- Misses      37589    37594    +5     
- Partials     2920     2923    +3     
Files with missing lines Coverage Δ
internal/bminventory/inventory.go 72.11% <100.00%> (+0.02%) ⬆️
internal/cluster/cluster.go 66.53% <100.00%> (+0.05%) ⬆️
internal/controller/controllers/common.go 76.49% <100.00%> (ø)
internal/cluster/validations/validations.go 46.21% <0.00%> (ø)
internal/network/utils.go 58.07% <86.66%> (+0.60%) ⬆️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@shay23bra
Copy link
Copy Markdown
Contributor Author

/retest

@linoyaslan
Copy link
Copy Markdown
Contributor

/test ?

@linoyaslan
Copy link
Copy Markdown
Contributor

/test edge-e2e-metal-assisted-dual-stack-primary-ipv6-4-22

@linoyaslan
Copy link
Copy Markdown
Contributor

linoyaslan commented May 19, 2026

can you please explain why we need to normalize the VIPs in the code, instead of only normalizing them during validation when checking equality? Why isn't it sufficient?

@shay23bra
Copy link
Copy Markdown
Contributor Author

shay23bra commented May 19, 2026

can you please explain why we need to normalize the VIPs in the code, instead of only normalizing them during validation when checking equality? Why isn't it sufficient?

Normalizing only at comparison time is fragile — VIPs are compared in multiple places across the codebase (AreApiVipsIdentical, AreIngressVipsIdentical, wereClusterVipsUpdated, SetVipsData, backwards-compat handlers), and any new comparison added in the future could reintroduce the bug if it forgets to normalize.
By normalizing at ingestion, the in-memory representation always matches what PostgreSQL stores. This means every string comparison in the codebase is safe by default — not just the ones we remembered to fix.
Similar to what we do with normalize CIDRs at ingestion with NormalizeCIDR().

@linoyaslan
Copy link
Copy Markdown
Contributor

Similar to what we do with normalize CIDRs at ingestion with NormalizeCIDR().

NormalizeCIDR() is using only on isMachineCidrEqualsToCalculatedCidr which I think is a different use case. There, one side is user input while the other is a calculated value. and the fact, it seems to only be used in isMachineCidrEqualsToCalculatedCidr, so maybe this actually exposes another similar bug LOL

This means every string comparison in the codebase is safe by default — not just the ones we remembered to fix.

this is only true if we can guarantee normalization at every ingestion path, which feels like the same maintenance problem, just shifted from comparison sites to write sites. In other words, we are not really eliminating the risk, only moving it, no? let's say a future developer adding a new VIP write path and forgetting to call NormalizeIP() could reintroduce the bug just as easily as a developer forgetting to normalize during comparison. what do you think?

@shay23bra
Copy link
Copy Markdown
Contributor Author

let's say a future developer adding a new VIP write path and forgetting to call NormalizeIP() could reintroduce the bug just as easily as a developer forgetting to normalize during comparison. what do you think?

well the PostgreSQL's inet column auto-normalizes IPv6 so writing will be fine, so it's just another "safety-net", what I added catches the case even if a future write path forgets to normalize.

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented May 19, 2026

@shay23bra: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@linoyaslan
Copy link
Copy Markdown
Contributor

well the PostgreSQL's inet column auto-normalizes IPv6 so writing will be fine, so it's just another "safety-net", what I added catches the case even if a future write path forgets to normalize.

What I'm trying to say is that neither approach - normalizing at ingestion nor normalizing at comparison time is truly "safe by default". Both approaches still rely on future developers remembering to normalize when introducing a new flow. I'm not sure how likely that scenario is, but I don't really see a strong advantage in adding normalization at ingestion points throughout the code. On the other hand, I also don't think it's harmful, as long as the comparison itself is normalized. Bottom line, I feel the important part here is the comparison logic itself. But let's discuss it further in a call, as agreed offline

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants