[OSPRH-19707] Prioritize surfacing warn/error severity sub-statuses #1596

Merged
openshift-merge-bot[bot] merged 1 commit into openstack-k8s-operators:main from abays:OSPRH-19707 on Sep 18, 2025

Conversation

abays commented Sep 10, 2025 (Contributor)

Add MirrorSubResourceCondition calls across all OpenStack service reconcilers to enable proper condition hierarchy and severity-based prioritization in the OpenStackControlPlane's "Ready" condition.

Key changes:

  • Add MirrorSubResourceCondition calls for better condition propagation from sub-resources to OpenStackControlPlane instance
  • Ensure higher severity conditions (warn/error) are properly surfaced in 'oc get osctlplane' output instead of being overwritten by later info-level events

This addresses the core issue where the last-processed sub-resource condition would overwrite higher-severity conditions from earlier in the reconcile loop, ensuring critical conditions are properly prioritized and visible.
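The difference between the two behaviors described above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the `Severity` ordering, the `SubCondition` type, and both functions are hypothetical names made up for this sketch, and it does not reproduce how `MirrorSubResourceCondition` is actually implemented, which this conversation does not show.

```go
package main

import "fmt"

// Severity is a hypothetical ordering of condition severities for this sketch;
// higher values are more urgent. It is not the type used by the operator.
type Severity int

const (
	SeverityNone Severity = iota
	SeverityInfo
	SeverityWarning
	SeverityError
)

// SubCondition is a simplified, hypothetical view of one sub-resource's
// readiness as seen by the control plane reconciler.
type SubCondition struct {
	Resource string
	Ready    bool
	Severity Severity
	Message  string
}

// lastWins models the problem described in the PR: every not-ready
// sub-resource overwrites the summary, so the last one processed is
// what the user ends up seeing.
func lastWins(subs []SubCondition) (SubCondition, bool) {
	var summary SubCondition
	found := false
	for _, s := range subs {
		if !s.Ready {
			summary, found = s, true
		}
	}
	return summary, found
}

// highestSeverityWins keeps the most severe not-ready sub-resource instead.
// Ties keep the earlier entry, an arbitrary choice made for this sketch.
func highestSeverityWins(subs []SubCondition) (SubCondition, bool) {
	var summary SubCondition
	found := false
	for _, s := range subs {
		if s.Ready {
			continue
		}
		if !found || s.Severity > summary.Severity {
			summary, found = s, true
		}
	}
	return summary, found
}

func main() {
	subs := []SubCondition{
		{Resource: "keystone", Ready: false, Severity: SeverityError, Message: "deployment failed"},
		{Resource: "glance", Ready: false, Severity: SeverityInfo, Message: "deployment in progress"},
	}
	if c, ok := lastWins(subs); ok {
		fmt.Println("last wins:", c.Resource, c.Message)
	}
	if c, ok := highestSeverityWins(subs); ok {
		fmt.Println("highest severity wins:", c.Resource, c.Message)
	}
}
```

With the sample input, the first strategy reports the info-level glance entry because it was processed last, while the second keeps the error-level keystone entry.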

Files modified: pkg/openstack/common.go and all 20 OpenStack service reconcilers (keystone.go, barbican.go, cinder.go, glance.go, nova.go, neutron.go, etc.)

Related to: https://issues.redhat.com/browse/OSPRH-19707

Co-authored-by: Claude <claude@anthropic.com>

@abays abays requested a review from dprince September 10, 2025 11:09
@openshift-ci openshift-ci Bot requested a review from rabi September 10, 2025 11:10

olliewalsh left a comment (Contributor)

minor nit, otherwise LGTM

Comment thread pkg/openstack/ovn.go
condition.RequestedReason,
condition.SeverityInfo,
corev1beta1.OpenStackControlPlaneOVNReadyRunningMessage))
// We want to mirror the condition of the highest priority from the OVN resources into the instance

Contributor

On line 70 the log message is "OVN is ready"; to be consistent, should that change to "OVN ready condition is true"?

Contributor Author

Yeah, let's do that

Contributor Author

@olliewalsh Just discovered that I missed the Galera processing logic. Going to fix that too before we merge.

Contributor Author

All changes made. Also improved the Redis logic along with adding Galera and fixing the OVN nit.

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/9204559dd3cf485db2db386c2afb7b4e

✔️ openstack-k8s-operators-content-provider SUCCESS in 3h 23m 43s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 19m 00s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 37m 31s
✔️ adoption-standalone-to-crc-ceph-provider SUCCESS in 3h 08m 55s
❌ openstack-operator-tempest-multinode FAILURE in 1h 45m 22s

abays commented Sep 12, 2025 (Contributor Author)

Build failed (check pipeline). Post recheck (without leading slash) to rerun all jobs. Make sure the failure cause has been resolved before you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/9204559dd3cf485db2db386c2afb7b4e

✔️ openstack-k8s-operators-content-provider SUCCESS in 3h 23m 43s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 19m 00s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 37m 31s
✔️ adoption-standalone-to-crc-ceph-provider SUCCESS in 3h 08m 55s
❌ openstack-operator-tempest-multinode FAILURE in 1h 45m 22s

EDPM bootstrap job failure, unrelated to PR:

fatal: [compute-2]: FAILED! => {"changed": false, "cmd": ["rpm", "-V", "driverctl", "lvm2", "crudini", "jq", "nftables", "NetworkManager", "openstack-selinux", "python3-libselinux", "python3-pyyaml", "rsync", "tmpwatch", "sysstat", "iproute-tc", "ksmtuned", "systemd-container", "crypto-policies-scripts", "grubby", "sos"], "delta": "0:00:00.698319", "end": "2025-09-11 21:24:05.473027", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2025-09-11 21:24:04.774708", "stderr": "error: %verify(openstack-selinux-0.8.41-0.20250527163806.f19cf25.el9.noarch) scriptlet failed, exit status 1", "stderr_lines": ["error: %verify(openstack-selinux-0.8.41-0.20250527163806.f19cf25.el9.noarch) scriptlet failed, exit status 1"], "stdout": "Missing os-ovs!\nMissing os-swift!\nMissing os-nova!\nMissing os-neutron!\nMissing os-mysql!\nMissing os-glance!\nMissing os-rsync!\nMissing os-rabbitmq!\nMissing os-keepalived!\nMissing os-keystone!\nMissing os-haproxy!\nMissing os-ipxe!\nMissing os-redis!\nMissing os-cinder!\nMissing os-httpd!\nMissing os-gnocchi!\nMissing os-collectd!\nMissing os-virt!\nMissing os-dnsmasq!\nMissing os-octavia!\nMissing os-podman!\nMissing os-rsyslog!\nMissing os-barbican!\nMissing os-logrotate!\nMissing os-certmonger!\nMissing os-timemaster!\nMissing os-ceilometer!\nMissing os-net-config!\nMissing os-ovs-el9!\nFound 29 missing module(s).", "stdout_lines": ["Missing os-ovs!", "Missing os-swift!", "Missing os-nova!", "Missing os-neutron!", "Missing os-mysql!", "Missing os-glance!", "Missing os-rsync!", "Missing os-rabbitmq!", "Missing os-keepalived!", "Missing os-keystone!", "Missing os-haproxy!", "Missing os-ipxe!", "Missing os-redis!", "Missing os-cinder!", "Missing os-httpd!", "Missing os-gnocchi!", "Missing os-collectd!", "Missing os-virt!", "Missing os-dnsmasq!", "Missing os-octavia!", "Missing os-podman!", "Missing os-rsyslog!", "Missing os-barbican!", "Missing os-logrotate!", "Missing os-certmonger!", "Missing os-timemaster!", "Missing os-ceilometer!", "Missing os-net-config!", "Missing os-ovs-el9!", "Found 29 missing module(s)."]}

abays commented Sep 12, 2025 (Contributor Author)

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/1a7b321514924bb9991fe0cb63ec0d09

✔️ openstack-k8s-operators-content-provider SUCCESS in 4h 56m 01s
❌ podified-multinode-edpm-deployment-crc FAILURE in 1h 37m 49s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 25m 03s
❌ adoption-standalone-to-crc-ceph-provider TIMED_OUT in 4h 38m 27s
❌ openstack-operator-tempest-multinode FAILURE in 3h 00m 18s

abays commented Sep 12, 2025 (Contributor Author)

Build failed (check pipeline). Post recheck (without leading slash) to rerun all jobs. Make sure the failure cause has been resolved before you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/1a7b321514924bb9991fe0cb63ec0d09

✔️ openstack-k8s-operators-content-provider SUCCESS in 4h 56m 01s
❌ podified-multinode-edpm-deployment-crc FAILURE in 1h 37m 49s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 25m 03s
❌ adoption-standalone-to-crc-ceph-provider TIMED_OUT in 4h 38m 27s
❌ openstack-operator-tempest-multinode FAILURE in 3h 00m 18s

Same error as last time:

fatal: [compute-0]: FAILED! => {"changed": false, "cmd": ["rpm", "-V", "driverctl", "lvm2", "crudini", "jq", "nftables", "NetworkManager", "openstack-selinux", "python3-libselinux", "python3-pyyaml", "rsync", "tmpwatch", "sysstat", "iproute-tc", "ksmtuned", "systemd-container", "crypto-policies-scripts", "grubby", "sos"], "delta": "0:00:04.678613", "end": "2025-09-12 10:16:57.063486", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2025-09-12 10:16:52.384873", "stderr": "error: %verify(openstack-selinux-0.8.41-0.20250527163806.f19cf25.el9.noarch) scriptlet failed, exit status 1", "stderr_lines": ["error: %verify(openstack-selinux-0.8.41-0.20250527163806.f19cf25.el9.noarch) scriptlet failed, exit status 1"], "stdout": "Missing os-ovs!\nMissing os-swift!\nMissing os-nova!\nMissing os-neutron!\nMissing os-mysql!\nMissing os-glance!\nMissing os-rsync!\nMissing os-rabbitmq!\nMissing os-keepalived!\nMissing os-keystone!\nMissing os-haproxy!\nMissing os-ipxe!\nMissing os-redis!\nMissing os-cinder!\nMissing os-httpd!\nMissing os-gnocchi!\nMissing os-collectd!\nMissing os-virt!\nMissing os-dnsmasq!\nMissing os-octavia!\nMissing os-podman!\nMissing os-rsyslog!\nMissing os-barbican!\nMissing os-logrotate!\nMissing os-certmonger!\nMissing os-timemaster!\nMissing os-ceilometer!\nMissing os-net-config!\nMissing os-ovs-el9!\nFound 29 missing module(s).", "stdout_lines": ["Missing os-ovs!", "Missing os-swift!", "Missing os-nova!", "Missing os-neutron!", "Missing os-mysql!", "Missing os-glance!", "Missing os-rsync!", "Missing os-rabbitmq!", "Missing os-keepalived!", "Missing os-keystone!", "Missing os-haproxy!", "Missing os-ipxe!", "Missing os-redis!", "Missing os-cinder!", "Missing os-httpd!", "Missing os-gnocchi!", "Missing os-collectd!", "Missing os-virt!", "Missing os-dnsmasq!", "Missing os-octavia!", "Missing os-podman!", "Missing os-rsyslog!", "Missing os-barbican!", "Missing os-logrotate!", "Missing os-certmonger!", "Missing os-timemaster!", "Missing os-ceilometer!", "Missing os-net-config!", "Missing os-ovs-el9!", "Found 29 missing module(s)."]}

I can't see how a failing EDPM Job has anything to do with my changes. Is something broken in CI?

abays commented Sep 12, 2025 (Contributor Author)

recheck

@softwarefactory-project-zuul

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/openstack-operator for 1596,8b51b8bfa2f73ff6355950ec06701e2c745e2b72

abays commented Sep 12, 2025 (Contributor Author)

recheck

@softwarefactory-project-zuul

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/openstack-operator for 1596,8b51b8bfa2f73ff6355950ec06701e2c745e2b72

abays commented Sep 15, 2025 (Contributor Author)

Missed Memcached. Adding that in the next push.

abays commented Sep 15, 2025 (Contributor Author)

@abays: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name Commit Details Required Rerun command
ci/prow/openstack-operator-build-deploy-kuttl 5e4a816 link true /test openstack-operator-build-deploy-kuttl

Full PR test history. Your PR dashboard.

{  failed to wait for the created cluster claim to become ready: timed out waiting for the condition}

🤦‍♂️

/test openstack-operator-build-deploy-kuttl

abays commented Sep 15, 2025 (Contributor Author)

/test openstack-operator-build-deploy-kuttl

abays commented Sep 15, 2025 (Contributor Author)

@abays: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name Commit Details Required Rerun command
ci/prow/openstack-operator-build-deploy-kuttl 5e4a816 link true /test openstack-operator-build-deploy-kuttl

Full PR test history. Your PR dashboard.

{  failed to wait for the created cluster claim to become ready: timed out waiting for the condition}

🤦‍♂️

/test openstack-operator-build-deploy-kuttl

abays commented Sep 15, 2025 (Contributor Author)

@abays: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name Commit Details Required Rerun command
ci/prow/openstack-operator-build-deploy-kuttl 5e4a816 link true /test openstack-operator-build-deploy-kuttl

Full PR test history. Your PR dashboard.

{  failed to wait for the created cluster claim to become ready: timed out waiting for the condition}

😠

/test openstack-operator-build-deploy-kuttl

abays commented Sep 16, 2025 (Contributor Author)

@abays: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name Commit Details Required Rerun command
ci/prow/openstack-operator-build-deploy-kuttl 5e4a816 link true /test openstack-operator-build-deploy-kuttl

Full PR test history. Your PR dashboard.

{  failed to wait for the created cluster claim to become ready: timed out waiting for the condition}

😖

/test openstack-operator-build-deploy-kuttl

@abays abays requested review from olliewalsh and stuggi September 16, 2025 14:17
Comment thread pkg/openstack/ovn.go Outdated

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/6fa12437c6134f5a88e63ea282a389ec

✔️ openstack-k8s-operators-content-provider SUCCESS in 3h 19m 47s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 18m 40s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 28m 54s
✔️ adoption-standalone-to-crc-ceph-provider SUCCESS in 3h 05m 06s
❌ openstack-operator-tempest-multinode POST_FAILURE in 1h 42m 26s

stuggi commented Sep 18, 2025 (Contributor)

recheck

stuggi left a comment (Contributor)

/lgtm

openshift-ci Bot commented Sep 18, 2025 (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abays, stuggi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot Bot merged commit ed564ce into openstack-k8s-operators:main Sep 18, 2025
8 checks passed
karelyatin added a commit to karelyatin/openstack-operator that referenced this pull request Sep 25, 2025
Comment thread pkg/openstack/ovn.go
instance.Spec.Ovn.Template.OVNNorthd.DeepCopyInto(&OVNNorthd.Spec.OVNNorthdSpecCore)

OVNNorthd.Spec.ContainerImage = *version.Status.ContainerImages.OvnNorthdImage
OVNNorthd.Spec.ExporterImage = *getImg(version.Status.ContainerImages.OpenstackNetworkExporterImage, &missingImageDefault)

Contributor

this was incorrectly dropped, fixing with #1618
