Skip to content

🌱 fix(catalogd): follow-up fixes after HA PR merge#2679

Merged
openshift-merge-bot[bot] merged 1 commit intooperator-framework:mainfrom
tmshort:fix-catalogd-ha-readiness-more
Apr 29, 2026
Merged

🌱 fix(catalogd): follow-up fixes after HA PR merge#2679
openshift-merge-bot[bot] merged 1 commit intooperator-framework:mainfrom
tmshort:fix-catalogd-ha-readiness-more

Conversation

@tmshort
Copy link
Copy Markdown
Contributor

@tmshort tmshort commented Apr 28, 2026

Two fixes on top of the catalogd HA work (#2674) that was merged prematurely:

  1. serverutil: wrap http.Server.Serve error with the catalog listen address so the failure message is self-diagnosing in aggregated manager logs.

  2. e2e: fix CatalogdHA gate to override node-count check with actual catalogd replica count. The node-count heuristic fires on any multi-node cluster (e.g. OpenShift standard e2e on AWS), causing the @CatalogdHA scenario to run even when catalogd has only 1 replica, which makes the step "catalogd is ready to reconcile resources" fail with exit status 1. After olmNamespace is populated, query the catalogd deployment's spec.replicas and set the gate unconditionally to (replicas >= 2), overriding the earlier node-count result. For upgrade scenarios the detectOLMDeployment early-return keeps the node-count value intact.

Description

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

Copilot AI review requested due to automatic review settings April 28, 2026 15:49
@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 28, 2026

Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit eaa55aa
🔍 Latest deploy log https://app.netlify.com/projects/olmv1/deploys/69f115b3e3abcc0008c43e54
😎 Deploy Preview https://deploy-preview-2679--olmv1.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Follow-up fixes to the catalogd HA work to improve diagnosability of server start failures and to make the @CatalogdHA e2e gate reflect actual catalogd replica count (instead of a coarse multi-node heuristic).

Changes:

  • Wrap http.Server.Serve errors with the configured catalog listen address for better aggregated log messages.
  • Refine CatalogdHA e2e gating by reading the catalogd Deployment’s .spec.replicas once olmNamespace is known.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
test/e2e/steps/hooks.go Overrides CatalogdHA gate based on catalogd Deployment replica count.
internal/catalogd/serverutil/serverutil.go Improves error context when the catalog server fails to serve.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/e2e/steps/hooks.go Outdated
@tmshort tmshort force-pushed the fix-catalogd-ha-readiness-more branch from d13c3ac to 1384f29 Compare April 28, 2026 16:10
Comment thread test/e2e/steps/hooks.go Outdated

// Refine CatalogdHA based on actual catalogd replica count now that
// olmNamespace is known. The node-count check above can fire on any
// multi-node cluster (e.g. OpenShift standard e2e AWS) even when catalogd
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please remove the e.g.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 67.98%. Comparing base (cadb7a7) to head (eaa55aa).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
internal/catalogd/serverutil/serverutil.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2679   +/-   ##
=======================================
  Coverage   67.98%   67.98%           
=======================================
  Files         144      144           
  Lines       10595    10595           
=======================================
  Hits         7203     7203           
  Misses       2870     2870           
  Partials      522      522           
Flag Coverage Δ
e2e 37.44% <0.00%> (ø)
experimental-e2e 52.92% <0.00%> (ø)
unit 53.51% <0.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tmshort tmshort requested a review from grokspawn April 28, 2026 18:36
Two fixes on top of the catalogd HA work (operator-framework#2674):

1. serverutil: wrap http.Server.Serve error with the catalog listen address
   so the failure message is self-diagnosing in aggregated manager logs.

2. e2e: fix CatalogdHA gate to override node-count check with actual
   catalogd replica count.  The node-count heuristic fires on any multi-node
   cluster (e.g. OpenShift standard e2e on AWS), causing the @CatalogdHA
   scenario to run even when catalogd has only 1 replica, which makes the
   step "catalogd is ready to reconcile resources" fail with exit status 1.
   After olmNamespace is populated, query the catalogd deployment's
   spec.replicas and set the gate unconditionally to (replicas >= 2),
   overriding the earlier node-count result.  For upgrade scenarios the
   detectOLMDeployment early-return keeps the node-count value intact.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Todd Short <tshort@redhat.com>
@tmshort tmshort force-pushed the fix-catalogd-ha-readiness-more branch from 1384f29 to eaa55aa Compare April 28, 2026 20:16
Copilot AI review requested due to automatic review settings April 28, 2026 20:16
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

@pedjak pedjak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Apr 29, 2026
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 29, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pedjak, rashmigottipati

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 29, 2026
@tmshort
Copy link
Copy Markdown
Contributor Author

tmshort commented Apr 29, 2026

/override codecov/patch

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 29, 2026

@tmshort: Overrode contexts on behalf of tmshort: codecov/patch

Details

In response to this:

/override codecov/patch

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-bot openshift-merge-bot Bot merged commit b177be3 into operator-framework:main Apr 29, 2026
35 of 37 checks passed
@tmshort tmshort deleted the fix-catalogd-ha-readiness-more branch April 29, 2026 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants