Skip to content

Handle glance minor updates#1429

Closed
fmount wants to merge 1 commit into
openstack-k8s-operators:mainfrom
fmount:glance-updates
Closed

Handle glance minor updates#1429
fmount wants to merge 1 commit into
openstack-k8s-operators:mainfrom
fmount:glance-updates

Conversation

@fmount

@fmount fmount commented May 6, 2025

Copy link
Copy Markdown
Contributor

@fmount fmount requested review from dprince and stuggi May 6, 2025 13:53
@openshift-ci openshift-ci Bot requested review from frenzyfriday and slagle May 6, 2025 13:53
@openshift-ci

openshift-ci Bot commented May 6, 2025

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fmount
Once this PR has been reviewed and has the lgtm label, please assign stuggi for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment thread pkg/openstack/glance.go
// to -internal instead of the glance top level glanceType split
svcSelector = "tlGlanceAPI"
svcSelector = "tlGlanceAPI"
targetVersion = "v18.0.9"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about targetVersion: it might work for a downstream release but not for an upstream one!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add const https://github.com/openstack-k8s-operators/openstack-operator/blob/main/apis/core/v1beta1/openstackversion_types.go which reflects the mapping of FR releases to version (not sure if MR release would also be required)? with this we could just say here targetVersion < FR3? for upstream we also have those OPENSTACK_RELEASE_VERSION 0.2.0 (fr2 - https://github.com/openstack-k8s-operators/openstack-operator/blob/18.0-fr2/Makefile#L6) 0.3.0 (fr3 - https://github.com/openstack-k8s-operators/openstack-operator/blob/main/Makefile#L6). we'd only have to verify if this is properly handled in the update ci job.

@softwarefactory-project-zuul

Copy link
Copy Markdown

This change depends on a change that failed to merge.

Change openstack-k8s-operators/glance-operator#728 is needed.

@fmount

fmount commented May 6, 2025

Copy link
Copy Markdown
Contributor Author

@stuggi @dprince @abays I tried this locally and it does what we expect to do.
However there are a few assumptions that we might need to agree on first:

  1. I rely on semVer, it seems a good pattern to avoid reinventing the wheel. It also applies to the current downstream versioning (18.0.x-<>) so it results easy to compare without writing and maintaining our own library. The question is whether the delivery team follows this pattern or not, because it wouldn't make sense to write code that diverges with the assumptions made by the delivery team (/cc @yazug).
  2. Right now this code can't do the right thing upstream due to the fact that in glance.go I have a const defining the targetVersion I want to look at when I do the comparison. This value differs (a lot!) from upstream to downstream, and I'm not sure how (or from where) I can get a given (or previous) version without too many hacks in the code.

Overall giving the ability to compare two versions (deployed vs target) seems a good way to drive the deployment, and more functions that support algorithms against OpenStackVersion (e.g. sorting) can be added as needed.
Before we move forward my goal is to gather feedback on the approach.
The glance-operator patch seems doing the right thing based on the annotation that is passed to the top-level CR (and that is propagated to the underlying sub-resources), so this seems a good pattern.

@fmount fmount requested a review from abays May 6, 2025 14:03
@fmount fmount force-pushed the glance-updates branch from ba50016 to 52850f2 Compare May 6, 2025 14:05
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
@openshift-ci

openshift-ci Bot commented May 6, 2025

Copy link
Copy Markdown
Contributor

@fmount: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/precommit-check 52850f2 link true /test precommit-check
ci/prow/openstack-operator-build-deploy-kuttl 52850f2 link true /test openstack-operator-build-deploy-kuttl
ci/prow/openstack-operator-build-deploy-kuttl-4-18 52850f2 link true /test openstack-operator-build-deploy-kuttl-4-18

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Comment thread pkg/openstack/glance.go
//
// - wsgi=false keeps the httpd + proxypass deployment method.
// This is required when a minor update from a version < FR3 is performed
apiAnnotations["wsgi"] = GlanceWSGIAnnotation(version, targetVersion)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned on the glance pr, lets use a const on the glance api module to not have to hard code something here.

Comment thread pkg/openstack/glance.go
// to -internal instead of the glance top level glanceType split
svcSelector = "tlGlanceAPI"
svcSelector = "tlGlanceAPI"
targetVersion = "v18.0.9"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add const https://github.com/openstack-k8s-operators/openstack-operator/blob/main/apis/core/v1beta1/openstackversion_types.go which reflects the mapping of FR releases to version (not sure if MR release would also be required)? with this we could just say here targetVersion < FR3? for upstream we also have those OPENSTACK_RELEASE_VERSION 0.2.0 (fr2 - https://github.com/openstack-k8s-operators/openstack-operator/blob/18.0-fr2/Makefile#L6) 0.3.0 (fr3 - https://github.com/openstack-k8s-operators/openstack-operator/blob/main/Makefile#L6). we'd only have to verify if this is properly handled in the update ci job.

@compi-migui

Copy link
Copy Markdown
Contributor

Is there some way we can directly determine whether the glance image in the existing deployment supports WSGI, rather than using deployedVersion as a proxy for it? I worry about a few things with this approach:

  1. If the bugfix/FR sequence changes, the hardcoded version becomes invalid and we have to go back and change it, breaking Glance on update if we forget to do that. E.g. we end up having an extra bugfix release called 18.0.9 and FR3 becomes 18.0.10, on update the deployedVersion>=v18.0.9 check succeeds.
  2. It relies on operator and services changes being synchronized enough, which isn't guaranteed. E.g. the 18.0.9/FR3 glance image ends up not including mod_wsgi (addition postponed to FR4, change doesn't promote on time, etc.); on greenfield deployment the default wsgi mode is still used which breaks.
  3. It assumes that OpenStackVersion strictly follows semver.

1 and 2 I see as bigger problems as they impose new constraints on the program and can get very unwieldy as we add more of these version checks for other features. If we could come up with an approach to feature gates that actually checks for support (does the deployed image have the bits needed?) rather than a version check, we'd have a lot more flexibility and we'd avoid some of the pain points we ran into with FR1 and FR2.

3 isn't too big a deal -- I've checked and all OpenStackVersion values we've used in the past seem to meet what the semver package expects (except for the v prefix which you're already taking care of). We'd just need to add some validation to make sure we don't overstep that new constraint in the future.

@openshift-merge-robot

Copy link
Copy Markdown
Contributor

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@fmount

fmount commented May 8, 2025

Copy link
Copy Markdown
Contributor Author

Is there some way we can directly determine whether the glance image in the existing deployment supports WSGI, rather than using deployedVersion as a proxy for it? I worry about a few things with this approach:

Yes, the alternative is to put some logic in the Service operators and do some "image discovery", and take decisions to gracefully handle minor updates. I had a POC about that [1] and is based on a detect-version job that is run after DBSync. However, because openstack-operator has an OpenStackVersion object and it represents the "source of truth" for the service operators, we decided to:

  1. Implement a compatibility layer in the service operators (in the case of Glance it means we give it the ability to either deploy in the WSGI mode or in the ProxyPass one)

  2. Drive the deployment mode through annotations that can be passed to the service top-level CR: this mechanism seems generic enough to cover any potential use case from the openstack-operator point of view

  3. Rely on OpenStackVersion to decide whether the annotation should be set or not (other than which one should be set).

tl;dr, we need to put some logic in the OpenStackVersion area to compare OpenStack versions, and it seems reasonable enough other than distributing the logic across service operators.

1. If the bugfix/FR sequence changes, the hardcoded version becomes invalid and we have to go back and change it, breaking Glance on update if we forget to do that. E.g. we end up having an extra bugfix release called 18.0.9 and FR3 becomes 18.0.10, on update the `deployedVersion>=v18.0.9` check succeeds.

If we introduce this mechanism, maintaining the version mapping is our responsibility, and what you described just represents a bug that should be fixed in that context (I assume CI would catch the issue).
I think we can find a better mechanism to get the version mapping and load into the operator, rather than hardcoding it, we'll follow up about it.
But in general I think if we define a strict set of rules (e.g. follow semVer), we shouldn't break it and change the sequence unless something really wrong is happening! We always need to provide a +1 for minor and patches, and establish a pattern for hotfixes (probably based on a dedicated OLM channel when it's about operators, but we can still keep using the same semVer based approach for async and hotfixes).
What I'm -1 about is the fact that we might admit the possibility to "change the sequence": we can always decide to have FR3 on 18.0.9 instead of 18.0.8 (it's just a pure example), and this would just be an implementation detail, but the sequence pattern shouldn't be altered and it must be something predictable by the operator that implements am OpenstackVersion mechanism.

2. It relies on operator and services changes being synchronized enough, which isn't guaranteed. E.g. the 18.0.9/FR3 glance image ends up not including `mod_wsgi` (addition postponed to FR4, change doesn't promote on time, etc.); on greenfield deployment the default `wsgi` mode is still used which breaks.

It's not guaranteed, but this is the reason that the service operator provides a compatibility layer to make sure it works with both an old and a newer ContainerImage. Glance has a glance.openstack.org/wsgi annotation that drives the default behavior, so we can mark the new deployment method as experimental or tech preview until the image provides the required bits.
tcib provides images, and I already see the new image being available both upstream and downstream, even though we're not in a FR3 release yet.
I clearly need to pay attention at the synchronization between ContainerImage, service-operator and OpenStackVersion, but I need a predictable, in place framework that helps me with that, otherwise I can only build creative and error prone solutions.

3. It assumes that `OpenStackVersion` strictly follows `semver`.

1 and 2 I see as bigger problems as they impose new constraints on the program and can get very unwieldy as we add more of these version checks for other features. If we could come up with an approach to feature gates that actually checks for support (does the deployed image have the bits needed?) rather than a version check, we'd have a lot more flexibility and we'd avoid some of the pain points we ran into with FR1 and FR2.

3 isn't too big a deal -- I've checked and all OpenStackVersion values we've used in the past seem to meet what the semver package expects (except for the v prefix which you're already taking care of). We'd just need to add some validation to make sure we don't overstep that new constraint in the future.

+1, on this particular bit my only problem is the difference between upstream and downstream: it would be great to define a pattern to keep semVer aligned.
@compi-migui sorry for the long writing, I think we need to think more about this but thank you for the early feedback.

[1] openstack-k8s-operators/glance-operator@main...fmount:glance-operator:legacy#diff-0ee9c4aa5cfb7fc0a4f7bca6de203984f4fc16f21e42a431684f7c3e05249c52

@fmount

fmount commented May 9, 2025

Copy link
Copy Markdown
Contributor Author

After some discussion with the core team, I think the new direction is to still rely on OpenStackVersion, but to provide a new field in the .Status that determines how glance is deployed. This way we decouple the need of deploying in a mode or the other from the version numbering, and we can use this parameter to add or update the annotation to the Glance top-level CR.
I'm basically holding this patch until we have a way to consume this parameter from openstackVersion (@dprince will help with that part).
/cc @compi-migui

@openshift-ci openshift-ci Bot requested a review from compi-migui May 9, 2025 10:31
@fmount fmount closed this May 19, 2025
@fmount

fmount commented May 19, 2025

Copy link
Copy Markdown
Contributor Author

Closed in favor of #1439

@gibizer

gibizer commented May 30, 2025

Copy link
Copy Markdown
Contributor

I discussed a lot with @fmount both on the related openstack-k8s-operators/dev-docs#139 and on Slack. I strongly believe that doing proper semver versioning on OpenStackVersion and using that as feature gate is the best approach across the investigated alternatives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants