[golang] bump to golang 1.24#1007

Merged
openshift-merge-bot[bot] merged 3 commits into
openstack-k8s-operators:mainfrom
stuggi:golang_1.24
Sep 25, 2025

@stuggi

@stuggi stuggi commented Aug 20, 2025

Contributor
  • bump in go.mod (base and api)
  • bump go-toolset in Dockerfile
  • bump golang version and custom_image in github jobs ('.github/workflows')
  • Bump the golangci-lint version in the .pre-commit-config.yaml to v2.4.0
  • Bump build_root_image in .ci-operator.yaml to ci-build-root-golang-1.24-sdk-1.31 (if set)

To test on an existing env, or after this has landed:

  • update golang to 1.24
  • delete current go.work* files
  • init go work files: go work init
  • Delete tools in bin/ subdir of the repo

Depends-On: openstack-k8s-operators/install_yamls#1082
Depends-On: openstack-k8s-operators/openstack-operator#1567

Jira: OSPRH-12935

@softwarefactory-project-zuul

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/nova-operator for 1007,0b41156950b25c71b06e7d373a6ee83a033d6303

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f9552e8896a94a25ad4b297d5b593087

✔️ openstack-meta-content-provider SUCCESS in 2h 58m 50s
nova-operator-kuttl RETRY_LIMIT in 26m 35s
✔️ nova-operator-tempest-multinode SUCCESS in 2h 04m 01s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 33m 50s

@mrkisaolamb mrkisaolamb left a comment

Contributor

Thanks @stuggi, looks good, but based on the kuttl test error it looks like we need to depend on some ci-framework changes to pass the tests.

@gibizer gibizer left a comment

Contributor

@mrkisaolamb @stuggi have you run envtest repeatedly on your local machines to check that these changes did not uncover new instabilities?

Comment thread test/functional/base_test.go Outdated
SecretName = "external-secret"
ContainerImage = "test://nova"
- timeout = 25 * time.Second
+ timeout = 30 * time.Second

Contributor

Hm, that feels like there were some failing tests before... :) Let's hope it is just a real slowdown and not a race condition somewhere.

Contributor

Is it related to the change below about the keystone auth url reconfiguration test?

Contributor

I was taking a look at that, and it looks like a similar issue to the one we had in the past when we were unblocking the job: #834. Unfortunately, I don't think we ever got back to the root cause of why we failed there due to timeouts. If I remember correctly (though I don't have proof), we ended up with a strange finalizer error because we got stuck in the Eventually statement waiting for the correct state of the generations.

Contributor Author

Is it related to the change below about the keystone auth url reconfiguration test?

No, that was not related to the keystone reconfigure test.

The test which was failing for me was the test to move the finalizer:
should move the finalizer to a new MariaDBAccount when create is complete

Contributor Author

I think I have identified the issue behind the failing MariaDBAccount switch test I see in this work. The test can be run on its own with:

$ make test GINKGO_ARGS="--focus='should move the finalizer to a new MariaDBAccount when create is complete'"

From the test logs we see the following:

  • the test updates the resource to use some-new-account and the resources immediately get simulated as successful:
  2025-09-16T16:13:52.839+0200  INFO    ---Test---      Simulated statefulset success   {"on": {"name":"3c50a756-b525-46bd-b0ee-e-api","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}}
  2025-09-16T16:13:52.842+0200  INFO    ---Test---      Simulated Job success   {"on": {"name":"3c50a756-b525-46bd-b0ee-e-cell0-conductor-db-sync","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}}
  2025-09-16T16:13:52.843+0200  INFO    Controllers.Nova        Applied new databasehostname hostname-for-nova-api.387ef225-2396-43a7-9aa2-c6571f2db3f2.svc to MariaDBDatabase nova-api {"controller": "nova", "controllerGroup": "nova.openstack.org", "controllerKind": "Nova", "Nova": {"name":"3c50a756-b525-46bd-b0ee-e","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}, "namespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "name": "3c50a756-b525-46bd-b0ee-e", "reconcileID": "43f3b673-6c0f-40d5-850c-985b1b47a267"}
  2025-09-16T16:13:52.843+0200  INFO    Controllers.Nova        Successfully ensured MariaDBAccount some-new-account exists; database username is nova_cell0_c26c       {"controller": "nova", "controllerGroup": "nova.openstack.org", "controllerKind": "Nova", "Nova": {"name":"3c50a756-b525-46bd-b0ee-e","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}, "namespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "name": "3c50a756-b525-46bd-b0ee-e", "reconcileID": "43f3b673-6c0f-40d5-850c-985b1b47a267", "ObjectType": "*v1beta1.MariaDBAccount", "ObjectNamespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "ObjectName": "some-new-account"}
  2025-09-16T16:13:52.843+0200  INFO    Controllers.Nova        Applied new databasehostname hostname-for-nova-cell0.387ef225-2396-43a7-9aa2-c6571f2db3f2.svc to MariaDBDatabase nova-cell0     {"controller": "nova", "controllerGroup": "nova.openstack.org", "controllerKind": "Nova", "Nova": {"name":"3c50a756-b525-46bd-b0ee-e","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}, "namespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "name": "3c50a756-b525-46bd-b0ee-e", "reconcileID": "43f3b673-6c0f-40d5-850c-985b1b47a267"}
  2025-09-16T16:13:52.845+0200  INFO    ---Test---      Simulated statefulset success   {"on": {"name":"3c50a756-b525-46bd-b0ee-e-cell0-conductor","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}}
  2025-09-16T16:13:52.846+0200  INFO    novacell-resource       default {"name": "3c50a756-b525-46bd-b0ee-e-cell0"}
  2025-09-16T16:13:52.847+0200  INFO    novacell-resource       validate update {"name": "3c50a756-b525-46bd-b0ee-e-cell0"}
  2025-09-16T16:13:52.847+0200  INFO    ---Test---      Simulated Job success   {"on": {"name":"3c50a756-b525-46bd-b0ee-e-cell0-cell-mapping","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}}
  2025-09-16T16:13:52.850+0200  INFO    ---Test---      Service should move to run fully off MariaDBAccount 387ef225-2396-43a7-9aa2-c6571f2db3f2/some-new-account and remove finalizer from 387ef225-2396-43a7-9aa2-c6571f2db3f2/some-old-account
  • but only afterwards do we see the actual NovaConductor CR updated to use the new account and being in generation 2:
  2025-09-16T16:13:52.856+0200  INFO    novaconductor-resource  validate update {"name": "3c50a756-b525-46bd-b0ee-e-cell0-conductor"}                                                                                
  2025-09-16T16:13:52.859+0200  INFO    novaconductor-resource  validate update {"diff": "  &v1beta1.NovaConductor{\n  \tTypeMeta: {Kind: \"NovaConductor\", APIVersion: \"nova.openstack.org/v1beta1\"},\n  \tObjectMeta: v1.ObjectMeta{\n  \t\t... // 4 identical fields\n  \t\tUID:               \"0c1149ab-f0d0-4bcc-a7c8-5329730c9dcb\",\n  \t\tResourceVersion:   \"342\",\n- \t\tGeneration:        1,\n+ \t\tGeneration:        2,\n  \t\tCreationTimestamp: {Time: s\"2025-09-16 16:13:51 +0200 CEST\"},\n  \t\tDeletionTimestamp: nil,\n  \t\t... // 3 identical fields\n  \t\tOwnerReferences: {{APIVersion: \"nova.openstack.org/v1beta1\", Kind: \"NovaCell\", Name: \"3c50a756-b525-46bd-b0ee-e-cell0\", UID: \"fb853005-2a18-4ee9-b09f-128c4ce4903f\", ...}},\n  \t\tFinalizers:      {\"openstack.org/novaconductor\"},\n  \t\tManagedFields: []v1.ManagedFieldsEntry{\n- \t\t\t{\n- \t\t\t\tManager:    \"functional.test\",\n- \t\t\t\tOperation:  \"Update\",\n- \t\t\t\tAPIVersion: \"nova.openstack.org/v1beta1\",\n- \t\t\t\tTime:       s\"2025-09-16 16:13:51 +0200 CEST\",\n- \t\t\t\tFieldsType: \"FieldsV1\",\n- \t\t\t\tFieldsV1:   s`{\"f:metadata\":{\"f:finalizers\":{\".\":{},\"v:\\\"openstack.org/novaconductor\\\"\":{}},\"f:ownerReferences\":{\".\":{},\"k:{\\\"uid\\\":\\\"fb853005`...,\n- \t\t\t},\n  \t\t\t{Manager: \"functional.test\", Operation: \"Update\", APIVersion: \"nova.openstack.org/v1beta1\", Time: s\"2025-09-16 16:13:51 +0200 CEST\", ...},\n+ \t\t\t{\n+ \t\t\t\tManager:    \"functional.test\",\n+ \t\t\t\tOperation:  \"Update\",\n+ \t\t\t\tAPIVersion: \"nova.openstack.org/v1beta1\",\n+ \t\t\t\tTime:       s\"2025-09-16 16:13:52 +0200 CEST\",\n+ \t\t\t\tFieldsType: \"FieldsV1\",\n+ \t\t\t\tFieldsV1:   s`{\"f:metadata\":{\"f:finalizers\":{\".\":{},\"v:\\\"openstack.org/novaconductor\\\"\":{}},\"f:ownerReferences\":{\".\":{},\"k:{\\\"uid\\\":\\\"fb853005`...,\n+ \t\t\t},\n  \t\t},\n  \t},\n  \tSpec: v1beta1.NovaConductorSpec{\n  \t\t... // 4 identical fields\n  \t\tAPIDatabaseAccount:   \"test-nova-api-account\",\n  \t\tAPIDatabaseHostname:  \"hostname-for-nova-api.387ef225-2396-43a7-9aa2-c6571f2db3f2.svc\",\n- \t\tCellDatabaseAccount:  \"some-old-account\",\n+ \t\tCellDatabaseAccount:  \"some-new-account\",\n  \t\tCellDatabaseHostname: \"hostname-for-nova-cell0.387ef225-2396-43a7-9aa2-c6571f2db3f2.svc\",\n  \t\tPreserveJobs:         false,\n  \t\t... // 5 identical fields\n  \t},\n  \tStatus: {Hash: {\"dbsync\": \"n685h67dhdfh8dhbbh688h66fhcfh5dchf4h5fbh5c9h697h9h5cchf8h65fh5c6\"..., \"input\": \"n59dh669h88h5fhf8h59dh646h548hd8h5c8hc9h5bhcbh697h564h9ch7ch65bh\"...}, Conditions: {{Type: \"Ready\", Status: \"True\", LastTransitionTime: {Time: s\"2025-09-16 16:13:51 +0200 CEST\"}, Reason: \"Ready\", ...}, {Type: \"CronJobReady\", Status: \"True\", LastTransitionTime: {Time: s\"2025-09-16 16:13:51 +0200 CEST\"}, Reason: \"Ready\", ...}, {Type: \"DBSyncReady\", Status: \"True\", LastTransitionTime: {Time: s\"2025-09-16 16:13:51 +0200 CEST\"}, Reason: \"Ready\", ...}, {Type: \"DeploymentReady\", Status: \"True\", LastTransitionTime: {Time: s\"2025-09-16 16:13:51 +0200 CEST\"}, Reason: \"Ready\", ...}, ...}, ReadyCount: 1, ObservedGeneration: 1, ...},\n  }\n"}
  2025-09-16T16:13:52.861+0200  INFO    Controllers.NovaConductor       Reconciling     {"controller": "novaconductor", "controllerGroup": "nova.openstack.org", "controllerKind": "NovaConductor", "NovaConductor": {"name":"3c50a756-b525-46bd-b0ee-e-cell0-conductor","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}, "namespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "name": "3c50a756-b525-46bd-b0ee-e-cell0-conductor", "reconcileID": "62a16745-2e21-4709-a1ec-5af119c39973"}
  2025-09-16T16:13:52.861+0200  INFO    Controllers.NovaConductor       FOO: Generation 2 - ObservedGeneration 2        {"controller": "novaconductor", "controllerGroup": "nova.openstack.org", "controllerKind": "NovaConductor", "NovaConductor": {"name":"3c50a756-b525-46bd-b0ee-e-cell0-conductor","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}, "namespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "name": "3c50a756-b525-46bd-b0ee-e-cell0-conductor", "reconcileID": "62a16745-2e21-4709-a1ec-5af119c39973"}
  • as a result, the NovaConductor deployment does not become ready:
  2025-09-16T16:13:52.861+0200  INFO    Controllers.NovaCell    NovaConductor updated.  {"controller": "novacell", "controllerGroup": "nova.openstack.org", "controllerKind": "NovaCell", "NovaCell": {"name":"3c50a756-b525-46bd-b0ee-e-cell0","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}, "namespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "name": "3c50a756-b525-46bd-b0ee-e-cell0", "reconcileID": "36488628-7c45-47c5-b78b-8e21d23b517d"}
  2025-09-16T16:13:52.863+0200  INFO    Controllers.NovaConductor       Secret 3c50a756-b525-46bd-b0ee-e-cell0-conductor-config-data successfully reconciled - operation: updated       {"controller": "novaconductor", "controllerGroup": "nova.openstack.org", "controllerKind": "NovaConductor", "NovaConductor": {"name":"3c50a756-b525-46bd-b0ee-e-cell0-conductor","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}, "namespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "name": "3c50a756-b525-46bd-b0ee-e-cell0-conductor", "reconcileID": "62a16745-2e21-4709-a1ec-5af119c39973"}
  2025-09-16T16:13:52.867+0200  INFO    Controllers.NovaConductor       StatefulSet 3c50a756-b525-46bd-b0ee-e-cell0-conductor - updated {"controller": "novaconductor", "controllerGroup": "nova.openstack.org", "controllerKind": "NovaConductor", "NovaConductor": {"name":"3c50a756-b525-46bd-b0ee-e-cell0-conductor","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}, "namespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "name": "3c50a756-b525-46bd-b0ee-e-cell0-conductor", "reconcileID": "62a16745-2e21-4709-a1ec-5af119c39973"}
  2025-09-16T16:13:52.867+0200  INFO    Controllers.NovaConductor       Deployment is not ready {"controller": "novaconductor", "controllerGroup": "nova.openstack.org", "controllerKind": "NovaConductor", "NovaConductor": {"name":"3c50a756-b525-46bd-b0ee-e-cell0-conductor","namespace":"387ef225-2396-43a7-9aa2-c6571f2db3f2"}, "namespace": "387ef225-2396-43a7-9aa2-c6571f2db3f2", "name": "3c50a756-b525-46bd-b0ee-e-cell0-conductor", "reconcileID": "62a16745-2e21-4709-a1ec-5af119c39973", "Status": {"observedGeneration":1,"replicas":1,"readyReplicas":1,"updatedReplicas":1,"availableReplicas":1}}
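
A minimal Go sketch of the kind of readiness check these logs point at, assuming the controller only treats a workload as ready when its status was observed for the current generation. The `workloadStatus` type and the `isReady` helper are hypothetical names for illustration, not the operator's actual code, and the generation value 2 for the updated StatefulSet is an assumption based on the update logged just before:

```go
package main

import "fmt"

// workloadStatus is a hypothetical stand-in for the status fields
// printed in the "Deployment is not ready" log line.
type workloadStatus struct {
	ObservedGeneration int64
	Replicas           int32
	ReadyReplicas      int32
}

// isReady reports true only when the status was computed for the
// current generation and all desired replicas are ready.
func isReady(generation int64, desired int32, st workloadStatus) bool {
	return st.ObservedGeneration == generation && st.ReadyReplicas == desired
}

func main() {
	// Status simulated too early: it still refers to generation 1.
	stale := workloadStatus{ObservedGeneration: 1, Replicas: 1, ReadyReplicas: 1}
	fmt.Println(isReady(2, 1, stale)) // prints false

	// Status simulated after the update was applied.
	fresh := workloadStatus{ObservedGeneration: 2, Replicas: 1, ReadyReplicas: 1}
	fmt.Println(isReady(2, 1, fresh)) // prints true
}
```

With a status like the one in the last log line (observedGeneration 1), such a check fails even though all replicas report ready, which would match the "Deployment is not ready" message.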

Comment thread test/functional/nova_reconfiguration_test.go
// Check if cell0 conductor has been updated with the new KeystoneAuthURL
conductor := GetNovaConductor(cell0.ConductorName)
// The KeystoneAuthURL should match the new endpoint we set
g.Expect(conductor.Spec.KeystoneAuthURL).To(Equal(newInternalEndpoint))

Contributor

We already check that in the loop below (L1083), so I'm wondering why we need to check it separately beforehand. Is there a race?

Contributor Author

Yes, the issue is that we have to wait for the resources to be updated before we simulate them as ready. Otherwise we simulate the old object version and the validation below fails.

@mrkisaolamb

Contributor

@mrkisaolamb @stuggi have you run envtest repeatedly on your local machines to check that these changes did not uncover new instabilities?

Yes, I was running it without increasing the number of concurrent executors. But I remember that your local machine was really good at producing uncommon errors :)

@gibizer

gibizer commented Sep 3, 2025

Contributor

@mrkisaolamb @stuggi have you run envtest repeatedly on your local machines to check that these changes did not uncover new instabilities?

Yes, I was running it without increasing the number of concurrent executors. But I remember that your local machine was really good at producing uncommon errors :)

on my machine even running the current main in a clean repo fails 3 times out of 5 runs with:

  << Timeline

  [FAILED] Timed out after 250.000s.
  The function passed to Eventually failed at /home/gibi/upstream/git/openstack-k8s-operators/nova-operator/test/functional/nova_novncproxy_test.go:1475 with:
  Expected
      <[]string | len:1, cap:4>: [
          "openstack.org/novanovncproxy-77f80725-29aa-4eb5-91fc-d-cell1-novncproxy",
      ]
  not to contain element matching
      <string>: openstack.org/novanovncproxy-77f80725-29aa-4eb5-91fc-d-cell1-novncproxy
  In [It] at: /home/gibi/upstream/git/openstack-k8s-operators/nova-operator/test/functional/nova_novncproxy_test.go:1477 @ 09/03/25 15:46:39.737

  Full Stack Trace
    github.com/openstack-k8s-operators/nova-operator/test/functional.glob..func15.6.3()
    	/home/gibi/upstream/git/openstack-k8s-operators/nova-operator/test/functional/nova_novncproxy_test.go:1477 +0x4fe
...

Summarizing 1 Failure:
  [FAIL] NovaNoVNCProxy controller when NovaNoVNCProxy is created with topology [It] updates lastAppliedTopology in NovaNoVNCProxy .Status
  /home/gibi/upstream/git/openstack-k8s-operators/nova-operator/test/functional/nova_novncproxy_test.go:1477

Ran 390 of 390 Specs in 297.822 seconds
FAIL! -- 389 Passed | 1 Failed | 0 Pending | 0 Skipped

Note the 250 sec wait time before the test fails. That is a lot just to get a failure. When everything passes, it takes around 85 sec to run the whole test set. This is also a reason why I'm not happy with bumping the timeout in this PR: this 250 sec would suddenly become 300.

Bottom line: I'm not in a position to check how the new PR behaves, as the baseline is already broken for me.

I will file a Jira ticket to fix the envtest on main as it seems to be broken.

@gibizer

gibizer commented Sep 3, 2025

Contributor

I will file a Jira ticket to fix the envtest on main as it seems to be broken.

Filed https://issues.redhat.com/browse/OSPRH-19625

@stuggi

stuggi commented Sep 15, 2025

Contributor Author

Thanks @stuggi, looks good, but based on the kuttl test error it looks like we need to depend on some ci-framework changes to pass the tests.

We first have to land the openstack-operator PR to make the kuttl tests work.

@stuggi

stuggi commented Sep 15, 2025

Contributor Author

Yes, it makes sense if we can address that separately. I just bumped it to make the test pass in this PR; I was not looking into whether there is a general issue.

@stuggi

stuggi commented Sep 15, 2025

Contributor Author

But would it mean we first have to fix the test issue before we can do the golang bump?

@softwarefactory-project-zuul

This change depends on a change that failed to merge.

Change openstack-k8s-operators/openstack-operator#1567 is needed.

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/8cc24bb8065d48de85178c2f896340d1

✔️ openstack-meta-content-provider SUCCESS in 3h 15m 35s
nova-operator-kuttl RETRY_LIMIT in 26m 46s
✔️ nova-operator-tempest-multinode SUCCESS in 2h 14m 49s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 49m 15s

@SeanMooney

Contributor

Meta comment: we do not allow squash merges, so the fix commit should be merged into the prior commit that needs the fix to pass.

It would be good to fix that in the next revision if you don't mind.

@stuggi

stuggi commented Sep 17, 2025

Contributor Author

Meta comment: we do not allow squash merges, so the fix commit should be merged into the prior commit that needs the fix to pass.

It would be good to fix that in the next revision if you don't mind.

@SeanMooney I am not sure I fully understood what you mean. Do you mean that I should merge the two commits, [test] fix keystone endpoint reconfigure test and [test] fix mariadb switch account test, into the prior commit where I saw those tests start to fail? I made separate commits as I thought they might be easier to review, but I can squash them.

@SeanMooney

SeanMooney commented Sep 17, 2025

Contributor

so if we look at
https://github.com/openstack-k8s-operators/nova-operator/pull/1007/commits
we have 6 commits

  • bump to golang 1.24
  • fix golangci reported issues
  • Bump dependencies for OpenShift 4.18 compatibility
  • [test] fix keystone endpoint reconfigure test
  • [test] fix mariadb switch account test
  • bump lib-common

The first 2 should be merged together, since the golang issues were introduced by bumping the golangci-lint version in the prior patch.

If the two [test] patches are required because of "Bump dependencies for OpenShift 4.18 compatibility", they should really be part of that.

If not, they should likely be in a separate PR, but I'm less concerned by that.
If those are just latent bugs that happen more often with this update, it's fine to have them in the PR.

What we should avoid is having a commit where tests/lints fail.

Each commit should ideally be valid on its own.

@gibizer

gibizer commented Sep 17, 2025

Contributor

I still see tests failing intermittently after the bump as well. But as I saw this before on main, it is not related to the bump.
The timeout increase was removed from the patch, which resolved my immediate complaint about this change. And there is a separate Jira to fix the test, so I'm OK in this regard.

The rest of the patches were reviewed by others and I trust them. So if they are OK then I'm OK too.

Thanks Martin for taking care of the 1.24 bump.

Summarizing 1 Failure:
  [FAIL] Nova reconfiguration when Nova CR instance is created with topology that is later removed [It] updates topologyRef
  /op/test/functional/nova_reconfiguration_test.go:285

Ran 390 of 390 Specs in 101.075 seconds
FAIL! -- 389 Passed | 1 Failed | 0 Pending | 0 Skipped

@stuggi

stuggi commented Sep 17, 2025

Contributor Author

OK, I'll squash those as you explained. I thought it was easier to review with the split.

@stuggi

stuggi commented Sep 17, 2025

Contributor Author

I haven't seen a failure with the topology test, so it seems to be unrelated and is being worked on in the Jira you filed.

@stuggi

stuggi commented Sep 17, 2025

Contributor Author

done

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/59080ca1fd4947e9b3975cfd2c925b6d

✔️ openstack-meta-content-provider SUCCESS in 2h 48m 38s
nova-operator-kuttl RETRY_LIMIT in 25m 43s
✔️ nova-operator-tempest-multinode SUCCESS in 2h 25m 36s
nova-operator-tempest-multinode-ceph FAILURE in 36m 04s

* bump in go.mod (base and api)
* bump go-toolset in Dockerfile
* bump in github jobs ('.github/workflows')
* Bump the golangci-lint version in the .pre-commit-config.yaml to v2.4.0
* Bump build_root_image in .ci-operator.yaml to ci-build-root-golang-1.24-sdk-1.31

Also fixes golangci-lint reported issues for new v2.4.0 version.

Jira: OSPRH-12935
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
Co-authored-by: Claude (Anthropic) <claude@anthropic.com>

Update controller-runtime, Kubernetes dependencies, and testing tools
to support OpenShift Container Platform 4.18 (Kubernetes 1.31).

Changes:
- controller-runtime: v0.17.6 → v0.19.7
- Kubernetes core dependencies: v0.29.15 → v0.31.12
  * k8s.io/api: v0.31.12
  * k8s.io/apimachinery: v0.31.12
  * k8s.io/client-go: v0.31.12
  * k8s.io/apiextensions-apiserver: v0.31.12
- k8s.io/utils: v0.0.0-20240711033017 → v0.0.0-20250820121507
- controller-gen: v0.14.0 → v0.18.0
- envtest: 1.29 → 1.31, setup-envtest@latest

Drops the Required kubebuilder tag from inlined struct types, since with
controller-gen v0.18 it results in an empty string being added to the
required list of the CRD.

Also:
* [test] fix keystone endpoint reconfigure test

There can be a race in the test when the keystoneapi endpoint
changes but we do not wait for the nova controller to start
reconciling for it. If we then already simulate the top services
and cell services as ready, the test will fail when the controller
starts reconciling the keystone endpoint change.
This change waits for the controller to start reconciling before
simulating the top services and cells as ready.

* [test] fix mariadb switch account test

In SwitchToNewAccount the resources get simulated to be ready
to allow the controller to proceed. At least for NovaConductor,
the resource has been seen to still be on the generation from
before the account switch when this happens.

This updates the test to wait for the cell0 NovaConductor to be
updated to the new generation before simulating the services as ready.

Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/746add81a3c54670a5c21570b4d42cd4

✔️ openstack-meta-content-provider SUCCESS in 42m 28s
nova-operator-kuttl RETRY_LIMIT in 2m 48s
nova-operator-tempest-multinode FAILURE in 18m 00s
nova-operator-tempest-multinode-ceph FAILURE in 16m 33s

@fmount

fmount commented Sep 22, 2025

Contributor

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/1aaa3a1863c44be9a832f04cf4ef625a

✔️ openstack-meta-content-provider SUCCESS in 3h 19m 18s
✔️ nova-operator-kuttl SUCCESS in 39m 36s
nova-operator-tempest-multinode FAILURE in 18m 43s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 36m 38s

@stuggi

stuggi commented Sep 23, 2025

Contributor Author

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f8f67d5634b9440d9367b6e0f03bfe3c

✔️ openstack-meta-content-provider SUCCESS in 3h 04m 10s
✔️ nova-operator-kuttl SUCCESS in 36m 50s
nova-operator-tempest-multinode FAILURE in 20m 30s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 44m 18s

@stuggi

stuggi commented Sep 23, 2025

Contributor Author

multi node job fails early connecting to the ocp env

2025-09-23 06:00:41,033 p=28360 u=zuul n=ansible | TASK [ci_local_storage : Create the cifmw_cls_namespace namespace" kubeconfig={{ cifmw_openshift_kubeconfig }}, api_key={{ cifmw_openshift_token | default(omit)}}, context={{ cifmw_openshift_context | default(omit) }}, name={{ cifmw_cls_namespace }}, kind=Namespace, state=present] ***
2025-09-23 06:00:41,033 p=28360 u=zuul n=ansible | Tuesday 23 September 2025  06:00:41 +0000 (0:00:00.196)       0:02:14.687 ***** 
2025-09-23 06:00:41,726 p=28360 u=zuul n=ansible | An exception occurred during task execution. To see the full traceback, use -vvv. The error was: TypeError: Value of unknown type: <class 'urllib3.exceptions.NewConnectionError'>, <urllib3.connection.HTTPSConnection object at 0x7f22f8caf880>: Failed to establish a new connection: [Errno 111] Connection refused

@stuggi

stuggi commented Sep 23, 2025

Contributor Author

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/841d30ff545444f599e80112a3477813

✔️ openstack-meta-content-provider SUCCESS in 3h 12m 43s
✔️ nova-operator-kuttl SUCCESS in 38m 55s
nova-operator-tempest-multinode FAILURE in 20m 36s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 54m 12s

@mrkisaolamb mrkisaolamb left a comment

Contributor

lgtm

@openshift-ci

openshift-ci Bot commented Sep 24, 2025

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprince, mrkisaolamb, stuggi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [dprince,mrkisaolamb,stuggi]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mrkisaolamb

Contributor

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/2af5f99865b8471584c00585cdb94bce

✔️ openstack-meta-content-provider SUCCESS in 3h 50m 13s
✔️ nova-operator-kuttl SUCCESS in 40m 27s
✔️ nova-operator-tempest-multinode SUCCESS in 2h 19m 05s
nova-operator-tempest-multinode-ceph TIMED_OUT in 3h 20m 08s

@stuggi

stuggi commented Sep 24, 2025

Contributor Author

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/565cd067747a45469d017168453202d1

✔️ openstack-meta-content-provider SUCCESS in 2h 35m 15s
✔️ nova-operator-kuttl SUCCESS in 40m 01s
✔️ nova-operator-tempest-multinode SUCCESS in 2h 18m 25s
nova-operator-tempest-multinode-ceph FAILURE in 21m 37s

@mrkisaolamb

Contributor

recheck nova-operator-tempest-multinode-ceph

@openshift-merge-bot openshift-merge-bot Bot merged commit a6113c8 into openstack-k8s-operators:main Sep 25, 2025
7 checks passed

7 participants