fix based on the flaky finalizers test - finalizers ordering and init #1460

Merged: egegunes merged 3 commits into main from fix-flaky-finalizers-test on Feb 26, 2026

gkech (Contributor) commented Feb 25, 2026

CHANGE DESCRIPTION

Problem:

The finalizers test has been failing randomly for quite some time:

[Screenshot of the failing test, 2026-02-25 11:05:19 AM]

Cause:
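runFinalizers iterates over a map, so the finalizers run in a random order. When stop-watchers runs before delete-ssl, ToCrunchy dereferences a nil *bool extension field and panics. The two orderings are spelled out under Solution.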

Solution:

Scenario that panics and happens ~50% of the time:

  1. stop-watchers is iterated first > stopExternalWatchers runs (no ToCrunchy call) > Patch removes the finalizer > cr.Spec.Extensions.BuiltIn.PGVector is reset to nil
  2. delete-ssl is iterated next > deleteTLSSecrets > ToCrunchy dereferences *cr.Spec.Extensions.BuiltIn.PGVector, which is now a nil pointer > panic

Scenario that doesn't panic:

  1. delete-ssl is iterated first > ToCrunchy runs with PGVector=&false (still valid from Default()) > no panic
  2. stop-watchers is iterated next > no ToCrunchy call > no panic
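
The two orderings above can be reproduced in isolation. The following is a minimal sketch, not the operator's code: the `cluster` type and the `stopWatchers`, `deleteSSL`, and `run` functions are hypothetical stand-ins that only mimic the pointer reset and the unguarded dereference described in the scenarios.

```go
package main

import "fmt"

// cluster is a hypothetical stand-in for the custom resource held in memory.
type cluster struct {
	PGVector *bool // optional flag, nil when unset
	lastSeen bool  // records the value read by deleteSSL
}

// stopWatchers mimics a finalizer whose Patch refreshes the object and
// thereby drops the in-memory default (the pointer becomes nil).
func stopWatchers(cr *cluster) { cr.PGVector = nil }

// deleteSSL mimics a finalizer that dereferences the pointer without a nil check.
func deleteSSL(cr *cluster) { cr.lastSeen = *cr.PGVector }

// run executes the finalizers in the given order and reports whether one panicked.
func run(order ...func(*cluster)) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	defaulted := false
	cr := &cluster{PGVector: &defaulted} // as if Default() had set PGVector to &false
	for _, f := range order {
		f(cr)
	}
	return false
}

func main() {
	fmt.Println("stop-watchers then delete-ssl panics:", run(stopWatchers, deleteSSL))
	fmt.Println("delete-ssl then stop-watchers panics:", run(deleteSSL, stopWatchers))
}
```

Only the first ordering reports a panic, which matches the roughly 50% failure rate when the order is chosen at random between two finalizers.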

Adding nil guards in ToCrunchy prevents the panic when the *bool extension fields are nil, which can happen when cr is refreshed from the server after a Patch resets the in-memory defaults. The crunchy cluster's extension fields are plain bool, so unset *bool pointers correctly leave them as false. In addition, runFinalizers iterates over a map, which has a random iteration order in Go, so this change makes that order deterministic.

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported PG version?
  • Does the change support oldest and newest supported Kubernetes version?

@gkech changed the title from "fix flaky finalizers test" to "fix based on the flaky finalizers test - finalizers ordering and init" on Feb 25, 2026
gkech (Contributor, Author) commented Feb 25, 2026

small test:

package main

import "fmt"

func main() {
	values := map[string]string{
		"key1": "value1",
		"key2": "value2",
		"key3": "value3",
		"key4": "value4",
		"key5": "value5",
		"key6": "value6",
		"key7": "value7",
		"key8": "value8",
	}

	for key, value := range values {
		fmt.Println(key, value)
	}
}

Prints (one possible order; it can differ between runs):

key3 value3
key4 value4
key5 value5
key6 value6
key7 value7
key8 value8
key1 value1
key2 value2
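
One common way to get a stable order is to collect the keys, sort them, and iterate over the sorted slice. The sketch below shows that technique on a smaller map; it is an assumption about how ordering could be fixed, not necessarily what this PR does in runFinalizers, and `sortedKeys` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the map's keys in ascending order,
// so callers can iterate in the same order on every run.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	values := map[string]string{
		"key3": "value3",
		"key1": "value1",
		"key2": "value2",
	}

	// Iterate over the sorted keys instead of ranging over the map directly.
	for _, k := range sortedKeys(values) {
		fmt.Println(k, values[k])
	}
}
```

An alternative with the same effect is to keep the entries in a slice from the start, which also allows an explicit, non-alphabetical order.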

@gkech gkech marked this pull request as ready for review February 25, 2026 09:24
JNKPercona (Collaborator) commented

| Test Name | Result | Time |
|---|---|---|
| backup-enable-disable | passed | 00:10:07 |
| builtin-extensions | passed | 00:05:36 |
| cert-manager-tls | passed | 00:06:04 |
| custom-envs | passed | 00:18:24 |
| custom-extensions | passed | 00:13:14 |
| custom-tls | passed | 00:05:20 |
| database-init-sql | passed | 00:02:38 |
| demand-backup | passed | 00:24:03 |
| demand-backup-offline-snapshot | passed | 00:12:43 |
| dynamic-configuration | passed | 00:03:26 |
| finalizers | passed | 00:03:54 |
| init-deploy | passed | 00:03:05 |
| huge-pages | passed | 00:02:45 |
| monitoring | passed | 00:07:06 |
| monitoring-pmm3 | passed | 00:08:05 |
| one-pod | passed | 00:05:45 |
| operator-self-healing | passed | 00:10:36 |
| pitr | passed | 00:14:08 |
| scaling | passed | 00:04:48 |
| scheduled-backup | passed | 00:24:05 |
| self-healing | passed | 00:09:02 |
| sidecars | passed | 00:02:42 |
| standby-pgbackrest | passed | 00:11:57 |
| standby-streaming | passed | 00:09:02 |
| start-from-backup | passed | 00:10:35 |
| tablespaces | passed | 00:06:41 |
| telemetry-transfer | passed | 00:04:33 |
| upgrade-consistency | passed | 00:05:35 |
| upgrade-minor | passed | 00:06:00 |
| users | passed | 00:04:33 |

| Summary | Value |
|---|---|
| Tests Run | 30/30 |
| Job Duration | 01:23:38 |
| Total Test Time | 04:16:47 |

commit: 1e36c1b
image: perconalab/percona-postgresql-operator:PR-1460-1e36c1b8e

@egegunes egegunes merged commit e6f94b3 into main Feb 26, 2026
16 checks passed
@egegunes egegunes deleted the fix-flaky-finalizers-test branch February 26, 2026 07:44
@egegunes egegunes added this to the v2.9.0 milestone Feb 26, 2026