Fix for PGO-2380:#4163
Conversation
Only add logrotate volume mounts to instance pod when backups are enabled. Add kuttl tests to ensure that collector will run on postgres instance when backups are disabled.
| 2. Add a backups spec to the new cluster and ensure that pgbackrest is added to the instance pod, a repo-host pod is created, and the collector runs on both pods. | ||
| 3. Remove the backups spec from the new cluster. | ||
| 4. Annotate the cluster to allow backups to be removed. | ||
| 5. Ensure that the repo-host pod is destroyed, pgbackrest is removed from the instance pod, and the collector continues to run on the instance pod. |
There was a problem hiding this comment.
I am curious -- if you skip steps 3 and 4, will 5 fail?
There was a problem hiding this comment.
Not sure I follow... If you don't remove the backups spec and annotate the cluster, pgbackrest will just continue to run (in the instance pod and repo-host)...
There was a problem hiding this comment.
Right -- I'm wondering if step 5 tests what we want it to test.
There was a problem hiding this comment.
So, first we create a brand new cluster that does not have backups enabled. And we make sure that the collector runs in that scenario. Then we add backups and ensure that everything works properly with that transition. Then we remove backups and ensure that everything works properly with that transition... What scenario do you think we aren't testing? Or what is it about the test that is insufficient?
There was a problem hiding this comment.
It's been a while since I did a KUTTL test, but I remember running into issues where I asserted that X existed, but it actually ignored what wasn't X. So here, we assert there's a few containers, but not the collector container -- but will it fail if that collector container does exist? Or will it just skip that?
There was a problem hiding this comment.
Ahh, I think I understand now... In the assert step we tell it to check the instance pod for specific containers and it will fail if the list is not exactly correct (if we ask about 3 containers, but there are actually 5 containers that include the 3 we ask about, it will fail).
There was a problem hiding this comment.
Ah, it checks if the list (of containers, etc.) is complete still? I think we (and others) were pushing to get it to check only for the presence of the items in the list.
|
|
||
| pod=$(kubectl get pods -o name -n "${NAMESPACE}" \ | ||
| -l postgres-operator.crunchydata.com/cluster=otel-cluster-no-backups,postgres-operator.crunchydata.com/data=postgres) | ||
| [ "$pod" = "" ] && retry "Pod not found" && exit 1 |
There was a problem hiding this comment.
❓ What is the retry mechanism? It looks like the retry function just prints a message and sleeps.
There was a problem hiding this comment.
Yeah, "retry" is a bit of a misnomer -- Joseph and I talked about this when he introduced the mechanism. As you point out, the retry just print and sleeps; the actual retrying is because
(a) this exits 1 and
(b) it's a TestAssert
In KUTTL, failed TestAsserts automatically retry a number of times.
…add the instrumentation spec to PGAdmin.
Only add logrotate volume mounts to instance pod when backups are enabled. Add kuttl tests to ensure that collector will run on postgres instance when backups are disabled.
Checklist:
Type of Changes:
What is the current behavior (link to any open issues here)?
At the moment, we cannot run the collector on a postgres instance pod when backups are disabled as the pod will fail due to it attempting to mount a file that does not exist.
What is the new behavior (if this is a feature change)?
We only mount logrotate config files for the instance pod when backups are enabled, which allows the collector to run whether backups are enabled or not.
Other Information: