[ci_nmstate] Check if kubeconfig file exists #3512
The ci_nmstate role was failing during adoption deploy-infra jobs when cifmw_openshift_kubeconfig was defined but the file didn't exist yet.

Root cause: ci-framework-jobs' adoption-uni-job-base uses variable_files_dirs to scan all YAML files in the scenario directory, including 05-tests.yaml which sets cifmw_openshift_kubeconfig. This has been the case since adoption jobs were introduced in October 2024.

However, during the deploy-infra phase (before deploy-ocp), the OCP cluster and kubeconfig file don't exist yet. The issue was likely exposed by PR openstack-k8s-operators#3471 which changed how ansible_user_dir is evaluated, affecting how/when the kubeconfig path gets resolved.

Fix: Add "cifmw_openshift_kubeconfig is exists" check to tasks that use the kubeconfig. The existing code already handles the skipped task gracefully via default([]) safeguards, treating all hosts as "unmanaged" when no k8s cluster is available (which is correct for infra creation).

Fixes: OSPCIX-1122
Related: openstack-k8s-operators#3471
Assisted-By: Claude Code/claude-4.5-sonnet
Signed-off-by: Harald Jensås <hjensas@redhat.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ae35e9bb56e9400ebebde13780dd1e88 ❌ openstack-k8s-operators-content-provider FAILURE in 12m 34s |
|
recheck |
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/058a6c579aac46759b2024308c0f4137 ❌ openstack-k8s-operators-content-provider FAILURE in 15m 11s |
|
recheck |
|
I guess |
|
There is a change being tested in the job configurations that disables variable dir loading. If the variable file with the cifmw_openshift_kubeconfig definition is not loaded, ci_nmstate would skip the task because the "is defined" test returns false. Marking this as do-not-merge for now. @rebtoor FYI. |
- when: cifmw_openshift_kubeconfig is defined
+ when:
+   - cifmw_openshift_kubeconfig is defined
+   - cifmw_openshift_kubeconfig is exists
maybe:
cifmw_openshift_kubeconfig | length > 0 might be better than exists.
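For comparison, a minimal sketch of how the two conditions behave when the variable is set but the file is missing. The playbook, task names, and path below are hypothetical and not part of this PR. It relies on the Ansible `exists` test checking the path on the controller's filesystem, whereas `length > 0` only checks that the string is non-empty.

```yaml
# Hypothetical playbook for illustration only; not taken from the ci_nmstate role.
- name: Compare kubeconfig guards
  hosts: localhost
  gather_facts: false
  vars:
    # Assumed path; the file is expected NOT to exist at this point.
    cifmw_openshift_kubeconfig: /tmp/not-yet-created/kubeconfig
  tasks:
    - name: Runs whenever the variable holds a non-empty string
      ansible.builtin.debug:
        msg: "length > 0 passes even though the file is missing"
      when:
        - cifmw_openshift_kubeconfig is defined
        - cifmw_openshift_kubeconfig | length > 0

    - name: Runs only when the path exists on the controller
      ansible.builtin.debug:
        msg: "kubeconfig file is present"
      when:
        - cifmw_openshift_kubeconfig is defined
        - cifmw_openshift_kubeconfig is exists
```

Under these assumptions the first task runs and the second is skipped, which is the deploy-infra situation described in the PR.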
|
Closing this. As I understand it, this is happening due to a design issue in downstream jobs where variables are loaded from a directory, resulting in vars being defined too early in the run. There is a discussion to rethink this and do it differently, so this change is not required. Ref gitlab ci-framework-jobs MR: 2622 |
Pull request was closed
The ci_nmstate role was failing during adoption deploy-infra jobs when cifmw_openshift_kubeconfig was defined but the file didn't exist yet.
Root cause: ci-framework-jobs' adoption-uni-job-base uses variable_files_dirs to scan all YAML files in the scenario directory, including 05-tests.yaml which sets cifmw_openshift_kubeconfig. This has been the case since adoption jobs were introduced in October 2024.
However, during the deploy-infra phase (before deploy-ocp), the OCP cluster and kubeconfig file don't exist yet. The issue was likely exposed by PR #3471 which changed how ansible_user_dir is evaluated, affecting how/when the kubeconfig path gets resolved.
Fix: Add "cifmw_openshift_kubeconfig is exists" check to tasks that use the kubeconfig. The existing code already handles the skipped task gracefully via default([]) safeguards, treating all hosts as "unmanaged" when no k8s cluster is available (which is correct for infra creation).
Depends-On: openstack-k8s-operators/install_yamls#1110
Fixes: OSPCIX-1122
Related: #3471
Assisted-By: Claude Code/claude-4.5-sonnet
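
The guard and the default([]) fallback described in the fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the role's actual tasks: the task names, the `_example_*` variables, and the choice of `kubernetes.core.k8s_info` are hypothetical. It relies on the fact that a skipped task still registers its variable, but without a `resources` key, so the `default([])` filter yields an empty list.

```yaml
# Hypothetical tasks for illustration; names and module choice are assumptions.
- name: Fetch cluster nodes only when the kubeconfig file is present
  kubernetes.core.k8s_info:
    kubeconfig: "{{ cifmw_openshift_kubeconfig }}"
    kind: Node
  register: _example_nodes
  when:
    - cifmw_openshift_kubeconfig is defined
    - cifmw_openshift_kubeconfig is exists

# When the task above is skipped, _example_nodes has no "resources" key,
# so default([]) produces an empty list and no host counts as managed.
- name: Fall back to an empty list when the lookup was skipped
  ansible.builtin.set_fact:
    _example_k8s_hosts: >-
      {{
        _example_nodes.resources | default([]) |
        map(attribute='metadata.name') | list
      }}
```

With no cluster available, `_example_k8s_hosts` ends up empty, which matches the "all hosts are unmanaged" behaviour the description calls correct for infra creation.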