OSAC-669: Apply PinnedImageSet to handle LZ registry failover#381
OSAC-669: Apply PinnedImageSet to handle LZ registry failover#381danielerez wants to merge 1 commit into
Conversation
To ensure the mgmt cluster can recover from node reboots and crashes even without an available registry, image resilience is necessary. This can be achieved by pinning the required images using the PinnedImageSet functionality, as detailed in the OpenShift documentation on Pinning and Preloading Images [*]. This action stops CRI-O from triggering Garbage Collection for the images, and thus ensuring the the required images exist for getting the cluster back to an healthy status. [*] https://docs.redhat.com/en/documentation/openshift_container_platform/4.21/html/machine_configuration/machine-config-pin-preload-images-about#machine-config-pin-preload-images-about
WalkthroughThe PR adds a feature to pin container images from Kubernetes master nodes in disconnected Quay deployments. It introduces a new Jinja2 template for PinnedImageSet manifests, an Ansible task to collect image digests from master nodes via oc debug and crictl, a task to generate and apply the manifest, integration into the Quay finish workflow, and configuration toggles to enable or disable the feature. ChangesMaster Node Image PinnedImageSet Feature
Sequence DiagramsequenceDiagram
participant DeployScript as Deploy Script
participant QuayFinish as Quay Finish Workflow
participant DigestCollection as Digest Collection Task
participant MasterNode as Master Node (oc debug)
participant PinnedGeneration as PinnedImageSet Generation
participant Cluster as OpenShift Cluster
DeployScript->>QuayFinish: quay_pinnedimageset_enabled flag
alt quay_pinnedimageset_enabled and not mirror_dry_run
QuayFinish->>DigestCollection: Include collection task
DigestCollection->>MasterNode: oc debug to run crictl images
MasterNode-->>DigestCollection: JSON image list with digests
DigestCollection-->>DigestCollection: Parse, normalize, deduplicate
DigestCollection-->>QuayFinish: Collected pinned_images list
QuayFinish->>PinnedGeneration: Render template with images
PinnedGeneration-->>PinnedGeneration: Generate pinnedimageset-master.yaml
PinnedGeneration->>Cluster: oc apply pinnedimageset-master.yaml
Cluster-->>PinnedGeneration: Applied
else Disabled or dry-run
QuayFinish-->>QuayFinish: Skip PinnedImageSet step
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
can we obtain the list of images using |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@playbooks/tasks/collect_master_node_image_digests.yaml`:
- Around line 120-121: The shell line currently in the task injects '{{
pinned_image_exclude_contains | default(_pinned_image_exclude_contains_default)
| to_json }}' into a single-quoted command which breaks if the value contains a
single quote; instead set an environment variable (e.g. PINNED_IMAGE_EXCLUDE)
for that JSON by adding an env: PINNED_IMAGE_EXCLUDE: "{{
pinned_image_exclude_contains | default(_pinned_image_exclude_contains_default)
| to_json }}" to the same Ansible task and update the command to use the env var
(reference "$PINNED_IMAGE_EXCLUDE" in double-quotes) rather than inlining the
JSON string.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5318c4c4-48f2-41c5-9b40-c52fc2606416
📒 Files selected for processing (6)
config/global.example.yamloperators/quay-operator/quay_disconnected_finish.yamloperators/quay-operator/quay_pinnedimageset.yamlplaybooks/tasks/collect_master_node_image_digests.yamlscripts/deployment/deploy_phase.shtemplates/pinnedimageset.yaml.j2
| '{{ pinned_image_exclude_contains | default(_pinned_image_exclude_contains_default) | to_json }}' \ | ||
| || true)" |
There was a problem hiding this comment.
Avoid direct shell interpolation of JSON config.
The JSON argument is injected into a single-quoted shell string. If pinned_image_exclude_contains contains a ', the command can break (and can become shell-injectable). Pass this value through an environment variable instead.
Suggested hardening
- name: Collect container image digest refs from each master node
ansible.builtin.shell: |
@@
refs="$(python3 -c '
@@
- ' \
- "$tmp" \
- '{{ pinned_image_exclude_contains | default(_pinned_image_exclude_contains_default) | to_json }}' \
+ ' \
+ "$tmp" \
+ "${PINNED_IMAGE_EXCLUDE_JSON:-[]}" \
|| true)"
@@
environment:
KUBECONFIG: "{{ workingDir }}/ocp-cluster/auth/kubeconfig"
+ PINNED_IMAGE_EXCLUDE_JSON: "{{ pinned_image_exclude_contains | default(_pinned_image_exclude_contains_default) | to_json }}"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@playbooks/tasks/collect_master_node_image_digests.yaml` around lines 120 -
121, The shell line currently in the task injects '{{
pinned_image_exclude_contains | default(_pinned_image_exclude_contains_default)
| to_json }}' into a single-quoted command which breaks if the value contains a
single quote; instead set an environment variable (e.g. PINNED_IMAGE_EXCLUDE)
for that JSON by adding an env: PINNED_IMAGE_EXCLUDE: "{{
pinned_image_exclude_contains | default(_pinned_image_exclude_contains_default)
| to_json }}" to the same Ansible task and update the command to use the env var
(reference "$PINNED_IMAGE_EXCLUDE" in double-quotes) rather than inlining the
JSON string.
Yeah, that was my first attempt:) |
| # Default excludes skip openshift-pipelines/*, /openshift/graph-image (Cincinnati graph-data image), | ||
| # /redhat/community-operator-index, /redhat/certified-operator-index, /redhat/redhat-marketplace-index, | ||
| # /redhat/redhat-operator-index, /rhel9/support-tools (single-manifest refs → exit 125). | ||
| - name: Collect container image digest refs from each master node |
There was a problem hiding this comment.
inline scripts are complicated to debug and maintain. Although it is not exactly a reconcile like command, we can add it as a reconcile command for the moment and think about we want to organize the whole enclave-cli afterwards. We already have a cli command that is called from a playbook, see https://github.com/rh-ecosystem-edge/enclave/blob/main/playbooks/tasks/configure_operator.yaml#L146
To ensure the mgmt cluster can recover from node reboots and crashes even without an available registry, image resilience is necessary. This can be achieved by pinning the required images using the PinnedImageSet functionality, as detailed in the OpenShift documentation on Pinning and Preloading Images [*].
This action stops CRI-O from triggering Garbage Collection for the images, and thus ensuring the the required images exist for getting the cluster back to an healthy status.
[*] https://docs.redhat.com/en/documentation/openshift_container_platform/4.21/html/machine_configuration/machine-config-pin-preload-images-about#machine-config-pin-preload-images-about
Summary by CodeRabbit
Release Notes
quay_pinnedimageset_enabledconfiguration; automatically skipped during mirror_dry_run