Commit 18a15ed

nemarjan authored and openshift-merge-bot[bot] committed
[env_op_images] Add CRI-O pull verification to pulled-images report
Cross-reference the pulled-images report with CRI-O journal logs from cluster nodes to confirm which images were actually pulled by the container runtime. Runs automatically when kubeconfig is defined, same as the pulled-images report itself.

Co-authored-by: Cursor <cursor@cursor.com>
Signed-off-by: nemarjan <nemarjan@redhat.com>
1 parent 21bda82 commit 18a15ed

10 files changed

Lines changed: 504 additions & 96 deletions

ci/playbooks/collect-logs.yml

Lines changed: 11 additions & 0 deletions
@@ -1,4 +1,15 @@
 ---
+- name: Ensure ci-framework-data base directories exist on all nodes
+  hosts: all
+  gather_facts: false
+  tasks:
+    - name: Create ci-framework-data/logs directory if missing
+      ansible.builtin.file:
+        path: "{{ ansible_user_dir }}/ci-framework-data/logs"
+        state: directory
+        mode: "0755"
+      ignore_errors: true # noqa: ignore-errors
+
 - name: "Run ci/playbooks/collect-logs.yml"
   hosts: "{{ cifmw_zuul_target_host | default('all') }}"
   gather_facts: true

docs/dictionary/en-custom.txt

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@ CPython
 ClusterServiceVersion
 FreeIPA
 IDM
+ICSP
+IDMS
 IMVHO
 IdP
 Idempotency

roles/env_op_images/README.md

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@ A role to gather the container images used in the openstack deployment with spec
 ## Parameters
 * `cifmw_env_op_images_dir`: (String) Directory where the operator_images.yaml will be stored. Defaults to `~/ci-framework-data/artifacts`
 * `cifmw_env_op_images_file`: (String) Name of the file storing the operator images and tags. Defaults to `operator_images.yaml`
+* `cifmw_env_op_images_pulled_report_path`: Pulled-images policy report (ICSP/IDMS + pod image refs).
+* `cifmw_env_op_images_verified_report_path`: Output path for the CRI-O-enriched report. After the pulled report runs, fetches `oc adm node-logs NODE -u crio` per node, then writes this file with digest-level CRI-O fields (`node_verified_image_origin`, `log_evidence_uri`, `log_evidence_node`).
+* `cifmw_env_op_images_crio_logs_dir`: Directory for per-node `*.crio.log` files used during verification.
 
 ## Examples
 ```YAML

roles/env_op_images/defaults/main.yml

Lines changed: 14 additions & 1 deletion
@@ -22,7 +22,20 @@ cifmw_env_op_images_dir: "{{ cifmw_basedir }}"
 cifmw_env_op_images_file: operator_images.yaml
 cifmw_env_op_images_dryrun: false
 
-cifmw_env_op_images_pulled_report_file: pulled_images_report.yaml
+cifmw_env_op_images_pulled_report_path: >-
+  {{
+    (cifmw_env_op_images_dir, 'artifacts', 'pulled_images_report.yaml')
+    | path_join
+  }}
+
+cifmw_env_op_images_verified_report_path: >-
+  {{
+    (cifmw_env_op_images_dir, 'artifacts', 'pulled_images_report_verified.yaml')
+    | path_join
+  }}
+cifmw_env_op_images_crio_logs_dir: >-
+  {{ (cifmw_env_op_images_dir, 'artifacts', 'crio_logs') | path_join }}
+
 cifmw_env_op_images_pulled_report_namespaces:
   - "{{ cifmw_openstack_namespace | default('openstack') }}"
   - "{{ operator_namespace | default('openstack-operators') }}"
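The three new defaults all hang off `cifmw_env_op_images_dir` through the `path_join` filter. As a rough illustration of what they resolve to, the sketch below assumes `path_join` behaves like a POSIX path join; the helper name `default_report_paths` and the base directory are hypothetical and not part of the commit.

```python
import posixpath


def default_report_paths(images_dir):
    """Mirror the three path_join defaults for a given images directory.

    Assumption: path_join joins components the way posixpath.join does.
    """
    artifacts = posixpath.join(images_dir, "artifacts")
    return {
        "pulled_report_path": posixpath.join(
            artifacts, "pulled_images_report.yaml"
        ),
        "verified_report_path": posixpath.join(
            artifacts, "pulled_images_report_verified.yaml"
        ),
        "crio_logs_dir": posixpath.join(artifacts, "crio_logs"),
    }


# Hypothetical base directory, used only for illustration.
paths = default_report_paths("/home/zuul/ci-framework-data")
for name, value in paths.items():
    print(name, "->", value)
```

Under these assumptions, the verified report and the per-node log directory sit next to the pulled-images report in the same `artifacts` directory.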
Lines changed: 297 additions & 0 deletions
@@ -0,0 +1,297 @@
+#!/usr/bin/python
+
+# Copyright Red Hat, Inc.
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+from __future__ import absolute_import, division, print_function
+
+__metaclass__ = type
+
+DOCUMENTATION = r"""
+---
+module: verify_pulled_report_crio
+
+short_description: Enrich pulled_images_report with CRI-O pull evidence
+
+description:
+  - Reads the YAML produced by the env_op_images pulled-images report role task.
+  - "Parses CRI-O journal lines for C(msg=\"Pulled image: ...@sha256:...\")."
+  - Adds per-row verification fields using trusted mirror domains from
+    C(summary.mirror_rules).
+  - When images carry a C(node) field, evidence is matched against the
+    specific node's CRI-O log first. If the digest is only found on a
+    different node the entry is counted as a cross-node mismatch.
+  - Log files are expected to follow the C(<node-name>.crio.log) naming
+    convention produced by the role task.
+
+options:
+  report_path:
+    description: Path to C(pulled_images_report.yaml) (input).
+    required: true
+    type: str
+  output_path:
+    description: Path for the enriched YAML report (output).
+    required: true
+    type: str
+  log_paths:
+    description:
+      - Explicit list of log files to parse (e.g. per-node CRI-O logs).
+      - Combined with files found under I(log_dir) when set.
+    required: false
+    type: list
+    elements: str
+    default: []
+  log_dir:
+    description:
+      - Directory containing CRI-O log files matching I(log_glob).
+    required: false
+    type: str
+  log_glob:
+    description: Glob under I(log_dir). Used only when I(log_dir) is set.
+    required: false
+    default: "*.crio.log"
+    type: str
+
+author:
+  - Nemanja Marjanovic (@nemarjan)
+
+notes:
+  - Requires PyYAML on the controller (same as other cifmw.general modules).
+"""
+
+EXAMPLES = r"""
+- name: Enrich pulled report using fetched node logs
+  cifmw.general.verify_pulled_report_crio:
+    report_path: "{{ cifmw_env_op_images_pulled_report_path }}"
+    log_dir: "{{ cifmw_env_op_images_crio_logs_dir }}"
+    output_path: "{{ cifmw_env_op_images_verified_report_path }}"
+"""
+
+RETURN = r"""
+changed:
+  description: Whether the output file was written.
+  type: bool
+  returned: always
+trusted_mirrors:
+  description: Hostnames extracted from mirror rules in the report summary.
+  type: list
+  elements: str
+  returned: always
+log_files:
+  description: Number of log files read.
+  type: int
+  returned: always
+entries_with_digest:
+  description: Image rows that had a sha256 digest in C(image_id).
+  type: int
+  returned: always
+cross_node_entries:
+  description: >-
+    Image rows where evidence was found only on a different node
+    than where the pod ran.
+  type: int
+  returned: always
+nodes_with_evidence:
+  description: >-
+    Node names that had at least one C(Pulled image) log entry.
+  type: list
+  elements: str
+  returned: always
+"""
+
+import glob
+import os
+import re
+
+import yaml
+from ansible.module_utils.basic import AnsibleModule
+
+LOG_PATTERN = re.compile(
+    r'msg="Pulled image: (?P<actual_uri>[^@\s]+)@(?P<id>sha256:[a-f0-9]+)"'
+)
+SHA256_PATTERN = re.compile(r"sha256:[a-f0-9]+")
+
+
+def _node_from_path(path):
+    """Derive node name from the ``<node>.crio.log`` naming convention."""
+    basename = os.path.basename(path)
+    suffix_pos = basename.find(".crio.log")
+    if suffix_pos > 0:
+        return basename[:suffix_pos]
+    return os.path.splitext(basename)[0]
+
+
+def _domain_from_uri(uri):
+    """Return the registry host (+ optional port) from an image URI."""
+    return uri.split("/")[0].strip()
+
+
+def _apply_evidence(img, actual_uri, evidence_node, trusted_mirrors):
+    """Set common verification fields on an image row that has log evidence."""
+    actual_domain = _domain_from_uri(actual_uri)
+    img["node_verified_image_origin"] = (
+        "mirror" if actual_domain in trusted_mirrors else "source"
+    )
+    img["log_evidence_uri"] = actual_uri
+    img["log_evidence_node"] = evidence_node
+    return actual_domain
+
+
+def _collect_log_evidence(paths, module):
+    """Parse CRI-O logs into per-node and global evidence dicts.
+
+    Returns:
+        per_node: ``{node_name: {sha256_digest: pull_uri}}``
+        global_evidence: ``{sha256_digest: (pull_uri, node_name)}``
+            (last writer wins across nodes for the global dict)
+    """
+    per_node = {}
+    global_evidence = {}
+    for path in paths:
+        node = _node_from_path(path)
+        node_ev = per_node.setdefault(node, {})
+        try:
+            with open(path, "r") as f:
+                for line in f:
+                    match = LOG_PATTERN.search(line)
+                    if match:
+                        digest = match.group("id")
+                        uri = match.group("actual_uri")
+                        node_ev[digest] = uri
+                        global_evidence[digest] = (uri, node)
+        except IOError as exc:
+            module.fail_json(
+                msg="Cannot read CRI-O log file {0}: {1}".format(path, str(exc))
+            )
+    return per_node, global_evidence
+
+
+def run_module():
+    module_args = dict(
+        report_path=dict(type="str", required=True),
+        output_path=dict(type="str", required=True),
+        log_paths=dict(type="list", required=False, elements="str", default=[]),
+        log_dir=dict(type="str", required=False),
+        log_glob=dict(type="str", required=False, default="*.crio.log"),
+    )
+
+    module = AnsibleModule(argument_spec=module_args, supports_check_mode=True)
+
+    report_path = module.params["report_path"]
+    output_path = module.params["output_path"]
+    log_paths = module.params["log_paths"] or []
+    log_dir = module.params["log_dir"]
+    log_glob = module.params["log_glob"]
+
+    paths = list(log_paths)
+    if log_dir:
+        paths.extend(sorted(glob.glob(os.path.join(log_dir, log_glob))))
+
+    if not paths:
+        module.fail_json(
+            msg="No CRI-O log files: set log_paths and/or log_dir with matching files."
+        )
+
+    try:
+        with open(report_path, "r") as f:
+            data = yaml.safe_load(f)
+    except IOError as exc:
+        module.fail_json(
+            msg="Cannot read report {0}: {1}".format(report_path, str(exc))
+        )
+    except yaml.YAMLError as exc:
+        module.fail_json(msg="Invalid YAML in report: {0}".format(str(exc)))
+
+    if not isinstance(data, dict):
+        module.fail_json(msg="Report root must be a mapping (dict).")
+
+    trusted_mirrors = set()
+    summary_section = data.get("summary") or {}
+    for rule in summary_section.get("mirror_rules") or []:
+        if not isinstance(rule, dict):
+            continue
+        mirror_url = rule.get("mirror")
+        if mirror_url:
+            domain = _domain_from_uri(mirror_url)
+            if domain:
+                trusted_mirrors.add(domain)
+
+    per_node_evidence, global_evidence = _collect_log_evidence(paths, module)
+
+    images_list = data.get("images") or []
+    entries_with_digest = 0
+    cross_node_entries = 0
+    for img in images_list:
+        if not isinstance(img, dict):
+            continue
+        image_id = img.get("image_id") or ""
+        sha_match = SHA256_PATTERN.search(image_id)
+        if not sha_match:
+            continue
+        entries_with_digest += 1
+        img_sha = sha_match.group(0)
+        img_node = img.get("node") or ""
+
+        node_local_hit = (
+            img_node
+            and img_node in per_node_evidence
+            and img_sha in per_node_evidence[img_node]
+        )
+
+        if node_local_hit:
+            actual_uri = per_node_evidence[img_node][img_sha]
+            _apply_evidence(img, actual_uri, img_node, trusted_mirrors)
+        elif img_sha in global_evidence:
+            actual_uri, evidence_node = global_evidence[img_sha]
+            _apply_evidence(img, actual_uri, evidence_node, trusted_mirrors)
+            if img_node:
+                cross_node_entries += 1
+        else:
+            img["node_verified_image_origin"] = "cached/unknown"
+            img["log_evidence_uri"] = None
+            img["log_evidence_node"] = None
+
+    nodes_with_evidence = sorted(n for n, ev in per_node_evidence.items() if ev)
+    result = dict(
+        changed=False,
+        trusted_mirrors=sorted(trusted_mirrors),
+        log_files=len(paths),
+        entries_with_digest=entries_with_digest,
+        cross_node_entries=cross_node_entries,
+        nodes_with_evidence=nodes_with_evidence,
+    )
+
+    if module.check_mode:
+        result["changed"] = True
+        module.exit_json(**result)
+
+    try:
+        with open(output_path, "w") as f:
+            yaml.dump(data, f, default_flow_style=False, sort_keys=False)
+    except IOError as exc:
+        module.fail_json(
+            msg="Cannot write verified report {0}: {1}".format(output_path, str(exc))
+        )
+
+    result["changed"] = True
+    module.exit_json(**result)
+
+
+def main():
+    run_module()
+
+
+if __name__ == "__main__":
+    main()
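To make the parsing step concrete, here is a small standalone sketch that applies the module's `LOG_PATTERN` and its node-name rule to sample input. The regex and the `.crio.log` convention come from the module. The log line, registry host, and file paths are invented for illustration, and real CRI-O journal lines may carry additional fields.

```python
import os
import re

# Same pattern as the module: capture the pulled URI and its sha256 digest.
LOG_PATTERN = re.compile(
    r'msg="Pulled image: (?P<actual_uri>[^@\s]+)@(?P<id>sha256:[a-f0-9]+)"'
)


def node_from_path(path):
    """Derive the node name from a ``<node>.crio.log`` file name."""
    basename = os.path.basename(path)
    suffix_pos = basename.find(".crio.log")
    if suffix_pos > 0:
        return basename[:suffix_pos]
    return os.path.splitext(basename)[0]


# Hypothetical journal line; only the msg="Pulled image: ..." part matters.
SAMPLE_DIGEST = "sha256:" + "ab" * 32
SAMPLE_LINE = (
    'time="2024-01-01T00:00:00Z" level=info '
    'msg="Pulled image: mirror.example.com:5000/ns/app@' + SAMPLE_DIGEST + '"'
)

match = LOG_PATTERN.search(SAMPLE_LINE)
pulled_uri = match.group("actual_uri")
pulled_digest = match.group("id")

# Node names containing dots survive because only the suffix is stripped.
short_node = node_from_path("/tmp/crio_logs/worker-0.crio.log")
fqdn_node = node_from_path("/tmp/crio_logs/worker-0.example.com.crio.log")

print(pulled_uri, pulled_digest, short_node, fqdn_node)
```

Because the URI group excludes `@`, the digest is always split off cleanly, and the registry host (with optional port) is the first `/`-separated component of the captured URI.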

roles/env_op_images/tasks/main.yml

Lines changed: 4 additions & 0 deletions
@@ -158,3 +158,7 @@
 
 - name: Generate pulled images registry report
   ansible.builtin.include_tasks: pulled_images_report.yml
+
+- name: Verify pulled report against CRI-O node logs
+  ansible.builtin.include_tasks: verify_pulled_report_crio.yml
+  ignore_errors: true # noqa: ignore-errors
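The verification module included by this task resolves evidence for each image row in a fixed order: the pod's own node first, then any other node, then a `cached/unknown` fallback. The sketch below condenses that precedence into one function. The name `classify`, the `digest` key, and the sample data are hypothetical simplifications; the module itself extracts the digest from `image_id` and mutates the row in place.

```python
def classify(image, per_node, global_evidence, trusted_mirrors):
    """Return (origin, evidence_uri, evidence_node, is_cross_node) for a row."""
    digest = image["digest"]
    node = image.get("node") or ""
    if node and digest in per_node.get(node, {}):
        # Evidence on the node where the pod ran.
        uri, evidence_node, cross_node = per_node[node][digest], node, False
    elif digest in global_evidence:
        # Evidence only on some other node: counted as cross-node.
        uri, evidence_node = global_evidence[digest]
        cross_node = bool(node)
    else:
        # No pull line found in any log.
        return ("cached/unknown", None, None, False)
    registry = uri.split("/")[0].strip()
    origin = "mirror" if registry in trusted_mirrors else "source"
    return (origin, uri, evidence_node, cross_node)


# Hypothetical evidence, shaped like the dicts the module builds from logs.
D1, D2, D3 = ("sha256:" + c * 64 for c in "abc")
per_node = {
    "worker-0": {D1: "mirror.example.com/ns/app"},
    "worker-1": {D2: "quay.io/ns/db"},
}
global_evidence = {
    D1: ("mirror.example.com/ns/app", "worker-0"),
    D2: ("quay.io/ns/db", "worker-1"),
}
trusted = {"mirror.example.com"}

local = classify({"digest": D1, "node": "worker-0"}, per_node, global_evidence, trusted)
cross = classify({"digest": D2, "node": "worker-0"}, per_node, global_evidence, trusted)
missing = classify({"digest": D3, "node": "worker-0"}, per_node, global_evidence, trusted)
print(local, cross, missing, sep="\n")
```

A `cached/unknown` result does not mean the image is untrusted. It only means no `Pulled image` line for that digest appeared in the collected logs, for example because the image was already present on the node.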
