Commit 578fbee

fix(bomsh): drop verifier check (C) and _bomsh.artefact
Check (C) compared two sha1s that are different by construction: the ArtifactID we wrote to `_bomsh.artefact` (sha1 of library bytes) vs. the bom_id `bomsh_sbom.py -f` inserts into the SPDX (sha1 of the OmniBOR Input Manifest, looked up via omnibor/metadata/bomsh/bomsh_omnibor_doc_mapping).

Two prior attempts (be88063 snapshot, efa5977 `-g <ArtifactID>`) each fixed (C) at the cost of breaking (A): bomsh's omnibor/objects/ only stores Input Manifests keyed by bom_ids.

Customers walk the ADG from the SPDX gitoid; they need only (A) "resolves in objects/" and (B) "objects/ self-consistent", both retained. (C) was CI-internal hygiene already covered by the explicit WOLFSSL_LIB_DSO_BASENAMES loop in the recipe.

Drop the manifest, check (C), and its plumbing across Makefile.am, bomsh_verify.py, test_gen_sbom.py, and sbom.yml. Restore `bomsh_sbom.py -f <library>` (the bom_id it inserts resolves in objects/ by construction).

Signed-off-by: Sameeh Jubran <sameeh@wolfssl.com>
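
To illustrate why the two identifiers can never match, here is a minimal sketch (not part of the commit) of the git-blob hashing scheme the verifier describes, sha1(b"blob <len>\0" + content). The helper name `gitoid_sha1_bytes`, the library bytes, and the manifest text are all placeholders invented for illustration; real Input Manifests are produced by bomsh and their exact format is not reproduced here.

```python
import hashlib


def gitoid_sha1_bytes(data: bytes) -> str:
    """Git-blob style SHA-1: sha1(b"blob <len>\\0" + content)."""
    h = hashlib.sha1()
    h.update(b"blob %d\x00" % len(data))
    h.update(data)
    return h.hexdigest()


# Placeholder standing in for the bytes of a built library.
library_bytes = b"\x7fELF placeholder library bytes"
artifact_id = gitoid_sha1_bytes(library_bytes)

# Placeholder standing in for an OmniBOR Input Manifest that describes
# the library's inputs. The text below is illustrative only.
input_manifest = b"gitoid:blob:sha1\nblob " + b"0" * 40 + b"\n"
bom_id = gitoid_sha1_bytes(input_manifest)

# The two digests are taken over different byte strings, so comparing
# them (as check (C) effectively did) cannot succeed.
print("ArtifactID:", artifact_id)
print("bom_id:    ", bom_id)
print("equal:", artifact_id == bom_id)
```

The point of the sketch is only that the inputs to the hash differ: one is the artefact itself, the other is a document about the artefact.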
1 parent efa5977 commit 578fbee

4 files changed

Lines changed: 55 additions & 296 deletions

.github/workflows/sbom.yml

Lines changed: 8 additions & 19 deletions
@@ -1003,28 +1003,24 @@ jobs:
             pyspdxtools --infile "$f"
           done

-      - name: Bomsh provenance is end-to-end verifiable
-        # Three independent self-consistency checks on the bomsh
+      - name: Bomsh provenance bundle is internally consistent
+        # Two independent self-consistency checks on the bomsh
         # provenance bundle. The PERSISTENT-ID assertion above only
-        # proves the gitoid externalRef *exists*; none of these
-        # follow-up properties are guaranteed by it:
+        # proves the gitoid externalRef *exists*; neither of these
+        # follow-up properties is guaranteed by it:
         #
         # (A) every gitoid in the SPDX externalRefs resolves to a
         #     blob present in omnibor/objects/<aa>/<rest>
         # (B) every blob in omnibor/objects/ round-trips through
         #     sha1(b"blob <len>\0" + content) so the object store
         #     is internally self-consistent (no bit-rot, no
         #     truncation, no stray non-blob file under objects/)
-        # (C) the gitoid recorded against the wolfSSL package equals
-        #     the git-blob hash of the actual library artefact that
-        #     `make bomsh` traced (the SBOM ties to the binary that
-        #     would actually ship)
         #
         # Without this, a future bomsh_sbom.py change that emits a
         # plausibly-shaped but fictional gitoid (one that does not
-        # resolve in the ADG, or resolves but to the wrong artefact)
-        # would pass the existing PERSISTENT-ID assertion and ship a
-        # provenance bundle whose externalRef is a lie.
+        # resolve in the ADG) would pass the existing PERSISTENT-ID
+        # assertion and ship a provenance bundle whose externalRef is
+        # a lie.
         #
         # The verifier logic lives in scripts/bomsh_verify.py so it can
         # be unit-tested with synthetic fixtures (see the
@@ -1077,19 +1073,14 @@ jobs:
       - name: Upload bomsh trace diagnostics
         # Diagnostic-only, short retention. Kept separate so the
         # provenance bundle above stays slim for downstream consumers
-        # who don't need to debug ptrace gaps. `_bomsh.artefact` is
-        # included here (not in the provenance bundle) because it is
-        # CI-internal: a pointer file recording the path and gitoid of
-        # the bomtrace3-traced library that bomsh_sbom.py was told to
-        # cite in the SPDX externalRef.
+        # who don't need to debug ptrace gaps.
         if: always()
         uses: actions/upload-artifact@v4
         with:
           name: bomsh-trace-diag-${{ github.sha }}
           path: |
             bomsh_raw_logfile.sha1
             _bomsh.conf
-            _bomsh.artefact
           if-no-files-found: warn
           retention-days: 14

@@ -1104,5 +1095,3 @@ jobs:
             exit 1
           fi
           test ! -d omnibor || (echo "omnibor/ not cleaned"; exit 1)
-          test ! -f _bomsh.artefact \
-            || (echo "_bomsh.artefact not cleaned"; exit 1)
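
The two retained checks described in the workflow comment can be sketched as follows. This is an illustrative stand-in, not the code in scripts/bomsh_verify.py: the helper names (`blob_gitoid`, `object_path`, `resolves`, `corrupt_objects`) and the fixture content are assumptions, and only the `objects/<aa>/<rest>` layout and the sha1(b"blob <len>\0" + content) round-trip come from the comment itself.

```python
import hashlib
import os
import tempfile


def blob_gitoid(data: bytes) -> str:
    """sha1(b"blob <len>\\0" + content), as in check (B)."""
    return hashlib.sha1(b"blob %d\x00" % len(data) + data).hexdigest()


def object_path(objects_dir: str, gid: str) -> str:
    """Map a 40-hex gitoid to objects/<aa>/<rest>."""
    return os.path.join(objects_dir, gid[:2], gid[2:])


def resolves(objects_dir: str, gid: str) -> bool:
    """Check (A): the gitoid names a file present in the object store."""
    return os.path.isfile(object_path(objects_dir, gid))


def corrupt_objects(objects_dir: str) -> list:
    """Check (B): list every entry whose content does not hash to its name."""
    bad = []
    for aa in sorted(os.listdir(objects_dir)):
        sub = os.path.join(objects_dir, aa)
        if not os.path.isdir(sub):
            bad.append(sub)  # stray non-directory entry under objects/
            continue
        for rest in sorted(os.listdir(sub)):
            path = os.path.join(sub, rest)
            with open(path, "rb") as f:
                if blob_gitoid(f.read()) != aa + rest:
                    bad.append(path)
    return bad


with tempfile.TemporaryDirectory() as tmp:
    objects = os.path.join(tmp, "omnibor", "objects")
    content = b"synthetic object content\n"  # made-up fixture data
    gid = blob_gitoid(content)
    os.makedirs(os.path.dirname(object_path(objects, gid)))
    with open(object_path(objects, gid), "wb") as f:
        f.write(content)

    ok_a = resolves(objects, gid)
    missing_a = resolves(objects, "0" * 40)
    ok_b = corrupt_objects(objects) == []

    # Simulate truncation: the stored bytes no longer hash to the name.
    with open(object_path(objects, gid), "wb") as f:
        f.write(content[:-1])
    bad_after_truncation = corrupt_objects(objects)

print(ok_a, missing_a, ok_b, len(bad_after_truncation))
```

A fictional but well-formed gitoid fails (A) because no file exists at its path; a truncated object fails (B) even though (A) still passes for it, which is why the two checks are kept separate.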

Makefile.am

Lines changed: 11 additions & 43 deletions
@@ -474,18 +474,6 @@ BOMSH_RAWLOG = $(BOMSH_RAWLOG_BASE).sha1
 BOMSH_CONF = $(abs_builddir)/_bomsh.conf
 BOMSH_OMNIBORDIR = $(abs_builddir)/omnibor
 BOMSH_SPDX_OUT = omnibor.wolfssl-$(PACKAGE_VERSION).spdx.json
-# Single-source-of-truth manifest of the library artefact bomtrace3
-# actually traced. Format: one line, '<path>\t<gitoid>'. Both fields
-# are captured by the bomsh: recipe right after bomtrace3 finishes, so
-# downstream verification (CI: `Bomsh provenance is end-to-end
-# verifiable`) compares the SPDX gitoid against the gitoid bomsh
-# itself recorded -- decoupling check (C) from the file's *current*
-# bytes, which `make sbom`'s subsequent `make install` step relinks
-# in place via libtool (RPATH fixup), changing the gitoid that would
-# be re-computed off the on-disk file. The verifier still warns when
-# the on-disk gitoid disagrees, so the install-time relink remains
-# visible.
-BOMSH_ARTEFACT_MANIFEST = $(abs_builddir)/_bomsh.artefact
 bomshdir = $(datadir)/doc/$(PACKAGE)

 .PHONY: bomsh install-bomsh uninstall-bomsh
@@ -514,49 +502,29 @@ bomsh:
 	@printf 'raw_logfile=%s\n' '$(BOMSH_RAWLOG_BASE)' > '$(BOMSH_CONF)'
 	$(BOMTRACE3) -c '$(BOMSH_CONF)' $(MAKE)
 	$(BOMSH_CREATE_BOM) -r '$(BOMSH_RAWLOG)' -b '$(BOMSH_OMNIBORDIR)'
-	@# Capture the ArtifactID (file gitoid) of the bomtrace3-traced
-	@# library and record it in the manifest. Below we feed this gitoid
-	@# to bomsh_sbom.py via -g (NOT -f): with -f, bomsh_sbom.py hashes
-	@# the file then maps that hash through omnibor/metadata/bomsh/
-	@# bomsh_omnibor_doc_mapping to a bom_id (the gitoid of the
-	@# artefact's OmniBOR document) -- a different sha1 than the
-	@# artefact's own content gitoid, which never matches what the
-	@# verifier records. -g inserts our gitoid verbatim, so
-	@# SPDX externalRef == manifest gitoid == artefact ArtifactID.
-	@bomsh_artifact=""; \
-	for lib in \
-	    $(addprefix "$(abs_builddir)/src/.libs"/,$(WOLFSSL_LIB_DSO_BASENAMES)) \
-	    "$(abs_builddir)/src/.libs/libwolfssl.a" \
-	    "$(abs_builddir)/src/libwolfssl.a"; do \
-	    if test -f "$$lib"; then bomsh_artifact="$$lib"; break; fi; \
-	done; \
-	if test -n "$$bomsh_artifact"; then \
-	    bomsh_artifact_gid=`$(PYTHON3) -c 'import hashlib,sys;d=open(sys.argv[1],"rb").read();h=hashlib.sha1();h.update(("blob %d\0"%len(d)).encode());h.update(d);print(h.hexdigest())' "$$bomsh_artifact"`; \
-	    printf '%s\t%s\n' "$$bomsh_artifact" "$$bomsh_artifact_gid" \
-	        > '$(BOMSH_ARTEFACT_MANIFEST)'; \
-	fi
 	$(MAKE) sbom
 	@if test -z "$(BOMSH_SBOM)"; then \
 	    echo "NOTE: bomsh_sbom.py not in PATH; skipping SPDX enrichment."; \
 	    echo " The OmniBOR graph in $(BOMSH_OMNIBORDIR) is still produced."; \
 	    exit 0; \
 	fi; \
-	if test ! -f '$(BOMSH_ARTEFACT_MANIFEST)'; then \
+	bomsh_artifact=""; \
+	for lib in \
+	    $(addprefix "$(abs_builddir)/src/.libs"/,$(WOLFSSL_LIB_DSO_BASENAMES)) \
+	    "$(abs_builddir)/src/.libs/libwolfssl.a" \
+	    "$(abs_builddir)/src/libwolfssl.a"; do \
+	    if test -f "$$lib"; then bomsh_artifact="$$lib"; break; fi; \
+	done; \
+	if test -z "$$bomsh_artifact"; then \
 	    echo "NOTE: no built libwolfssl artifact found in $(abs_builddir)/src/.libs/"; \
 	    echo " OmniBOR graph produced; SPDX enrichment skipped."; \
 	    exit 0; \
 	fi; \
-	bomsh_artifact=`awk 'NR==1 {print $$1}' '$(BOMSH_ARTEFACT_MANIFEST)'`; \
-	bomsh_artifact_gid=`awk 'NR==1 {print $$2}' '$(BOMSH_ARTEFACT_MANIFEST)'`; \
-	if test -z "$$bomsh_artifact_gid"; then \
-	    echo "ERROR: $(BOMSH_ARTEFACT_MANIFEST) is missing the gitoid field"; \
-	    exit 1; \
-	fi; \
-	echo "Enriching SPDX with OmniBOR ExternalRefs (artifact: $$bomsh_artifact, gitoid: $$bomsh_artifact_gid)..."; \
+	echo "Enriching SPDX with OmniBOR ExternalRefs (artifact: $$bomsh_artifact)..."; \
 	$(BOMSH_SBOM) \
 	    -b '$(BOMSH_OMNIBORDIR)' \
 	    -i '$(abs_builddir)/$(SBOM_SPDX)' \
-	    -g "$$bomsh_artifact_gid" \
+	    -f "$$bomsh_artifact" \
 	    -s spdx-json \
 	    -O '$(abs_builddir)'

@@ -573,7 +541,7 @@ uninstall-bomsh:
 	-rm -rf '$(DESTDIR)$(bomshdir)/omnibor'
 	-rm -f '$(DESTDIR)$(bomshdir)/$(BOMSH_SPDX_OUT)'

-CLEANFILES += $(BOMSH_RAWLOG) $(BOMSH_RAWLOG_BASE).sha256 $(BOMSH_CONF) $(BOMSH_SPDX_OUT) $(BOMSH_ARTEFACT_MANIFEST)
+CLEANFILES += $(BOMSH_RAWLOG) $(BOMSH_RAWLOG_BASE).sha256 $(BOMSH_CONF) $(BOMSH_SPDX_OUT)

 # Hook SBOM/Bomsh cleanup into `make uninstall` so packagers don't leave
 # stale artefacts behind after install-sbom/install-bomsh. uninstall-sbom
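
The restored recipe logic (pick the first library candidate that exists, then pass it to `bomsh_sbom.py` with `-f`) can be mirrored in Python roughly as below. This is a sketch for readers, not something the build uses: `first_existing` and `enrichment_argv` are hypothetical helpers, `libwolfssl.so` is an assumed stand-in for whatever WOLFSSL_LIB_DSO_BASENAMES expands to, and the SPDX input filename is invented. The flags `-b`, `-i`, `-f`, `-s spdx-json`, and `-O` are the ones shown in the recipe; nothing is executed.

```python
import os
import tempfile


def first_existing(candidates):
    """Return the first candidate that is a regular file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def enrichment_argv(omnibor_dir, spdx_in, artifact, out_dir):
    """Argument list mirroring the recipe's bomsh_sbom.py invocation."""
    return ["bomsh_sbom.py", "-b", omnibor_dir, "-i", spdx_in,
            "-f", artifact, "-s", "spdx-json", "-O", out_dir]


with tempfile.TemporaryDirectory() as build:
    libs = os.path.join(build, "src", ".libs")
    os.makedirs(libs)
    static_lib = os.path.join(libs, "libwolfssl.a")
    with open(static_lib, "wb") as f:
        f.write(b"placeholder archive")  # made-up content

    # Candidate order follows the recipe: DSOs first, then static archives.
    candidates = [
        os.path.join(libs, "libwolfssl.so"),  # assumed DSO basename
        static_lib,
        os.path.join(build, "src", "libwolfssl.a"),
    ]
    chosen = first_existing(candidates)
    argv = enrichment_argv(os.path.join(build, "omnibor"),
                           os.path.join(build, "wolfssl.spdx.json"),
                           chosen, build)
    picked_name = os.path.basename(chosen)

print(picked_name)
print(argv[5], os.path.basename(argv[6]))
```

When no candidate exists, `first_existing` returns None, which corresponds to the recipe's "SPDX enrichment skipped" branch.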

scripts/bomsh_verify.py

Lines changed: 13 additions & 131 deletions
@@ -1,47 +1,27 @@
 #!/usr/bin/env python3
 """End-to-end verifier for the bomsh provenance bundle.

-Three independent self-consistency checks on the artefacts that
+Two independent self-consistency checks on the artefacts that
 `make bomsh` produces. The PERSISTENT-ID assertion in the bomsh CI
 job only proves the gitoid externalRef *exists* in the enriched SPDX;
-none of these follow-up properties are guaranteed by it:
+neither of these follow-up properties is guaranteed by it:

   (A) Resolvability -- every gitoid in the SPDX externalRefs resolves
-      to a blob present at omnibor/objects/<aa>/<rest>.
+      to a blob present at omnibor/objects/<aa>/<rest>. Catches the
+      `bomsh_sbom.py` regression class that emits a syntactically
+      well-formed gitoid which does not actually point at anything in
+      the shipped ADG.

   (B) Object-store integrity -- every blob in omnibor/objects/
       round-trips through sha1(b"blob <len>\\0" + content), so a
       corrupt or truncated object store is caught at PR time, not by
       a downstream verifier weeks later.

-  (C) Artefact correspondence -- the gitoid recorded against the
-      wolfSSL package equals the gitoid bomsh itself recorded for the
-      library it traced (read from the `_bomsh.artefact` manifest the
-      bomsh: Makefile target writes as '<path>\\t<gitoid>' BEFORE
-      `make sbom` runs). This is the strongest claim the bomsh
-      pipeline alone can make: the SPDX agrees with what bomsh saw.
-
-      Comparing against bomsh's own recorded gitoid (rather than
-      against the on-disk file's *current* bytes) is deliberate.
-      `make sbom`'s subsequent `make install` step relinks
-      src/.libs/lib*.so* in place via libtool to fix RPATH, mutating
-      the bytes after bomsh has already gitoid-ed them. The verifier
-      still hashes the on-disk file and emits a NOTE if it has
-      diverged, so the install-time relink remains visible without
-      causing a false negative on the bomsh<->SPDX agreement.
-
-Without this, a future `bomsh_sbom.py` change that emits a
-plausibly-shaped but fictional gitoid (one that does not resolve in
-the ADG, or resolves but to a different artefact than bomsh recorded)
-would pass the existing PERSISTENT-ID assertion and ship a provenance
-bundle whose externalRef is a lie.
-
 CLI form (used by `.github/workflows/sbom.yml`):

     python3 scripts/bomsh_verify.py \\
         --spdx-glob 'omnibor.wolfssl-*.spdx.json' \\
-        --omnibor-dir omnibor \\
-        --artefact-manifest _bomsh.artefact
+        --omnibor-dir omnibor

 Library form (used by scripts/test_gen_sbom.py):

@@ -55,7 +35,7 @@
 import json
 import os
 import sys
-from typing import List, Tuple
+from typing import List


 GITOID_LOCATOR_PREFIX = 'gitoid:blob:sha1:'
@@ -159,59 +139,11 @@ def check_object_store_integrity(omnibor_objects_dir):
     return obj_count, bad


-def parse_artefact_manifest(manifest_path):
-    """Parse the `_bomsh.artefact` manifest written by the bomsh:
-    recipe. Format: a single line, `<absolute-path>\\t<gitoid-hex>`
-    -- both fields captured by the recipe AFTER bomtrace3 finishes
-    but BEFORE `make sbom` relinks the library.
-
-    Returns (path, recorded_gid). Raises FileNotFoundError if the
-    manifest does not exist (bomsh: skipped artefact discovery, e.g.
-    no built library); raises ValueError if the line is malformed."""
-    if not os.path.isfile(manifest_path):
-        raise FileNotFoundError(
-            f'{manifest_path} not produced by `make bomsh`; cannot '
-            f'verify gitoid <-> artefact correspondence. This usually '
-            f'means the bomsh enrichment step skipped the artefact-'
-            f'discovery loop (no built library).')
-    with open(manifest_path) as f:
-        line = f.readline().rstrip('\n')
-    if not line:
-        raise ValueError(
-            f'{manifest_path} is empty; bomsh: recipe wrote nothing')
-    parts = line.split('\t')
-    if len(parts) != 2 or not all(parts):
-        raise ValueError(
-            f'{manifest_path}: expected "<path>\\t<gitoid>", got {line!r}. '
-            f'Re-run `make bomsh` against an up-to-date Makefile.am.')
-    return parts[0], parts[1]
-
-
-def check_artefact_correspondence(spdx_gitoids, recorded_gid,
-                                  package_name_substr='wolfssl'):
-    """(C) The gitoid bomsh recorded for the traced library matches a
-    gitoid externalRef on the wolfSSL SPDX package. This is the
-    bomsh<->SPDX agreement check; it does NOT compare against the
-    on-disk file's current bytes (see module docstring).
-
-    Returns (matched, wolfssl_gids). Raises ValueError if no SPDX
-    gitoid is associated with a wolfSSL-named package."""
-    wolfssl_gids = [gid for name, gid in spdx_gitoids
-                    if package_name_substr in name.lower()]
-    if not wolfssl_gids:
-        raise ValueError(
-            f'no SPDX gitoid externalRef on a package whose name '
-            f'contains {package_name_substr!r}; cannot verify '
-            f'artefact correspondence')
-    return recorded_gid in wolfssl_gids, wolfssl_gids
-
-
-def verify(spdx_glob, omnibor_dir, artefact_manifest,
-           package_name_substr='wolfssl'):
-    """Orchestrate the three checks. Returns (ok: bool, messages:
+def verify(spdx_glob, omnibor_dir):
+    """Orchestrate the two checks. Returns (ok: bool, messages:
     List[str]). `messages` is appended to in success and failure both,
-    so callers can log the success line ('OK: N gitoids verified ...')
-    even when ok is True."""
+    so callers can log the success lines ('OK: N gitoid(s) verified' +
+    ' objects round-trip: M blobs') even when ok is True."""
     messages: List[str] = []

     spdx_paths = sorted(_glob.glob(spdx_glob))
@@ -247,50 +179,8 @@ def verify(spdx_glob, omnibor_dir, artefact_manifest,
             f'round-trip (object store is corrupt)')
         return False, messages

-    try:
-        artefact, recorded_gid = parse_artefact_manifest(artefact_manifest)
-    except (FileNotFoundError, ValueError) as e:
-        messages.append(str(e))
-        return False, messages
-
-    try:
-        matched, wolfssl_gids = check_artefact_correspondence(
-            spdx_gitoids, recorded_gid, package_name_substr)
-    except ValueError as e:
-        messages.append(str(e))
-        return False, messages
-
-    if not matched:
-        messages.append(
-            f'wolfSSL package SPDX gitoids {wolfssl_gids} do not '
-            f'include the gitoid bomsh recorded for the traced '
-            f'artefact {artefact} ({recorded_gid}); the SBOM is '
-            f'inconsistent with what bomsh actually saw')
-        return False, messages
-
     messages.append(f'OK: {len(spdx_gitoids)} gitoid(s) verified')
     messages.append(f' objects round-trip: {obj_count} blobs')
-    messages.append(
-        f' artefact match: {artefact} -> {recorded_gid} (bomsh-traced)')
-
-    # Diagnostic-only: the on-disk file may have been rewritten since
-    # bomsh saw it (the canonical case is `make sbom`'s `make install`
-    # step relinking via libtool to fix RPATH). We do NOT fail on
-    # this -- the SBOM<->bomsh agreement above is what matters for
-    # the provenance proof -- but surfacing it as a NOTE keeps the
-    # divergence visible so it does not silently grow into a
-    # bigger gap (e.g. someone adds a strip step that goes unflagged).
-    if os.path.isfile(artefact):
-        on_disk = gitoid_sha1(artefact)
-        if on_disk != recorded_gid:
-            messages.append(
-                f'NOTE: on-disk {artefact} now has gitoid {on_disk}, '
-                f'but bomsh recorded {recorded_gid}. This is expected '
-                f'when `make sbom` runs `make install` (libtool relinks '
-                f'src/.libs/lib*.so* in place to fix RPATH). The SBOM '
-                f'attests to the bomsh-traced bytes; if you need it to '
-                f'attest to the *installed* bytes, the bomsh: recipe '
-                f'must trace `make install` too.')
     return True, messages


@@ -304,17 +194,9 @@ def main():
     parser.add_argument('--omnibor-dir', default='omnibor',
                         help='Path to the OmniBOR directory containing '
                              'objects/ (default: %(default)s)')
-    parser.add_argument('--artefact-manifest', default='_bomsh.artefact',
-                        help='Path to the file containing the artefact '
-                             'path that bomsh: traced (default: %(default)s)')
-    parser.add_argument('--package-name-substr', default='wolfssl',
-                        help='Case-insensitive substring used to identify '
-                             'the wolfSSL SPDX package among any others in '
-                             'the document (default: %(default)s)')
     args = parser.parse_args()

-    ok, messages = verify(args.spdx_glob, args.omnibor_dir,
-                          args.artefact_manifest, args.package_name_substr)
+    ok, messages = verify(args.spdx_glob, args.omnibor_dir)
     for line in messages:
         print(line, file=sys.stderr if not ok else sys.stdout)
     sys.exit(0 if ok else 1)
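
For readers unfamiliar with how the gitoids reach the verifier, the following sketch shows one plausible way to pull them out of an SPDX JSON document using the `GITOID_LOCATOR_PREFIX` constant from the diff. It is an assumption-laden illustration: `extract_gitoids` is a hypothetical helper (the real extraction code is not shown in this commit), the document is a synthetic fixture, and the `packages` / `externalRefs` / `referenceLocator` field names follow the SPDX 2.x JSON layout rather than anything confirmed by the diff. The (name, gitoid) pair shape matches how `spdx_gitoids` is iterated in the removed check (C) code.

```python
import json

GITOID_LOCATOR_PREFIX = "gitoid:blob:sha1:"


def extract_gitoids(doc: dict) -> list:
    """Return (package name, gitoid hex) pairs found in externalRefs."""
    pairs = []
    for pkg in doc.get("packages", []):
        for ref in pkg.get("externalRefs", []):
            locator = ref.get("referenceLocator", "")
            if locator.startswith(GITOID_LOCATOR_PREFIX):
                gid = locator[len(GITOID_LOCATOR_PREFIX):]
                pairs.append((pkg.get("name", ""), gid))
    return pairs


# Synthetic fixture; the gitoid value is a placeholder, not a real bom_id.
fixture = json.loads("""
{
  "packages": [
    {
      "name": "wolfssl",
      "externalRefs": [
        {
          "referenceCategory": "PERSISTENT-ID",
          "referenceType": "gitoid",
          "referenceLocator": "gitoid:blob:sha1:%s"
        },
        {
          "referenceCategory": "SECURITY",
          "referenceType": "cpe23Type",
          "referenceLocator": "cpe:2.3:a:example:example:*:*:*:*:*:*:*:*"
        }
      ]
    }
  ]
}
""" % ("a" * 40))

pairs = extract_gitoids(fixture)
print(pairs)
```

Each extracted hex string is what check (A) would then look up under omnibor/objects/<aa>/<rest>.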
