Commit 88aa7f2

Add blog post: SBOM Storage Tax
1 parent c110536 commit 88aa7f2

2 files changed

Lines changed: 84 additions & 0 deletions

@@ -0,0 +1,84 @@
1+
+++
2+
author = "Jason Smith"
3+
title = "The SBOM Storage Tax: Optimization at Scale"
4+
date = "2026-03-02"
5+
linkedin = "https://www.linkedin.com/posts/j28smith_sbom-supplychainsecurity-finops-activity-7434359722511601665-DPTT"
6+
image = "img/thirdparty/2026-03-02-sbom-storage-tax.png"
7+
+++
8+
9+
Following my last post on the "Storage Tax" of binary blob signing, I received some insightful feedback from the
10+
community. The common critique was:
11+
12+
> JSON minification doesn't really matter if you compress the JSON.
13+
14+
It's a valid point. Technically, compression will save far more than minification. But does the choice to sign the
15+
"data" (allowing minification) still matter if we are just going to run it through Zstandard (zstd) anyway?
16+
17+
To find out, I ran a series of tests on the sample set of CycloneDX and SPDX SBOMs from my
18+
[sbom-signing-best-practices](https://github.com/shiftleftcyber/sbom-signing-best-practices) GitHub repo. I compared
19+
the storage footprints of pretty-printed files, minified files, and their compressed counterparts.
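
A minimal sketch of that kind of comparison, not the script behind the numbers in this post. It assumes the SBOM is valid JSON, and it uses Python's built-in `zlib` as a stand-in compressor so the example runs without extra packages; reproducing the figures here would require a Zstandard binding instead. The `storage_footprints` helper and the synthetic sample document are hypothetical.

```python
import json
import zlib


def storage_footprints(sbom_text: str) -> dict[str, int]:
    """Return byte sizes for the four storage variants of one JSON SBOM."""
    doc = json.loads(sbom_text)
    pretty = json.dumps(doc, indent=2).encode("utf-8")
    minified = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return {
        "pretty": len(pretty),
        "minified": len(minified),
        # zlib is a stand-in here; the post's measurements use Zstandard.
        "pretty_compressed": len(zlib.compress(pretty, 9)),
        "minified_compressed": len(zlib.compress(minified, 9)),
    }


# Synthetic, repetitive document standing in for a real SBOM.
sample = json.dumps(
    {
        "bomFormat": "CycloneDX",
        "components": [
            {"name": f"pkg-{i}", "version": "1.0.0"} for i in range(200)
        ],
    }
)

sizes = storage_footprints(sample)
for label, size in sizes.items():
    print(f"{label:>20}: {size} bytes")
```

Running the same function over each file in a sample set and averaging the ratios would give a table comparable to the one in the next section.
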
20+
21+
The data confirms the initial intuition, but only if you look at the surface-level percentages. Under the hood, there
22+
is a "hidden" efficiency gain that still justifies a data-aware signing strategy.
23+
24+
## The Technical Reality Check
25+
26+
Based on my analysis of the test data, here is the breakdown of average storage savings relative to the original file:
27+
28+
- **Minification:** Reduced file size by ~32%.
29+
- **Zstd Compression:** Reduced file size by ~79%.
30+
- **The "Combo" (Minify + Zstd):** Reduced file size by ~81%.
31+
32+
At first glance, the "compression-only" camp seems to have won. Adding minification to the compression pipeline appears
33+
to save only an additional 2 percentage points relative to the original file size.
34+
35+
**The 2% Trap:** While 2% of the total original file seems negligible, we don't pay for storage based on the original
36+
file size. We pay based on the final compressed artifact. When we look at the results through a "FinOps" lens, that
37+
small delta becomes more significant.
38+
39+
## The 11% Efficiency Gain
40+
41+
When we look at the final artifact size (the compressed file we actually pay to store), the story changes.
42+
43+
In my tests, applying minification before Zstandard compression resulted in a final file that was 11% smaller on
44+
average than compressing the pretty-printed version. By stripping away the redundancy of unnecessary whitespace and
45+
newlines first, the compression algorithm can focus purely on data patterns, leading to a tighter result.
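
The arithmetic behind that shift in perspective can be sketched as follows. The function names are hypothetical. Plugging in the rounded averages from the previous section (79% and 81%) gives a relative saving of about 9.5%; the 11% figure comes from my measured per-file results, which this sketch does not reproduce.

```python
def remaining_fraction(reduction_pct: float) -> float:
    """Fraction of the original size that is still stored after a reduction."""
    return 1.0 - reduction_pct / 100.0


def relative_saving(baseline_reduction_pct: float, improved_reduction_pct: float) -> float:
    """Percent by which the improved artifact is smaller than the baseline artifact."""
    baseline = remaining_fraction(baseline_reduction_pct)
    improved = remaining_fraction(improved_reduction_pct)
    return (baseline - improved) / baseline * 100.0


# Compression only (~79% reduction) versus minify + compress (~81% reduction).
saving = relative_saving(79.0, 81.0)
print(f"{saving:.1f}% smaller stored artifact")
```

The denominator is the compressed pretty-printed artifact, not the original file, which is why a 2-point difference grows into a much larger relative one.
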
46+
47+
Let's scale that 11% efficiency gain to the requirements of the **EU Cyber Resilience Act (CRA)**:
48+
49+
1. **10-Year Retention:** You aren't storing one SBOM. You are storing a decade of build history. An 11% saving on
50+
every artifact accumulates into a massive reduction in "whitespace debt" by year ten.
51+
2. **Enterprise Scale:** For thousands of software products and services, each with daily or even per-commit builds,
52+
this is the difference between manageable cold storage and a budget-breaking line item.
53+
3. **Global Traffic:** 11% less data means lower egress costs for every audit and transfer.
54+
55+
In this context, an 11% reduction in your long-term storage footprint isn't just "technical elegance". It's a
56+
significant reduction in operational overhead and "whitespace debt".
57+
58+
## The Recommended Storage Strategy
59+
60+
If you are locked into "Binary Blob Signing", you are effectively forbidden from fully optimizing your data. To keep
61+
that signature valid for a decade, you must store the compressed "Pretty-Printed" version. As a result, you are paying
62+
an 11% storage tax indefinitely.
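
A small illustration of why a byte-level signature pins you to one formatting, assuming a SHA-256 digest of the file bytes stands in for whatever the blob signature actually covers. The document below is a made-up minimal example, not a real SBOM.

```python
import hashlib
import json

# Hypothetical minimal document used only for illustration.
doc = {"bomFormat": "CycloneDX", "specVersion": "1.5", "components": []}

pretty_bytes = json.dumps(doc, indent=2).encode("utf-8")
minified_bytes = json.dumps(doc, separators=(",", ":")).encode("utf-8")

# A blob signature is computed over the exact bytes of the file.
pretty_digest = hashlib.sha256(pretty_bytes).hexdigest()
minified_digest = hashlib.sha256(minified_bytes).hexdigest()

# The parsed data is identical, but the bytes (and so the digest) are not.
same_data = json.loads(pretty_bytes) == json.loads(minified_bytes)
print("same data:", same_data)
print("same file digest:", pretty_digest == minified_digest)
```

Because the digests differ, a signature made over the pretty-printed file will not verify against the minified one, even though nothing about the SBOM's content changed.
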
63+
64+
To avoid the storage tax, our recommended strategy for long-term SBOM retention is:
65+
66+
1. **Implement Data-Aware Signing:** Stop signing the container (the file) and start signing the "Facts" (the
67+
canonicalized data).
68+
2. **Minify + Compress for Storage:** Use JSON minification to strip out the formatting overhead, then apply a modern
69+
algorithm like Zstandard to the minified SBOM to get the smallest stored artifact.
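
The first step can be sketched roughly as below, under loud assumptions: sorted keys with compact separators is a simplified stand-in for a real canonicalization scheme such as RFC 8785 (JCS), and HMAC-SHA256 with a demo key is a stand-in for an actual asymmetric signature. The names `canonical_bytes`, `sign_facts`, and `verify_facts` are hypothetical, not part of any SBOM tool.

```python
import hashlib
import hmac
import json


def canonical_bytes(sbom_text: str) -> bytes:
    """Simplified canonical form: sorted keys, no insignificant whitespace."""
    doc = json.loads(sbom_text)
    canonical = json.dumps(
        doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return canonical.encode("utf-8")


def sign_facts(sbom_text: str, key: bytes) -> str:
    """Sign the canonicalized data rather than the file bytes (HMAC stand-in)."""
    return hmac.new(key, canonical_bytes(sbom_text), hashlib.sha256).hexdigest()


def verify_facts(sbom_text: str, key: bytes, signature: str) -> bool:
    """Check a signature against any formatting of the same data."""
    return hmac.compare_digest(sign_facts(sbom_text, key), signature)


key = b"demo-key-not-for-production"
doc = {
    "bomFormat": "CycloneDX",
    "components": [{"name": "example-lib", "version": "1.0.0"}],
}
pretty = json.dumps(doc, indent=2)
minified = json.dumps(doc, separators=(",", ":"))

signature = sign_facts(pretty, key)
print("minified copy verifies:", verify_facts(minified, key, signature))
```

With a scheme like this, the signature made at build time keeps verifying after the file is minified for storage, while any change to the data itself still fails verification.
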
70+
71+
This approach ensures that your signatures are resilient to formatting changes while your storage is optimized for the
72+
next 10 years of compliance.
73+
74+
## Final Remarks
75+
76+
I first raised the storage topic while presenting my findings on SBOM signing best practices at the
77+
[OpenSSF](https://openssf.org/) SBOM Everywhere SIG meeting, and it sparked an immediate shift
78+
in the conversation. A special thank you to [Kate Stewart](https://www.linkedin.com/in/katestewartaustin/) for
79+
inviting me to the SPDX tech call to give the same presentation there. And thanks to the OpenSSF SBOM Everywhere SIG
80+
members and the SPDX tech call members for all the feedback that prompted this deeper dive.
81+
82+
We are moving toward a world where signing the "Facts" isn't just more secure, it's cheaper.
83+
84+
**The benchmark is being set.** Are you signing the container, or are you signing the data?