Commit f4a161c

add a new UAPI.16 File Manifest spec
1 parent cd215e9 commit f4a161c

specs/file-manifest.md

Lines changed: 219 additions & 0 deletions
---
title: UAPI.16 File Manifests
layout: posts
version: 1.0
weight: 6
aliases:
- /UAPI.16
- /16
---

# UAPI.16 File Manifest

| Version | Changes         |
|---------|-----------------|
| 1.0     | Initial Release |

This format stores information about static, immutable data resources that can be acquired and stored on a
POSIX file system or written to a GPT disk partition.

This manifest file format is inspired by the traditional UNIX `SHA256SUMS` and BSD `mtree(5)` file formats,
but supersedes them in various ways:

1. It allows declaring the encoding of data files (e.g. `gzip` compression and suchlike).
2. It allows referencing remote URLs as the source for acquiring file data.
3. It allows inline specification of file data (suitable for short data only).
4. It allows referencing arbitrary local files as the source for acquiring file data.
5. It allows extracting data "slices" from data sources, via offset and size.
6. It may store both encoded and decoded data sizes.
7. Various fields of additional per-file metadata may be defined, including fields for writing data to GPT partition tables.
8. It is an extensible format, permitting vendors and projects to add their own per-file and per-manifest fields.

Manifests in this format are intended to be generated by mkosi and consumed by systemd-sysupdate, among other tools.

## Manifest Object

This is the top-level object for a manifest file. It's designed to be
extensible and self-identifies as a manifest.

```json
{
    "mediaType" : "application/vnd.uapi.manifest",
    "files" : [ ]
}
```

### Notes

The `files` field is an array of file objects, see below.

We might eventually add additional fields to the manifest, for in-line signatures and similar.

## Files Object

This is an object that encodes information about an individual file.

```json
{
    "name" : "",
    "dataEncoding" : "gzip",
    "encodedDataSize" : 2011,
    "dataSize" : 4711,
    "dataFile" : "",
    "dataUrl" : "http://…",
    "dataLiteral" : "base64…",
    "sliceOffset" : 22,
    "sliceSize" : 33,
    "sha256" : "",
    "gptLabel" : "",
    "gptTypeUuid" : "",
    "gptFlagNoAuto" : true,
    "gptFlagGrowFileSystem" : true,
    "readOnly" : true,
    "validFromUSec" : 4711,
    "validUntilUSec" : 4711
}
```

### Notes

All fields but `name` are optional.

The `name` must be a valid UTF-8 POSIX filename. It may not contain control characters (ASCII 0…31, 127), may
not contain a slash, and may not be identical to `.`, `..` or the empty string. Its maximum length is 255 bytes.

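The constraints on `name` can be checked mechanically. The following is a minimal Python sketch, not a reference implementation; the function name `is_valid_name` is hypothetical and not part of this specification.

```python
def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the constraints on the `name` field."""
    try:
        # Fails for strings that cannot be expressed as valid UTF-8
        # (e.g. lone surrogates).
        encoded = name.encode("utf-8")
    except UnicodeEncodeError:
        return False
    if name in ("", ".", ".."):
        return False
    if len(encoded) > 255:  # the limit is counted in bytes, not characters
        return False
    if "/" in name:
        return False
    # ASCII control characters are 0…31 and 127 (DEL).
    return not any(ord(c) < 32 or ord(c) == 127 for c in name)
```

Note that the length check is applied to the UTF-8 encoding, so a name of 128 two-byte characters is already too long.
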
Only one of `dataFile`, `dataUrl`, `dataLiteral` should be set. `dataFile` references a file in the same place
as the manifest, `dataUrl` a full URL, and `dataLiteral` carries the data inline, encoded in Base64.

`sliceOffset` and `sliceSize` apply to the *decoded* data blob. If not specified explicitly, they should default
to a zero offset and to a slice that extends to the end of the decoded data.

`dataSize` is the size of the decoded blob, `encodedDataSize` the size of the encoded blob. If `dataEncoding` is
not specified, `encodedDataSize` should not be specified either.

`dataEncoding` should use the same encoding specifiers as HTTP content codings (e.g. `gzip`).

`sha256` is the SHA256 hash of the specified slice, formatted as 64 hexadecimal characters. Parsers should
parse this case-insensitively.

The `gpt*` fields encode the properties needed when placing these resources in a GPT partition table.

If `gptLabel` is unspecified and the data is written to a GPT partition, the value of `name` should be used as
the label.

`readOnly` is relevant both when the data is written to a GPT partition and when it is written to a regular
file; in the latter case it should be reflected in the 'w' access bit.

`valid*USec` are timestamps in µs since the UNIX epoch that define a validity time window for the resource. The
data should not be accessed or consumed outside of the specified time window. If `validFromUSec` is not
specified it should default to 0; if `validUntilUSec` is not specified it should default to infinity.

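The validity window check can be sketched in Python as follows. The helper name `is_within_validity_window` and its `now_usec` parameter are hypothetical. The sketch treats both bounds as inclusive, which is an assumption derived from the "greater than" and "lower than" wording used in the acquisition steps of this specification.

```python
import time
from typing import Optional

def is_within_validity_window(file_obj: dict, now_usec: Optional[int] = None) -> bool:
    """Check the current time against `validFromUSec` / `validUntilUSec`."""
    if now_usec is None:
        # Current time in microseconds since the UNIX epoch.
        now_usec = time.time_ns() // 1000
    valid_from = file_obj.get("validFromUSec", 0)   # default: 0
    valid_until = file_obj.get("validUntilUSec")    # default: None, i.e. infinity
    if now_usec < valid_from:
        return False
    if valid_until is not None and now_usec > valid_until:
        return False
    return True
```

Passing `now_usec` explicitly makes the check reproducible, for example in tests or when a trusted clock source other than the system clock is used.
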
If neither `dataFile`, nor `dataUrl`, nor `dataLiteral` is specified, the `name` field should be used as a
fallback, as if its value had been given in `dataFile`.

`dataLiteral` may be encoded either in regular Base64 or in URL-safe Base64; consumers must be able to deal
with either.

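One way to accept both Base64 alphabets is to map the URL-safe characters onto the regular ones before decoding. The Python sketch below does this; the function name `decode_data_literal` is hypothetical, and tolerating missing `=` padding is an assumption that goes beyond what this specification states.

```python
import base64

def decode_data_literal(literal: str) -> bytes:
    """Decode `dataLiteral`, accepting regular and URL-safe Base64."""
    # Map the URL-safe alphabet ('-', '_') onto the regular one ('+', '/').
    normalized = literal.translate(str.maketrans("-_", "+/"))
    # Assumption: unpadded input is tolerated, so restore the padding.
    normalized += "=" * (-len(normalized) % 4)
    # validate=True rejects characters outside the Base64 alphabet
    # instead of silently skipping them.
    return base64.b64decode(normalized, validate=True)
```

Invalid input raises `binascii.Error` (a subclass of `ValueError`), which a consumer would treat as a failed acquisition.
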
Additional hash algorithms may be defined in the future.

Sizes, offsets and timestamps shall use non-negative integer JSON numbers, and may use values above 2^53. (This
means that, strictly speaking, the format cannot be read by JSON parsers that exclusively use 64-bit floating
point for JSON numbers, if there are any files of such excessive sizes. Since it is unlikely that files
included in a UAPI.16 manifest file are this large, this should not be a practical limitation.)

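The precision limit can be illustrated in Python, whose `json` module parses integer literals into arbitrary-precision integers and is therefore not affected. The helper names below are hypothetical and only serve as an illustration.

```python
import json

def survives_float64(value: int) -> bool:
    """Return True if `value` is unchanged by a round trip through a 64-bit float."""
    return int(float(value)) == value

def load_manifest_exact(text: str) -> dict:
    # Python's json module parses integer literals into arbitrary-precision
    # ints, so large sizes and timestamps are preserved exactly.
    return json.loads(text)
```

For instance, `survives_float64(2**53)` holds while `survives_float64(2**53 + 1)` does not, since the latter value rounds to 2^53 when stored as a 64-bit float.
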
## Relationship to SHA256SUMS

A UAPI manifest file that only uses the fields `name` and `sha256` can be mapped 1:1 to a SHA256SUMS file
and back.

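A sketch of this mapping in Python is shown below. The function names are hypothetical. The sketch assumes the common `sha256sum` line layout (64 hexadecimal digits, a space, a mode character, then the file name) and does not handle the escaped form that `sha256sum` emits for names containing backslashes.

```python
MEDIA_TYPE = "application/vnd.uapi.manifest"

def manifest_to_sha256sums(manifest: dict) -> str:
    """Render one '<hash>  <name>' line per file object."""
    return "".join(
        f"{f['sha256'].lower()}  {f['name']}\n" for f in manifest["files"]
    )

def sha256sums_to_manifest(text: str) -> dict:
    """Build a manifest that only uses the `name` and `sha256` fields."""
    files = []
    for line in text.splitlines():
        if not line:
            continue
        # 64 hex digits, a space, a mode character (' ' or '*'), then the name.
        digest, name = line[:64], line[66:]
        files.append({"name": name, "sha256": digest.lower()})
    return {"mediaType": MEDIA_TYPE, "files": files}
```

Hashes are lower-cased in both directions, since parsers are expected to treat them case-insensitively anyway.
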
## Relationship to `mtree(5)`

Manifests following this specification should be mappable 1:1 to
[`mtree(5)`](https://man.freebsd.org/cgi/man.cgi?mtree(5)) as long as:

1. Only the `name`, `dataSize`, `sha256` fields are used in the file objects, as per this specification.

2. Only the `size`, `sha256` fields are used in the `mtree(5)` files, and `type` is set to
   `file`. Only regular filenames may be specified as path, i.e. no `/` may be included.

## Acquiring a File Remotely

When acquiring a file listed in a UAPI.16 File Manifest from a web service, the following logic should be
implemented.

1. If `validFromUSec` is set and greater than the current time, the download should immediately fail.
2. If `validUntilUSec` is set and lower than the current time, the download should immediately fail.
3. If `dataLiteral` is set, it should be Base64 decoded. Continue with step 7.
4. Otherwise, if `dataUrl` is set, the data should be acquired from that URL. Continue with step 7.
5. Otherwise, if `dataFile` is set, the data should be acquired from the same URL as the manifest itself, however with the last component of the URL replaced by the file name encoded in `dataFile`, following the same semantics as HTML relative links. Continue with step 7.
6. Otherwise, the data should be acquired as in step 5, but the `name` string should be used as the file name to suffix the URL with.
7. If `encodedDataSize` or `dataSize` are set: the size of the downloaded data shall be checked against `encodedDataSize` (if set) or `dataSize` (otherwise).
8. If `dataEncoding` is set: the downloaded data shall be decoded according to the algorithm indicated in `dataEncoding`.
9. If `dataSize` is set: the size of the resulting decoded data shall be checked against `dataSize`.
10. The byte range indicated by `sliceOffset` (if not set: 0) and `sliceSize` (if not set: to the end of the data) shall be extracted from the decoded data.
11. The hash of the extracted data shall be matched against the hash encoded in `sha256` (if set).

If all these steps succeed, the extracted data from step 10 is the result of the operation.

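The source selection of steps 3 to 6 might look as follows in Python. This is a sketch under assumptions: the function name `data_source` and the `("literal" | "url", value)` return convention are hypothetical, and percent-encoding the file name before resolving it is a choice made here, not something this specification mandates.

```python
from typing import Tuple
from urllib.parse import quote, urljoin

def data_source(file_obj: dict, manifest_url: str) -> Tuple[str, str]:
    """Return ("literal", Base64 text) or ("url", absolute URL) for a file object."""
    if "dataLiteral" in file_obj:           # step 3
        return ("literal", file_obj["dataLiteral"])
    if "dataUrl" in file_obj:               # step 4
        return ("url", file_obj["dataUrl"])
    # Steps 5 and 6: fall back to `name` if `dataFile` is not set.
    file_name = file_obj.get("dataFile", file_obj["name"])
    # urljoin() replaces the last path component of the manifest URL,
    # like a relative link in HTML. Assumption: the file name is
    # percent-encoded first so that characters such as ':' or '?' are
    # not interpreted as URL syntax.
    return ("url", urljoin(manifest_url, quote(file_name)))
```

With a manifest at `https://example.com/images/manifest.json` (a made-up URL), a file object with `"dataFile" : "FooOS.raw"` resolves to `https://example.com/images/FooOS.raw`.
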
Of course, many of the steps described above should typically be done together rather than serially, for
robustness, efficiency and security reasons. For example, if `dataEncoding` is not used, it is recommended to
already include the `sliceOffset` and `sliceSize` fields in HTTP range request headers (in order to avoid
downloading redundant data). Moreover, downloads should fail immediately once the encoded or decoded data
goes beyond the indicated encoded or decoded data sizes. Finally, the decoding should be done on-the-fly while
the data is downloaded, and the checksum should be calculated on-the-fly too.

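Steps 7 to 11 can be combined into a single streaming pass. The Python sketch below is illustrative only: the function name `acquire_slice` and its signature are hypothetical, only `gzip` and unencoded data are handled, and it assumes that the downloaded size is compared against `dataSize` only when no `dataEncoding` is set. It takes the downloaded data as an iterable of byte chunks, so the actual HTTP transfer is out of scope.

```python
import hashlib
import zlib
from typing import BinaryIO, Iterable

def acquire_slice(file_obj: dict, chunks: Iterable[bytes], out: BinaryIO) -> None:
    """Decode, bound-check, slice and hash `chunks` on the fly, writing to `out`."""
    encoding = file_obj.get("dataEncoding")
    if encoding not in (None, "gzip"):
        raise ValueError(f"unsupported dataEncoding: {encoding!r}")
    # 16 + MAX_WBITS makes zlib expect a gzip container.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS) if encoding == "gzip" else None

    dec_limit = file_obj.get("dataSize")
    # Assumption: without an encoding, the download is compared to dataSize.
    enc_limit = file_obj.get("encodedDataSize", dec_limit if decoder is None else None)
    start = file_obj.get("sliceOffset", 0)
    size = file_obj.get("sliceSize")
    end = None if size is None else start + size
    digest = hashlib.sha256()
    enc_seen = dec_seen = 0

    def consume(data: bytes) -> None:
        nonlocal dec_seen
        lo, dec_seen = dec_seen, dec_seen + len(data)
        if dec_limit is not None and dec_seen > dec_limit:
            raise ValueError("decoded data exceeds dataSize")
        # Intersect this piece, [lo, dec_seen), with the slice, [start, end).
        a = max(start, lo)
        b = dec_seen if end is None else min(end, dec_seen)
        if a < b:
            part = data[a - lo:b - lo]
            digest.update(part)
            out.write(part)

    for chunk in chunks:
        enc_seen += len(chunk)
        if enc_limit is not None and enc_seen > enc_limit:
            raise ValueError("downloaded data exceeds the declared size")
        consume(decoder.decompress(chunk) if decoder is not None else chunk)
    if decoder is not None:
        consume(decoder.flush())
        if not decoder.eof:
            raise ValueError("truncated gzip stream")

    if enc_limit is not None and enc_seen != enc_limit:
        raise ValueError("downloaded size mismatch")
    if dec_limit is not None and dec_seen != dec_limit:
        raise ValueError("decoded size mismatch")
    if dec_seen < (start if end is None else end):
        raise ValueError("slice lies outside the decoded data")
    expected = file_obj.get("sha256")
    if expected is not None and expected.lower() != digest.hexdigest():
        raise ValueError("SHA256 mismatch")
```

Because data is written to `out` before the final checks have run, a caller would need to discard the output whenever a `ValueError` is raised, for example by writing to a temporary file and renaming it only on success.
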
## Extensibility

Additional fields can be defined freely by implementors. Each such extension field should be named in the
style of `x<Vendor>Foobar` to minimize the risk of conflicts. Examples: `xAmutableFrobnicator` or
`xMyProjectWeight`.

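A consumer may want to tell extension fields apart from standard ones. The Python sketch below interprets the `x<Vendor>Foobar` style as a lowercase `x` followed by an uppercase letter and further ASCII letters or digits; this pattern is an assumption, and the helper name `extension_fields` is hypothetical.

```python
import re

# Assumed pattern: lowercase 'x', then an uppercase letter starting the
# vendor name, then any number of ASCII letters or digits.
EXTENSION_FIELD = re.compile(r"x[A-Z][A-Za-z0-9]*")

def extension_fields(obj: dict) -> dict:
    """Return only those fields of `obj` that look like vendor extensions."""
    return {k: v for k, v in obj.items() if EXTENSION_FIELD.fullmatch(k)}
```
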
## Example

```json
{
    "mediaType" : "application/vnd.uapi.manifest",
    "files" : [
        {
            "name" : "FooOS.raw",
            "dataEncoding": "gzip",
            "encodedDataSize": 5642649603,
            "dataSize" : 7523532800,
            "sha256" : "922a9bae0e02b4ffac3e5ed5054230d0689b9c2e25b0178ba82b925f2a0c3e48",
            "validUntilUSec" : 1776856773123234
        },
        {
            "name" : "FooOS_esp.raw",
            "dataFile" : "FooOS.raw",
            "dataEncoding": "gzip",
            "encodedDataSize": 5642649603,
            "dataSize" : 7523532800,
            "sliceOffset" : 2097152,
            "sliceSize" : 149175808,
            "gptLabel" : "EFI System Partition",
            "gptTypeUuid" : "c12a7328-f81f-11d2-ba4b-00a0c93ec93b",
            "sha256" : "5dcfd837a4868550cc61c256d9567a974e32a20985afa9e100b8b96755a20cae",
            "validUntilUSec" : 1776856773123234
        },
        {
            "name" : "FooOS_root.raw",
            "dataFile" : "FooOS.raw",
            "dataEncoding": "gzip",
            "encodedDataSize": 5642649603,
            "dataSize" : 7523532800,
            "sliceOffset" : 351272960,
            "sliceSize" : 234003200,
            "sha256" : "00b70a0813e15f309828d3a36156283cba87576c26755e0b2d4cf0951eff8163",
            "gptLabel" : "FooOS_root",
            "gptTypeUuid": "4f68bce3-e8cd-4db1-96e7-fbcaf984b709",
            "readOnly": true,
            "validUntilUSec" : 1776856773123234
        }
    ]
}
```

The above provides three separate files `FooOS.raw`, `FooOS_esp.raw`, `FooOS_root.raw`. The latter two are
defined as slices of the former, each encapsulating an individual partition. The data file is encoded via
`gzip`.
