|
| 1 | +--- |
| 2 | +title: UAPI.16 File Manifest |
| 3 | +layout: posts |
| 4 | +version: 1.0 |
| 5 | +weight: 6 |
| 6 | +aliases: |
| 7 | +- /UAPI.16 |
| 8 | +- /16 |
| 9 | +--- |
| 10 | + |
| 11 | +# UAPI.16 File Manifest |
| 12 | + |
| 13 | +| Version | Changes | |
| 14 | +|---------|-----------------| |
| 15 | +| 1.0 | Initial Release | |
| 16 | + |
| 17 | +This format stores information about static, immutable data resources that can be acquired over the network |
| 18 | +and stored on a disk. |
| 19 | + |
| 20 | +This manifest file format is inspired by the traditional UNIX `SHA256SUMS` and BSD `mtree(5)` file formats, |
| 21 | +but has a larger set of features: |
| 22 | + |
| 23 | +1. It allows declaring the encoding of data files (e.g. `gzip` compression). |
| 24 | +2. It allows referencing remote URLs as source for acquiring file data. |
| 25 | +3. It allows inline specification of file data (suitable for short data only). |
| 26 | +4. It allows referencing arbitrary local files as source for acquiring file data. |
| 27 | +5. It allows extracting data "slices" from data sources, via offset and size. |
| 28 | +6. It may store both encoded and decoded data sizes. |
| 29 | +7. Various fields of additional per-file metadata may be defined, including various fields for GPT partition table metadata. |
| 30 | +8. It's an extensible format permitting vendors and projects to add their own per-file and per-manifest fields. |
| 31 | + |
| 32 | +This format is designed with tools such as `mkosi` and `systemd-sysupdate` in mind, but it's intended to be |
| 33 | +generally useful as a way to describe collections of files that may be acquired over the network or from |
| 34 | +local storage, and verified cryptographically. |
| 35 | + |
| 36 | +## Manifest File Name and Media Type |
| 37 | + |
| 38 | +When stored in a file system directory – alongside the data files it references – the manifest file should be |
| 39 | +named `Uapi16ManifestFile`. |
| 40 | + |
| 41 | +Similarly, when placed on an HTTP server it's a good idea to name the last component of the URL |
| 42 | +`/Uapi16ManifestFile`. This is only a suggestion, intended to minimize surprises and maximize |
| 43 | +discoverability. |
| 44 | + |
| 45 | +When served via an HTTP server it's recommended to use the media type `application/vnd.uapi.16.file.manifest`. |
| 46 | + |
| 47 | +## Manifest Object |
| 48 | + |
| 49 | +This is the top-level object for a manifest file. It's designed to be |
| 50 | +extensible and self-identifies as a manifest. |
| 51 | + |
| 52 | +```json |
| 53 | +{ |
| 54 | + "mediaType" : "application/vnd.uapi.16.file.manifest", |
| 55 | + "files" : [ … ] |
| 56 | +} |
| 57 | +``` |
| 58 | + |
| 59 | +### Notes |
| 60 | + |
| 61 | +The `files` field is an array of file objects, see below. |
| 62 | + |
| 63 | +Files may appear in any order in the manifest, but it's recommended to sort them alphabetically by their name. |
| 64 | + |
| 65 | +The `name` fields of file objects are supposed to be uniquely assigned within each manifest file. |
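
A consumer might check the two properties above before processing a manifest. The following is a minimal sketch in Python. The helper names `duplicate_names` and `is_sorted_by_name` are hypothetical, and plain code-point ordering is an assumption, since the text does not define "alphabetically" more precisely.

```python
from collections import Counter


def duplicate_names(manifest: dict) -> list:
    """Return the sorted list of `name` values that occur more than once."""
    counts = Counter(f["name"] for f in manifest["files"])
    return sorted(name for name, count in counts.items() if count > 1)


def is_sorted_by_name(manifest: dict) -> bool:
    """Check the recommended ordering (assumption: code-point order)."""
    names = [f["name"] for f in manifest["files"]]
    return names == sorted(names)
```

An empty result from `duplicate_names` means the uniqueness expectation holds. The ordering check is advisory only, as files may appear in any order.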
| 66 | + |
| 67 | +## File Object |
| 68 | + |
| 69 | +This is an object that encodes information about an individual file. |
| 70 | + |
| 71 | +```json |
| 72 | +{ |
| 73 | + "name" : "…", |
| 74 | + "dataEncoding" : "gzip", |
| 75 | + "encodedDataSize" : 2011, |
| 76 | + "dataSize" : 4711, |
| 77 | + "dataFile" : "…", |
| 78 | + "dataUrl" : "http://…", |
| 79 | + "dataLiteral" : "TmV2ZXIgZ29ubmEgZ2l2ZSB5b3UgdXAsIG5ldmVyIGdvbm5hIGxldCB5b3UgZG93bg==", |
| 80 | + "sliceOffset" : 22, |
| 81 | + "sliceSize" : 33, |
| 82 | + "sha256" : "…", |
| 83 | + "gptLabel" : "…", |
| 84 | + "gptTypeUuid" : "…", |
| 85 | + "gptFlagNoAuto" : true, |
| 86 | + "gptFlagGrowFileSystem" : true, |
| 87 | + "readOnly" : true, |
| 88 | + "validAfterUSec": 4711, |
| 89 | + "validBeforeUSec" : 4711, |
| 90 | + "tags" : ["tag1", "tag2", …], |
| 91 | + "revoked" : false, |
| 92 | + "steppingStone": false, |
| 93 | +} |
| 94 | +``` |
| 95 | + |
| 96 | +### Notes |
| 97 | + |
| 98 | +All fields but `name` are optional. If a field shall not be set it can either be set to `null` or simply |
| 99 | +omitted in the JSON object; both ways shall be considered equivalent. |
| 100 | + |
| 101 | +If not otherwise indicated all fields are supposed to be of type string. |
| 102 | + |
| 103 | +The `name` must be a valid UTF-8 POSIX filename, may not contain control characters (ASCII 0…31, 127), may |
| 104 | +not contain a slash, and may not be identical to "`.`", "`..`", "``" (but may contain these strings). It may |
| 105 | +be at most 255 bytes long. Files whose names do not match these rules cannot be encoded in manifest |
| 106 | +files defined by this specification. |
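
The naming rules above can be expressed as a small predicate. This is a sketch in Python, not a reference implementation. The function name `is_valid_name` is hypothetical, and the 255-byte limit is assumed to apply to the UTF-8 encoding of the name.

```python
def is_valid_name(name: str) -> bool:
    """Check a candidate `name` against the rules described above."""
    # The empty string, "." and ".." are rejected outright.
    if name in ("", ".", ".."):
        return False
    # No path separators.
    if "/" in name:
        return False
    # No control characters (ASCII 0...31 and 127).
    if any(ord(c) < 32 or ord(c) == 127 for c in name):
        return False
    try:
        encoded = name.encode("utf-8")
    except UnicodeEncodeError:
        # Lone surrogates cannot be represented as valid UTF-8.
        return False
    # Assumption: the length limit counts UTF-8 bytes.
    return len(encoded) <= 255
```

Names such as `..foo` pass, since only an exact match of `.` or `..` is excluded.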
| 107 | + |
| 108 | +At most one of `dataFile`, `dataUrl`, `dataLiteral` should be set. `dataFile` references a file in the same |
| 109 | +place as the manifest file itself, `dataUrl` a full URL, and `dataLiteral` inline Base64. `dataUrl` may use |
| 110 | +`http://…` and `https://…` schemes only. |
| 111 | + |
| 112 | +`dataSize` is the size of the decoded blob, `encodedDataSize` of the encoded blob. If `dataEncoding` is not |
| 113 | +specified, `encodedDataSize` should not be specified. Both are of unsigned integer type. |
| 114 | + |
| 115 | +`dataEncoding` should use the same encoding specifiers as HTTP. |
| 116 | + |
| 117 | +`sliceOffset`, `sliceSize` (unsigned integers) are applied to the *decoded* data blob (i.e. after |
| 118 | +decompression). If not specified explicitly, they should default to a zero offset, and a slice size equal to |
| 119 | +the full decoded size. |
| 120 | + |
| 121 | +`sha256` is the SHA256 hash of the specified slice, formatted in 64 hexadecimal characters. Parsers should |
| 122 | +parse this case-insensitively. |
| 123 | + |
| 124 | +The `gpt*` fields encode fields that we need when placing these resources in a GPT partition table entry. The |
| 125 | +`gptFlag*` fields are of boolean type. The `gptTypeUuid` field takes a UUID in traditional string formatting. |
| 126 | + |
| 127 | +The `gptLabel` field may not be longer than 72 characters. If `gptLabel` is unspecified, and the data is |
| 128 | +written to a GPT partition it should use the value of `name` as label instead. Note that this might require |
| 129 | +mangling, as GPT partition labels have smaller size limits than file names. |
| 130 | + |
| 131 | +`readOnly` (a boolean) is relevant for initializing the read-only bit of GPT partition table entries. It |
| 132 | +should also be reflected in the POSIX 'w' access mode bit when storing the data in a regular file on disk. If |
| 133 | +unspecified, defaults to false. |
| 134 | + |
| 135 | +The `valid*USec` fields are in µs since the UNIX epoch (UTC), and shall be unsigned integers. They define a validity time |
| 136 | +window of the resource. The data should not be accessed or consumed outside of the specified time window. If |
| 137 | +`validAfterUSec` is not specified it should default to 0; if `validBeforeUSec` is not specified it should |
| 138 | +default to the largest supported unsigned integer value. |
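
A validity check along these lines might look as follows in Python. This is a sketch under assumptions: the function name `is_within_validity_window` is hypothetical, and "largest supported unsigned integer value" is taken to mean the largest 64-bit value.

```python
import time

# Assumption: 64-bit unsigned integers are the largest supported type.
U64_MAX = 2**64 - 1


def is_within_validity_window(file_obj: dict, now_usec=None) -> bool:
    """Return True if `now_usec` lies inside the file's validity window."""
    if now_usec is None:
        # Current time in microseconds since the UNIX epoch.
        now_usec = time.time_ns() // 1000
    after = file_obj.get("validAfterUSec")
    before = file_obj.get("validBeforeUSec")
    # A missing field and an explicit null are treated alike.
    after = 0 if after is None else after
    before = U64_MAX if before is None else before
    return after <= now_usec <= before
```

Passing `now_usec` explicitly keeps the check deterministic in tests. In normal use it can be omitted.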
| 139 | + |
| 140 | +If neither `dataFile`, nor `dataUrl`, nor `dataLiteral` are specified, the `name` field should be used as |
| 141 | +fallback equivalent to it being repeated in `dataFile`. |
| 142 | + |
| 143 | +`dataLiteral` may be encoded either in regular Base64 or in URL-safe Base64; consumers must be able to deal |
| 144 | +with either. |
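
One way to accept both alphabets is to normalize the URL-safe characters before decoding. The Python sketch below illustrates this. The function name `decode_data_literal` is hypothetical, and tolerating missing `=` padding is an extra assumption that the text above does not require.

```python
import base64


def decode_data_literal(literal: str) -> bytes:
    """Decode `dataLiteral` given in regular or URL-safe Base64."""
    # Map the URL-safe alphabet ("-", "_") onto the regular one ("+", "/").
    normalized = literal.replace("-", "+").replace("_", "/")
    # Assumption: padding may have been stripped, so restore it.
    normalized += "=" * (-len(normalized) % 4)
    # validate=True makes characters outside the alphabet an error
    # (binascii.Error) instead of silently discarding them.
    return base64.b64decode(normalized, validate=True)
```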
| 145 | + |
| 146 | +`tags` is an array of arbitrary use-case-specific strings. This may be used for filtering, for example to |
| 147 | +categorize files in various ways (for example, files could be tagged "unstable", "stable", "lto", to indicate |
| 148 | +stability levels, or similar). |
| 149 | + |
| 150 | +`revoked` is a boolean. If true, the file should not be requested anymore, and if it has been acquired |
| 151 | +before, appropriate steps should be taken to invalidate/remove it from use, if possible (details are |
| 152 | +implementation-specific). If unspecified, defaults to `false`. |
| 153 | + |
| 154 | +`steppingStone` is a boolean, defaulting to `false`. When this manifest file is processed by a software |
| 155 | +update tool, and lists multiple versions of the same resource in individual files, then the update tool |
| 156 | +should ensure that files where this property is set to `true` are never skipped for version updates, and |
| 157 | +installed and activated before considering any later versions of the same resource. |
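
The effect of `steppingStone` on an update path can be illustrated with a short Python sketch. How versions are identified and ordered is outside the scope of this format, so the function `plan_update_path` (a hypothetical name) simply assumes its input already holds the file objects for all versions newer than the installed one, ordered oldest to newest.

```python
def plan_update_path(newer_files: list) -> list:
    """Return the names to install, in order.

    Assumption: `newer_files` lists only versions newer than the installed
    one, oldest first. Version ordering itself is not defined here.
    """
    if not newer_files:
        return []
    # Every stepping stone before the newest version must be installed.
    path = [f["name"] for f in newer_files[:-1] if f.get("steppingStone")]
    # The newest version is always the final target.
    path.append(newer_files[-1]["name"])
    return path
```

Versions that are not stepping stones are skipped, except for the newest one.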
| 158 | + |
| 159 | +Additional hash algorithms may be defined in the future. |
| 160 | + |
| 161 | +Sizes, offsets, timestamps shall use non-negative integer JSON numbers, and may use values above 2⁵³. (This means |
| 162 | +– strictly speaking – the format cannot be processed losslessly by JSON parsers that exclusively use 64-bit |
| 163 | +floating point for JSON numbers, if there are any files of such excessive sizes. Since it's unlikely that |
| 164 | +files included in a UAPI.16 manifest file are this large this should not be a practical limitation.) |
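
The precision issue can be demonstrated with Python's standard `json` module, which parses integers exactly by default. The sketch below uses `parse_int=float` to imitate a parser that stores every number as a 64-bit float. The file name and size are made up for illustration.

```python
import json

# 2**53 + 1 is the smallest positive integer a 64-bit float cannot represent.
text = '{"name": "huge.raw", "dataSize": 9007199254740993}'

# Default behaviour: arbitrary-precision integers, no loss.
exact = json.loads(text)["dataSize"]

# Imitating a float-only parser: the value is rounded to 2**53.
lossy = json.loads(text, parse_int=float)["dataSize"]

print(exact == 2**53 + 1)
print(int(lossy) == 2**53)
```

Both comparisons hold, which shows that the float-only path silently changes the value by one.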
| 165 | + |
| 166 | +## Relationship to SHA256SUMS |
| 167 | + |
| 168 | +A UAPI.16 manifest file that only uses the fields `name` and `sha256` can be mapped 1:1 to a SHA256SUMS file |
| 169 | +and back. |
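
The mapping can be sketched in Python as follows. This assumes the common `sha256sum` line layout of a 64-character hash, a space, a mode character (space or `*`) and the file name. The function names are hypothetical, and the backslash escaping that `sha256sum` applies to unusual file names is not handled.

```python
MEDIA_TYPE = "application/vnd.uapi.16.file.manifest"


def manifest_to_sha256sums(manifest: dict) -> str:
    """Render one '<hash>  <name>' line per file object."""
    return "".join(
        f"{f['sha256'].lower()}  {f['name']}\n" for f in manifest["files"]
    )


def sha256sums_to_manifest(text: str) -> dict:
    """Parse SHA256SUMS text back into a manifest object."""
    files = []
    for line in text.split("\n"):
        if not line:
            continue
        # 64 hex digits, one space, one mode character, then the name.
        digest, name = line[:64], line[66:]
        files.append({"name": name, "sha256": digest.lower()})
    return {"mediaType": MEDIA_TYPE, "files": files}
```

Converting a manifest to text and back yields the same `name` and `sha256` pairs, provided the hashes were lowercase to begin with.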
| 170 | + |
| 171 | +## Relationship to `mtree(5)` |
| 172 | + |
| 173 | +Manifests following this specification should be mappable 1:1 to |
| 174 | +[`mtree(5)`](https://man.freebsd.org/cgi/man.cgi?mtree(5)) as long as: |
| 175 | + |
| 176 | +1. Only the `name`, `dataSize`, `sha256` fields are used in the file objects, as per this specification. |
| 177 | + |
| 178 | +2. Only the `size`, `sha256` fields are used in the `mtree(5)` files, and `type` is set to |
| 179 | + `file`. Only regular filenames may be specified as path, i.e. no `/` may be included. |
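
As an illustration, a file object restricted to these fields could be rendered as an `mtree(5)` entry like this. The Python sketch assumes the usual entry layout of a path followed by space-separated `keyword=value` pairs. The function name `file_to_mtree_line` is hypothetical, and names containing characters that would need `mtree(5)` escaping are rejected rather than escaped.

```python
import re

# Conservative assumption: only names made of these characters are emitted.
# Anything else would need the escaping rules of mtree(5).
_SAFE_NAME = re.compile(r"[A-Za-z0-9._+-]+")


def file_to_mtree_line(file_obj: dict) -> str:
    """Render a file object as a single mtree(5)-style entry."""
    name = file_obj["name"]
    if not _SAFE_NAME.fullmatch(name):
        raise ValueError(f"name needs escaping, not handled here: {name!r}")
    parts = [name, "type=file"]
    if file_obj.get("dataSize") is not None:
        parts.append(f"size={file_obj['dataSize']}")
    if file_obj.get("sha256") is not None:
        parts.append(f"sha256={file_obj['sha256'].lower()}")
    return " ".join(parts)
```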
| 180 | + |
| 181 | +## Acquiring a File Remotely |
| 182 | + |
| 183 | +When acquiring a file listed in a UAPI.16 File Manifest from a web service, the following logic should be |
| 184 | +implemented. |
| 185 | + |
| 186 | +1. If `revoked` is set to true, the download should immediately fail. |
| 187 | +2. If `validAfterUSec` is set and greater than the current time, the download should immediately fail. |
| 188 | +3. If `validBeforeUSec` is set and lower than the current time, the download should immediately fail. |
| 189 | +4. If `dataLiteral` is set, it should be Base64 decoded; continue with step 8. |
| 190 | +5. Otherwise, if `dataUrl` is set, the data from the URL should be acquired; continue with step 8. |
| 191 | +6. Otherwise, if `dataFile` is set, the data should be acquired from the same URL as the manifest itself, however with the last component of the URL replaced by the file name encoded in `dataFile`, following the same semantics as HTML relative links. Continue in step 8. |
| 192 | +7. Otherwise, the data should be acquired as in step 6, but with the `name` string used as the file name in place of `dataFile`. |
| 193 | +8. If `encodedDataSize` or `dataSize` are set: the size of the downloaded data shall be checked against `encodedDataSize` (if set), or otherwise against `dataSize` (only if `dataEncoding` is not set, since `dataSize` describes the decoded data). |
| 194 | +9. If `dataEncoding` is set: the downloaded data shall be decoded according to the algorithm indicated in `dataEncoding`. |
| 195 | +10. If `dataSize` is set: the resulting decoded data shall be checked against `dataSize`. |
| 196 | +11. The byte range indicated by `sliceOffset` (if not set: 0) and `sliceSize` (if not set, to the end of the file) shall be extracted from the decoded data. |
| 197 | +12. The hash of the extracted data shall be matched against the hash encoded in `sha256` (if set). |
| 198 | + |
| 199 | +If all these steps succeed the extracted data from step 11 is the result of the operation. |
| 200 | + |
| 201 | +Of course, many of the steps described above should typically be done together rather than serially for |
| 202 | +robustness, efficiency and security reasons. For example, if `dataEncoding` is not used, it is recommended to |
| 203 | +include the `sliceOffset` and `sliceSize` fields in HTTP range request fields already (in order to avoid |
| 204 | +downloading redundant data). Moreover, downloads should fail immediately once the encoded or decoded data |
| 205 | +goes beyond the indicated encoded or decoded data sizes. Then, the decoding should be done on-the-fly while |
| 206 | +the data is downloaded, and the checksum should be calculated on-the-fly too. |
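
The source selection and verification steps can be sketched in Python as shown below. This is a simplified, serial illustration that works on fully downloaded data, not the streaming approach recommended above. The function names are hypothetical, only `gzip` is handled, percent-encoding of unusual file names is not addressed, and the revocation and validity checks of steps 1 to 3 are left out.

```python
import gzip
import hashlib
from urllib.parse import urljoin


def resolve_source_url(manifest_url: str, file_obj: dict):
    """Steps 4-7: return the URL to fetch, or None if `dataLiteral` is set."""
    if file_obj.get("dataLiteral") is not None:
        return None
    if file_obj.get("dataUrl") is not None:
        return file_obj["dataUrl"]
    ref = file_obj.get("dataFile")
    if ref is None:
        ref = file_obj["name"]  # fallback, as if repeated in `dataFile`
    # Relative resolution replaces the last component of the manifest URL.
    return urljoin(manifest_url, ref)


def verify_and_extract(file_obj: dict, downloaded: bytes) -> bytes:
    """Steps 8-12, applied to data that was already acquired."""
    encoding = file_obj.get("dataEncoding")
    data_size = file_obj.get("dataSize")

    # Step 8. Interpretation: `dataSize` is compared against the downloaded
    # data only when no encoding is in use.
    expected = file_obj.get("encodedDataSize")
    if expected is None and encoding is None:
        expected = data_size
    if expected is not None and len(downloaded) != expected:
        raise ValueError("downloaded size mismatch")

    # Step 9. Assumption: only gzip is supported in this sketch.
    if encoding is None:
        decoded = downloaded
    elif encoding == "gzip":
        decoded = gzip.decompress(downloaded)
    else:
        raise ValueError(f"unsupported dataEncoding: {encoding}")

    # Step 10.
    if data_size is not None and len(decoded) != data_size:
        raise ValueError("decoded size mismatch")

    # Step 11. A missing size means "to the end of the decoded data".
    offset = file_obj.get("sliceOffset") or 0
    size = file_obj.get("sliceSize")
    if size is None:
        size = len(decoded) - offset
    if size < 0 or offset + size > len(decoded):
        raise ValueError("slice lies outside the decoded data")
    extracted = decoded[offset:offset + size]

    # Step 12. Hash comparison is case-insensitive.
    digest = file_obj.get("sha256")
    if digest is not None and hashlib.sha256(extracted).hexdigest() != digest.lower():
        raise ValueError("sha256 mismatch")
    return extracted
```

A production implementation would decode and hash incrementally while downloading, and abort as soon as a size limit is exceeded.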
| 207 | + |
| 208 | +## Extensibility |
| 209 | + |
| 210 | +Additional fields can be defined freely by implementors. Each such extension field should be named in the |
| 211 | +style of `x<Vendor>Foobar` to minimize risk of conflicts. Examples: `xAmutableFrobnicator` or |
| 212 | +`xMyProjectWeight`. |
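
A consumer that wants to separate extension fields from the rest might use a pattern match on the field name. The Python sketch below assumes that the vendor part starts with an uppercase ASCII letter, as in the examples. The text does not state this as a rule, and the function name `extension_fields` is hypothetical.

```python
import re

# Assumption: "x" followed by an uppercase letter marks an extension field.
_EXTENSION_FIELD = re.compile(r"x[A-Z][A-Za-z0-9]*")


def extension_fields(obj: dict) -> dict:
    """Return only the vendor extension fields of a file or manifest object."""
    return {k: v for k, v in obj.items() if _EXTENSION_FIELD.fullmatch(k)}
```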
| 213 | + |
| 214 | +## Future Revisions |
| 215 | + |
| 216 | +This specification is intended to be incrementally improved. Additional fields may be defined at any time, |
| 217 | +with additional, optional metadata. Should a breaking change be necessary the media type will be changed. |
| 218 | + |
| 219 | +A number of future extensions are envisioned: |
| 220 | + |
| 221 | +* Inline cryptographic signatures |
| 222 | + |
| 223 | +## Example |
| 224 | + |
| 225 | +```json |
| 226 | +{ |
| 227 | + "mediaType" : "application/vnd.uapi.16.file.manifest", |
| 228 | + "files" : [ |
| 229 | + { |
| 230 | + "name" : "FooOS.raw", |
| 231 | + "dataEncoding": "gzip", |
| 232 | + "encodedDataSize": 5642649603, |
| 233 | + "dataSize" : 7523532800, |
| 234 | + "sha256" : "922a9bae0e02b4ffac3e5ed5054230d0689b9c2e25b0178ba82b925f2a0c3e48", |
| 235 | + "validBeforeUSec" : 1776856773123234 |
| 236 | + }, |
| 237 | + { |
| 238 | + "name" : "FooOS_esp.raw", |
| 239 | + "dataFile" : "FooOS.raw", |
| 240 | + "dataEncoding": "gzip", |
| 241 | + "encodedDataSize": 5642649603, |
| 242 | + "dataSize" : 7523532800, |
| 243 | + "sliceOffset" : 2097152, |
| 244 | + "sliceSize" : 149175808, |
| 245 | + "gptLabel" : "EFI System Partition", |
| 246 | + "gptTypeUuid" : "c12a7328-f81f-11d2-ba4b-00a0c93ec93b", |
| 247 | + "sha256" : "5dcfd837a4868550cc61c256d9567a974e32a20985afa9e100b8b96755a20cae", |
| 248 | + "validBeforeUSec" : 1776856773123234 |
| 249 | + }, |
| 250 | + { |
| 251 | + "name" : "FooOS_root.raw", |
| 252 | + "dataFile" : "FooOS.raw", |
| 253 | + "dataEncoding": "gzip", |
| 254 | + "encodedDataSize": 5642649603, |
| 255 | + "dataSize" : 7523532800, |
| 256 | + "sliceOffset" : 351272960, |
| 257 | + "sliceSize" : 234003200, |
| 258 | + "sha256" : "00b70a0813e15f309828d3a36156283cba87576c26755e0b2d4cf0951eff8163", |
| 259 | + "gptLabel" : "FooOS_root", |
| 260 | + "gptTypeUuid": "4f68bce3-e8cd-4db1-96e7-fbcaf984b709", |
| 261 | + "readOnly": true, |
| 262 | + "validBeforeUSec" : 1776856773123234 |
| 263 | + } |
| 264 | + ] |
| 265 | +} |
| 266 | +``` |
| 267 | + |
| 268 | +The above provides three separate files `FooOS.raw`, `FooOS_esp.raw`, `FooOS_root.raw`. The latter two are |
| 269 | +defined as slices of the former, each encapsulating an individual partition. The data file is encoded via |
| 270 | +`gzip`. |