|
| 1 | +--- |
| 2 | +title: UAPI.16 File Manifests |
| 3 | +layout: posts |
| 4 | +version: 1.0 |
| 5 | +weight: 6 |
| 6 | +aliases: |
| 7 | +- /UAPI.16 |
| 8 | +- /16 |
| 9 | +--- |
| 10 | + |
| 11 | +# UAPI.16 File Manifest |
| 12 | + |
| 13 | +| Version | Changes | |
| 14 | +|---------|-----------------| |
| 15 | +| 1.0 | Initial Release | |
| 16 | + |
| 17 | +This format stores information about static, immutable data resources that can be acquired and stored on a |
| 18 | +POSIX file system or written to a GPT disk partition. |
| 19 | + |
| 20 | +This manifest file format is inspired by the traditional UNIX `SHA256SUMS` and BSD `mtree(5)` file formats, |
| 21 | +but supersedes them in various ways: |
| 22 | + |
| 23 | +1. It allows declaring encoding of data files (i.e. `gzip` compression and suchlike) |
| 24 | +2. It allows referencing remote URLs as source for acquiring file data |
| 25 | +3. It allows inline specification of file data (suitable for short data only) |
| 26 | +4. It allows referencing arbitrary local files as source for acquiring file data |
| 27 | +5. It allows extracting data "slices" from data sources, via offset and size |
| 28 | +6. It may store both encoded and decoded data sizes |
| 29 | +7. Various fields of additional per-file metadata may be defined, including various fields for writing data to GPT partition tables |
| 30 | +8. It's an extensible format permitting vendors and projects to add their own per-file and per-manifest fields. |
| 31 | + |
| 32 | +Among other things, manifests in this format are intended to be generated by `mkosi` and consumed by `systemd-sysupdate`. |
| 33 | + |
| 34 | +## Manifest Object |
| 35 | + |
| 36 | +This is the top-level object for a manifest file. It's designed to be |
| 37 | +extensible and self-identifies as a manifest. |
| 38 | + |
| 39 | +```json |
| 40 | +{ |
| 41 | + "mediaType" : "application/vnd.uapi.manifest", |
| 42 | + "files" : [ … ] |
| 43 | +} |
| 44 | +``` |
| 45 | + |
| 46 | +### Notes |
| 47 | + |
| 48 | +The `files` field is just an array of file objects, see below. |
| 49 | + |
| 50 | +We might eventually add additional fields to the manifest, for in-line signatures and similar. |
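
A consumer might check the top-level object along these lines. This is only a minimal sketch: the helper name `load_manifest` is hypothetical, and treating a missing or different `mediaType` as an error is an assumption, since the text above only says the object self-identifies.

```python
import json

MEDIA_TYPE = "application/vnd.uapi.manifest"


def load_manifest(text: str) -> dict:
    """Parse a manifest and check its top-level shape (hypothetical helper)."""
    manifest = json.loads(text)
    if not isinstance(manifest, dict):
        raise ValueError("manifest must be a JSON object")
    # Assumption: a manifest without the expected mediaType is rejected.
    if manifest.get("mediaType") != MEDIA_TYPE:
        raise ValueError("not a UAPI.16 manifest")
    # Assumption: a missing "files" field is treated like an empty array.
    if not isinstance(manifest.get("files", []), list):
        raise ValueError("'files' must be an array")
    return manifest
```

Unknown top-level fields are deliberately left alone, so that fields added later do not break older consumers.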
| 51 | + |
| 52 | +## Files Object |
| 53 | + |
| 54 | +This is an object that encodes information about an individual file. |
| 55 | + |
| 56 | +```json |
| 57 | + "name" : "…", |
| 58 | + "dataEncoding" : "gzip", |
| 59 | + "encodedDataSize" : 2011, |
| 60 | + "dataSize" : 4711, |
| 61 | + "dataFile" : "…", |
| 62 | + "dataUrl" : "http://…", |
| 63 | + "dataLiteral" : "base64…", |
| 64 | + "sliceOffset" : 22, |
| 65 | + "sliceSize" : 33, |
| 66 | + "sha256" : "…", |
| 67 | + "gptLabel" : "…", |
| 68 | + "gptTypeUuid" : "…", |
| 69 | + "gptFlagNoAuto" : true, |
| 70 | + "gptFlagGrowFileSystem" : true, |
| 71 | + "readOnly" : true, |
| 72 | + "validFromUSec": 4711, |
| 73 | + "validUntilUSec" : 4711 |
| 74 | +} |
| 76 | +``` |
| 77 | + |
| 78 | +### Notes |
| 79 | + |
| 80 | +All fields but `name` are optional. |
| 81 | + |
| 82 | +The `name` must be a valid UTF-8 POSIX filename, may not contain control characters (ASCII 0…31, 127), may not |
| 83 | +contain a slash, and may not be `.`, `..` or the empty string. It may have a maximum length of 255 bytes. |
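
The constraints on `name` can be sketched as a small validation routine. The function name `is_valid_name` is hypothetical; the checks follow the paragraph above one by one.

```python
def is_valid_name(name: str) -> bool:
    """Check a `name` value against the constraints listed above."""
    # The empty string, "." and ".." are not permitted.
    if name in ("", ".", ".."):
        return False
    # No slashes: the name is a single path component.
    if "/" in name:
        return False
    # No control characters (ASCII 0…31 and 127).
    if any(ord(c) < 32 or ord(c) == 127 for c in name):
        return False
    try:
        encoded = name.encode("utf-8")
    except UnicodeEncodeError:
        # Lone surrogates cannot be represented in valid UTF-8.
        return False
    # The length limit is counted in bytes, not in characters.
    return len(encoded) <= 255
```

Note that the byte limit matters for non-ASCII names: 128 two-byte characters already exceed it.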
| 84 | + |
| 85 | +Only one of `dataFile`, `dataUrl`, `dataLiteral` should be set. `dataFile` references a file in the same place |
| 86 | +as the manifest, `dataUrl` a full URL, and `dataLiteral` carries the data inline, encoded in Base64. |
| 87 | + |
| 88 | +`sliceOffset` and `sliceSize` apply to the *decoded* data blob. If not specified explicitly, they default |
| 89 | +to a zero offset and a slice that extends to the end of the decoded data. |
| 90 | + |
| 91 | +`dataSize` is the size of the decoded blob, `encodedDataSize` of the encoded blob. If `dataEncoding` is not |
| 92 | +specified, `encodedDataSize` should not be specified. |
| 93 | + |
| 94 | +`dataEncoding` should use the same encoding specifiers as HTTP, i.e. the content-coding names used in the `Content-Encoding` header field, such as `gzip`. |
| 95 | + |
| 96 | +`sha256` is the SHA256 hash of the specified slice, formatted as 64 hexadecimal characters. Parsers should |
| 97 | +parse this case-insensitively. |
| 98 | + |
| 99 | +The `gpt*` fields encode fields that we need when placing these resources in a GPT partition table. |
| 100 | + |
| 101 | +If `gptLabel` is unspecified, and the data is written to a GPT partition it should use the value of `name` as |
| 102 | +label. |
| 103 | + |
| 104 | +`readOnly` is relevant both when the data is written to a GPT partition and when it is written to a |
| 105 | +regular file, where it should be reflected in the 'w' access bit. |
| 106 | + |
| 107 | +`validFromUSec` and `validUntilUSec` are timestamps in µs since the UNIX epoch and define a validity time window for the resource. The data should not |
| 108 | +be accessed or consumed outside of the specified time window. If `validFromUSec` is not specified it should |
| 109 | +default to 0, if `validUntilUSec` is not specified it should default to infinity. |
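
A validity check might look as follows. This is a sketch under assumptions: the helper name is hypothetical, the field names are taken from the JSON example above, and the boundaries are treated as inclusive, which the text does not state explicitly.

```python
import time
from typing import Optional


def is_within_validity_window(entry: dict, now_usec: Optional[int] = None) -> bool:
    """Return True if `now_usec` lies within the entry's validity window."""
    if now_usec is None:
        # Current wall-clock time in µs since the UNIX epoch.
        now_usec = time.time_ns() // 1000
    valid_from = entry.get("validFromUSec", 0)   # default: 0
    valid_until = entry.get("validUntilUSec")    # default: infinity (None here)
    if now_usec < valid_from:
        return False
    if valid_until is not None and now_usec > valid_until:
        return False
    return True
```

Passing `now_usec` explicitly keeps the check testable and lets a caller use one consistent timestamp for a whole manifest.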
| 110 | + |
| 111 | +If none of `dataFile`, `dataUrl`, `dataLiteral` is specified, the `name` field should be used as |
| 112 | +fallback, as if its value had been repeated in `dataFile`. |
| 113 | + |
| 114 | +`dataLiteral` may be encoded either in regular Base64 or in URL-safe Base64; consumers must be able to deal |
| 115 | +with either. |
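
One way to accept both alphabets is to normalize URL-safe input to the regular alphabet before decoding. The sketch below uses a hypothetical helper name and additionally tolerates missing `=` padding, which is an assumption not covered by the text above.

```python
import base64


def decode_data_literal(value: str) -> bytes:
    """Decode a `dataLiteral` given in regular or URL-safe Base64."""
    # Map the URL-safe alphabet ("-", "_") onto the regular one ("+", "/").
    normalized = value.replace("-", "+").replace("_", "/")
    # Assumption: padding may have been stripped, so restore it.
    normalized += "=" * (-len(normalized) % 4)
    # validate=True rejects characters outside the Base64 alphabet.
    return base64.b64decode(normalized, validate=True)
```

For example, `decode_data_literal("+/8=")` and `decode_data_literal("-_8")` yield the same two bytes.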
| 116 | + |
| 117 | +Additional hash algorithms may be defined in future. |
| 118 | + |
| 119 | +Sizes, offsets and timestamps shall use non-negative integer JSON numbers, and may use values above 2^53. (This |
| 120 | +means strictly speaking the format cannot be read by JSON parsers which exclusively use 64-bit floating point |
| 121 | +for JSON numbers, if there are any files of such excessive sizes. Since it's unlikely that files included in |
| 122 | +a UAPI.16 manifest file are this large this should not be a practical limitation.) |
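
The precision issue can be illustrated with Python's `json` module, which parses integers exactly by default. Passing `parse_int=float` simulates a parser that stores every number as a 64-bit float; the value 2^53 + 1 is chosen here purely for illustration.

```python
import json

text = '{"dataSize": 9007199254740993}'  # 2^53 + 1

# Default behaviour: arbitrary-precision integers, no loss.
exact = json.loads(text)["dataSize"]

# Simulated float-only parser: the value is rounded to the nearest double.
lossy = json.loads(text, parse_int=float)["dataSize"]

print(exact)       # 9007199254740993
print(int(lossy))  # 9007199254740992
```

A size that is off by one byte would make the size checks in this specification fail, which is why integer-exact parsing matters for very large values.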
| 123 | + |
| 124 | +## Relationship to SHA256SUMS |
| 125 | + |
| 126 | +A UAPI manifest file that only uses the fields `name` and `sha256` can be mapped 1:1 to a SHA256SUMS file |
| 127 | +and back. |
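
The mapping can be sketched as a pair of conversion functions. Assumptions: the hash field is spelled `sha256` as in the JSON examples, the SHA256SUMS lines use the common `<hash>  <name>` layout with two separating characters, and names needing the backslash escaping of `sha256sum` are not handled. Both function names are hypothetical.

```python
MEDIA_TYPE = "application/vnd.uapi.manifest"


def manifest_to_sha256sums(manifest: dict) -> str:
    """Render a name/sha256-only manifest as SHA256SUMS text."""
    return "".join(
        f"{entry['sha256'].lower()}  {entry['name']}\n"
        for entry in manifest["files"]
    )


def sha256sums_to_manifest(text: str) -> dict:
    """Build a manifest from SHA256SUMS text (no escape handling)."""
    files = []
    for line in text.splitlines():
        if not line:
            continue
        # 64 hex characters, two separator characters, then the file name.
        digest, name = line[:64], line[66:]
        files.append({"name": name, "sha256": digest})
    return {"mediaType": MEDIA_TYPE, "files": files}
```

Converting a manifest to text and back yields the same file entries, provided the hashes were lowercase to begin with.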
| 128 | + |
| 129 | +## Relationship to `mtree(5)` |
| 130 | + |
| 131 | +Manifests following this specification should be mappable 1:1 to |
| 132 | +[`mtree(5)`](https://man.freebsd.org/cgi/man.cgi?mtree(5)) as long as: |
| 133 | + |
| 134 | +1. Only the `name`, `dataSize`, `sha256` fields are used in the files objects, as per this specification. |
| 135 | + |
| 136 | +2. Only the `size`, `sha256` fields are used in the `mtree(5)` files, and `type` is set to |
| 137 | + `file`. Only regular filenames may be specified as path, i.e. no `/` may be included. |
| 138 | + |
| 139 | +## Acquiring a File Remotely |
| 140 | + |
| 141 | +When acquiring a file listed in a UAPI.16 File Manifest from a web service, the following logic should be |
| 142 | +implemented. |
| 143 | + |
| 144 | +1. If `validFromUSec` is set and greater than the current time, the download should immediately fail. |
| 145 | +2. If `validUntilUSec` is set and lower than the current time, the download should immediately fail. |
| 146 | +3. If `dataLiteral` is set, it should be Base64 decoded. Continue in step 7. |
| 147 | +4. Otherwise, if `dataUrl` is set, the data should be acquired from that URL. Continue in step 7. |
| 148 | +5. Otherwise, if `dataFile` is set, the data should be acquired from the same URL as the manifest itself, however with the last component of the URL replaced by the file name encoded in `dataFile`, following the same semantics as HTML relative links. Continue in step 7. |
| 149 | +6. Otherwise, the data should be acquired as in step 5, but the `name` string should be used as file name to suffix the URL with. |
| 150 | +7. If `encodedDataSize` or `dataSize` are set: the size of the downloaded data shall be checked against `encodedDataSize` (if set) or `dataSize` (otherwise). |
| 151 | +8. If `dataEncoding` is set: the downloaded data shall be decoded according to the algorithm indicated in `dataEncoding`. |
| 152 | +9. If `dataSize` is set: the size of the resulting decoded data shall be checked against `dataSize`. |
| 153 | +10. The byte range indicated by `sliceOffset` (if not set: 0) and `sliceSize` (if not set, to the end of the file) shall be extracted from the decoded data. |
| 154 | +11. The hash of the extracted data shall be matched against the hash encoded in `sha256` (if set). |
| 155 | + |
| 156 | +If all these steps succeed the extracted data from step 10 is the result of the operation. |
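
The URL resolution and the post-download checks (steps 7 to 11) can be sketched serially as below. This is not a reference implementation: the function names are hypothetical, only `gzip` is handled, the hash field is assumed to be spelled `sha256` and the encoding field `dataEncoding` as in the JSON examples, and in step 7 the fallback to `dataSize` is only applied when no encoding is declared (an interpretation, since encoded and decoded sizes differ otherwise).

```python
import gzip
import hashlib
from urllib.parse import urljoin


def resolve_data_url(manifest_url: str, entry: dict) -> str:
    """Steps 4 to 6: pick the URL to download from (no percent-encoding)."""
    if "dataUrl" in entry:
        return entry["dataUrl"]
    # Relative-link semantics: replace the last component of the manifest URL.
    return urljoin(manifest_url, entry.get("dataFile", entry["name"]))


def verify_and_extract(entry: dict, raw: bytes) -> bytes:
    """Steps 7 to 11, applied to fully acquired data."""
    encoding = entry.get("dataEncoding")
    # Step 7: size of the data as acquired.
    expected = entry.get("encodedDataSize")
    if expected is None and encoding is None:
        expected = entry.get("dataSize")
    if expected is not None and len(raw) != expected:
        raise ValueError("acquired data has unexpected size")
    # Step 8: decode.
    if encoding is None:
        decoded = raw
    elif encoding == "gzip":
        decoded = gzip.decompress(raw)
    else:
        raise ValueError(f"unsupported encoding: {encoding}")
    # Step 9: size of the decoded data.
    if "dataSize" in entry and len(decoded) != entry["dataSize"]:
        raise ValueError("decoded data has unexpected size")
    # Step 10: extract the slice.
    offset = entry.get("sliceOffset", 0)
    size = entry.get("sliceSize", len(decoded) - offset)
    if offset < 0 or size < 0 or offset + size > len(decoded):
        raise ValueError("slice lies outside of the decoded data")
    data = decoded[offset:offset + size]
    # Step 11: compare hashes case-insensitively.
    digest = entry.get("sha256")
    if digest is not None and hashlib.sha256(data).hexdigest() != digest.lower():
        raise ValueError("hash mismatch")
    return data
```

With a manifest at `https://example.com/images/manifest.json` (a made-up URL), an entry carrying only `"name": "FooOS.raw"` resolves to `https://example.com/images/FooOS.raw`.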
| 157 | + |
| 158 | +Of course, many of the steps described above should typically be done together rather than serially for |
| 159 | +robustness, efficiency and security reasons. For example, if `dataEncoding` is not used, it is recommended to |
| 160 | +translate the `sliceOffset` and `sliceSize` fields into an HTTP range request (in order to avoid |
| 161 | +downloading redundant data). Moreover, downloads should fail immediately once the encoded or decoded data |
| 162 | +goes beyond the indicated encoded or decoded data sizes. Likewise, the decoding should be done on-the-fly while |
| 163 | +the data is downloaded, and the checksum should be calculated on-the-fly too. |
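
The range request optimization might be derived from an entry as follows. The helper name is hypothetical; the header value uses the standard HTTP `bytes=first-last` form with an inclusive last byte position.

```python
from typing import Optional


def range_header_value(entry: dict) -> Optional[str]:
    """Return a value for the HTTP `Range` header, or None if not applicable."""
    # Slices refer to decoded data, so encoded downloads cannot be ranged.
    if entry.get("dataEncoding") is not None:
        return None
    offset = entry.get("sliceOffset", 0)
    size = entry.get("sliceSize")
    if size is None:
        # Open-ended range; pointless when starting at offset zero.
        return f"bytes={offset}-" if offset else None
    if size == 0:
        # An empty slice cannot be expressed as a byte range.
        return None
    return f"bytes={offset}-{offset + size - 1}"
```

A server may ignore the header and return the full body, so a client still has to apply the slice itself when it does not receive a partial response.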
| 164 | + |
| 165 | +## Extensibility |
| 166 | + |
| 167 | +Additional fields can be defined freely by implementors. Each such extension field should be named in the |
| 168 | +style of `x<Vendor>Foobar` to minimize risk of conflicts. Examples: `xAmutableFrobnicator` or |
| 169 | +`xMyProjectWeight`. |
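
A consumer that wants to pick out extension fields could match on the naming style. The regular expression below (a lowercase `x` followed by an uppercase letter) is only one possible reading of the `x<Vendor>Foobar` convention, and the helper name is hypothetical.

```python
import re

# Assumption: extension names start with "x" followed by a capitalized vendor.
EXTENSION_FIELD = re.compile(r"x[A-Z][A-Za-z0-9]*")


def extension_fields(obj: dict) -> dict:
    """Return only the vendor extension fields of a manifest or file object."""
    return {k: v for k, v in obj.items() if EXTENSION_FIELD.fullmatch(k)}
```

Fields that a consumer does not recognize would typically be ignored rather than rejected, so that extensions from other vendors do not break parsing.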
| 170 | + |
| 171 | + |
| 172 | +## Example |
| 173 | + |
| 174 | +```json |
| 175 | +{ |
| 176 | + "mediaType" : "application/vnd.uapi.manifest", |
| 177 | + "files" : [ |
| 178 | + { |
| 179 | + "name" : "FooOS.raw", |
| 180 | + "dataEncoding": "gzip", |
| 181 | + "encodedDataSize": 5642649603, |
| 182 | + "dataSize" : 7523532800, |
| 183 | + "sha256" : "922a9bae0e02b4ffac3e5ed5054230d0689b9c2e25b0178ba82b925f2a0c3e48", |
| 184 | + "validUntilUSec" : 1776856773123234 |
| 185 | + }, |
| 186 | + { |
| 187 | + "name" : "FooOS_esp.raw", |
| 188 | + "dataFile" : "FooOS.raw", |
| 189 | + "dataEncoding": "gzip", |
| 190 | + "encodedDataSize": 5642649603, |
| 191 | + "dataSize" : 7523532800, |
| 192 | + "sliceOffset" : 2097152, |
| 193 | + "sliceSize" : 149175808, |
| 194 | + "gptLabel" : "EFI System Partition", |
| 195 | + "gptTypeUuid" : "c12a7328-f81f-11d2-ba4b-00a0c93ec93b", |
| 196 | + "sha256" : "5dcfd837a4868550cc61c256d9567a974e32a20985afa9e100b8b96755a20cae", |
| 197 | + "validUntilUSec" : 1776856773123234 |
| 198 | + }, |
| 199 | + { |
| 200 | + "name" : "FooOS_root.raw", |
| 201 | + "dataFile" : "FooOS.raw", |
| 202 | + "dataEncoding": "gzip", |
| 203 | + "encodedDataSize": 5642649603, |
| 204 | + "dataSize" : 7523532800, |
| 205 | + "sliceOffset" : 351272960, |
| 206 | + "sliceSize" : 234003200, |
| 207 | + "sha256" : "00b70a0813e15f309828d3a36156283cba87576c26755e0b2d4cf0951eff8163", |
| 208 | + "gptLabel" : "FooOS_root", |
| 209 | + "gptTypeUuid": "4f68bce3-e8cd-4db1-96e7-fbcaf984b709", |
| 210 | + "readOnly": true, |
| 211 | + "validUntilUSec" : 1776856773123234 |
| 212 | + } |
| 213 | + ] |
| 214 | +} |
| 215 | +``` |
| 216 | + |
| 217 | +The above provides three separate files `FooOS.raw`, `FooOS_esp.raw`, `FooOS_root.raw`. The latter two are |
| 218 | +defined as slices of the former, each encapsulating an individual partition. The data file is encoded via |
| 219 | +`gzip`. |