Commit 13fa623

Merge pull request #9804 from ThomasWaldmann/interals-docs-update
docs: update / fix "internals" section
2 parents dd67bf3 + 66d010c commit 13fa623

2 files changed

Lines changed: 58 additions & 18 deletions

docs/internals/data-structures.rst

Lines changed: 18 additions & 6 deletions
@@ -51,9 +51,14 @@ data/
 0000... .. ffff...

 keys/
-When using encryption in repokey mode, the encrypted, passphrase protected
-key is stored here as a base64 encoded text. The sha256 content hash is
-used for the name.
+When using repokey mode, the encrypted, passphrase protected borg keys are
+stored here as a base64 encoded text. The sha256 content hash of the
+stored borg key is used for the name.
+
+A repository may contain *multiple* such borg keys (one per passphrase) to
+support the :ref:`multiple borg keys <borgcrypto_multiple_keys>` feature.
+keyfile and repokey borg keys use the same format and naming (only the
+storage location differs).

 locks/
 used by the locking system to manage shared and exclusive locks.
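
Reviewer note: the naming scheme described in this hunk (a base64-encoded borg key stored under the sha256 of its content) can be sketched as a small content-addressed store. This is only an illustration, not borg's code: ``key_name`` and ``store_key`` are hypothetical helpers, and hashing the base64 text rather than the raw bytes is an assumption.

```python
import base64
import hashlib


def key_name(stored_key: bytes) -> str:
    # Assumption: the name is the hex sha256 of the stored (base64) content.
    return hashlib.sha256(stored_key).hexdigest()


def store_key(keys: dict, encrypted_key: bytes) -> str:
    """Store one encrypted borg key under its content hash; return the name."""
    blob = base64.b64encode(encrypted_key)  # base64 text, as in the docs
    name = key_name(blob)
    keys[name] = blob  # the same content always maps to the same name
    return name


# A dict stands in for the keys/ namespace (hypothetical stand-in).
keys_namespace = {}
name_a = store_key(keys_namespace, b"encrypted key for passphrase A")
name_b = store_key(keys_namespace, b"encrypted key for passphrase B")
```

Because the name depends only on the content, several borg keys (one per passphrase) can coexist without a naming convention, and storing the same key twice does not create a second entry.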
@@ -67,7 +72,10 @@ byte strings of fixed length (256-bit, 32 bytes), computed like this::

 key = id = id_hash(plaintext_data) # plain = not encrypted, not compressed, not obfuscated

-The id_hash function depends on the :ref:`encryption mode <borg_repo-create>`.
+The id_hash function is selected via ``borg repo-create --id-hash`` (independently
+of ``--encryption``). For encrypted repositories it is a keyed MAC over the
+plaintext (keyed by ``id_key``): ``sha256`` selects HMAC-SHA256, ``blake3``
+selects a keyed BLAKE3. The unencrypted ``none`` mode uses a plain ``sha256``.

 As the id / key is used for deduplication, id_hash must be a cryptographically
 strong hash or MAC.
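
Reviewer note: a minimal sketch of the two ``sha256``-based variants described in this hunk, using only Python's standard library. The function names are hypothetical, the keyed BLAKE3 variant is omitted because it needs a third-party package, and how borg feeds ``id_key`` into the MAC internally is not shown in this diff.

```python
import hashlib
import hmac


def id_hash_plain(plaintext: bytes) -> bytes:
    # Unencrypted "none" mode: plain SHA-256 over the plaintext.
    return hashlib.sha256(plaintext).digest()


def id_hash_hmac_sha256(id_key: bytes, plaintext: bytes) -> bytes:
    # Encrypted repos with --id-hash sha256: HMAC-SHA256 keyed by id_key.
    return hmac.new(id_key, plaintext, hashlib.sha256).digest()


# Deduplication: identical plaintext yields the identical 32-byte id,
# so a chunk is stored only once.
id_key = bytes(32)  # placeholder key for illustration only
store = {}
for chunk in (b"hello", b"world", b"hello"):
    store.setdefault(id_hash_hmac_sha256(id_key, chunk), chunk)
```

With the keyed variant, ids cannot be recomputed from known plaintext without ``id_key``, which is why a MAC rather than a plain hash is used for encrypted repositories.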
@@ -718,11 +726,15 @@ Both modes

 Encryption keys (and other secrets) are kept either in the keys directory on
 the client ('keyfile' mode) or under the keys/ namespace in the repository
-('repokey' mode) using the sha256 of the file content as the name.
+('repokey' mode) using the sha256 of the borg key content as the name.

 In both cases, the secrets are generated from random and then encrypted by a
 key derived from your passphrase (this happens on the client before the key
-is stored into the keyfile or as repokey).
+is stored as keyfile or repokey).
+
+keyfile and repokey borg keys use the **same** format; only the storage location
+differs. Borg finds the correct key by trying each key against the supplied
+passphrase. See :ref:`borgcrypto_multiple_keys`.

 The passphrase is passed through the ``BORG_PASSPHRASE`` environment variable
 or prompted for interactive usage.

docs/internals/security.rst

Lines changed: 40 additions & 12 deletions
@@ -116,22 +116,27 @@ Encryption
 AEAD modes
 ~~~~~~~~~~

-Modes: --encryption (repokey|keyfile)-[blake2-](aes-ocb|chacha20-poly1305)
+Modes: ``--encryption (aes256-ocb|chacha20-poly1305)`` plus
+``--id-hash (sha256|blake3)``

 Supported: borg 2.0+

+The cipher is selected by ``--encryption`` (see :ref:`borg_repo-create`), the
+key storage location (repokey or keyfile) by ``--key-location``, and the chunk
+ID hash function by ``--id-hash``; these three are orthogonal.
+
 Encryption with these modes is based on AEAD ciphers (authenticated encryption
 with associated data) and session keys.

-Depending on the chosen mode (see :ref:`borg_repo-create`) different AEAD ciphers are used:
+Depending on the chosen mode different AEAD ciphers are used:

 - AES-256-OCB - super fast, single-pass algorithm IF you have hw accelerated AES.
 - chacha20-poly1305 - very fast, purely software based AEAD cipher.

 The chunk ID is derived via a MAC over the plaintext (mac key taken from borg key):

-- HMAC-SHA256 - super fast IF you have hw accelerated SHA256 (see section "Encryption" below).
-- Blake2b - very fast, purely software based algorithm.
+- HMAC-SHA256 (``--id-hash sha256``) - super fast IF you have hw accelerated SHA256 (see section "Encryption" below).
+- keyed BLAKE3 (``--id-hash blake3``) - very fast, purely software based algorithm.

 For each borg invocation, a new session id is generated by `os.urandom`_.
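
Reviewer note: the hunk above says the cipher, the key storage location and the id hash are orthogonal. A short sketch makes that concrete by enumerating every combination of the option values named in this diff. The option values come from the text; the argument lists are illustrative fragments, not complete invocations (for example, no repository is specified), and ``repo_create_variants`` is a hypothetical helper.

```python
from itertools import product

# Option values as named in the diff above.
ENCRYPTION = ("aes256-ocb", "chacha20-poly1305")
KEY_LOCATION = ("repokey", "keyfile")
ID_HASH = ("sha256", "blake3")


def repo_create_variants():
    """Return one illustrative argument list per option combination."""
    return [
        ["borg", "repo-create",
         "--encryption", enc,
         "--key-location", loc,
         "--id-hash", id_hash]
        for enc, loc, id_hash in product(ENCRYPTION, KEY_LOCATION, ID_HASH)
    ]


variants = repo_create_variants()
for argv in variants:
    print(" ".join(argv))
```

Two ciphers, two key locations and two id hashes give eight independent combinations, where the removed mode string encoded all three choices in a single ``--encryption`` value.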

@@ -177,8 +182,8 @@ Decryption::
 Notable:

 - More modern and often faster AEAD ciphers instead of self-assembled stuff.
-- Due to the usage of session keys, IVs (nonces) do not need special care here as
-they did for the legacy encryption modes.
+- Due to the usage of session keys, which just start at 0 per session, IVs (nonces)
+do not need long-term special care here as they did for the legacy encryption modes.
 - The id is now also input into the authentication tag computation.
 This strongly associates the id with the written data (== associates the key with
 the value). When later reading the data for some id, authentication will only
@@ -188,11 +193,14 @@ Notable:
 Legacy modes
 ~~~~~~~~~~~~

-Modes: --encryption (repokey|keyfile)-[blake2]
+Modes: ``--encryption (repokey|keyfile)[-blake2]``

 Supported: borg < 2.0

-These were the AES-CTR based modes in previous borg versions.
+These were the AES-CTR based modes in previous borg versions, with the chunk ID
+derived via HMAC-SHA256 or (in the ``-blake2`` variants) Blake2b. ``blake2b`` is
+only used by these legacy modes; new repositories use ``sha256`` or ``blake3``
+(see above).

 borg 2.0 does not support creating new repos using these modes,
 but ``borg transfer`` can still read such existing repos.
@@ -215,13 +223,30 @@ to Encrypt-*then*-MAC a packed representation of the keys using the
 chacha20-poly1305 AEAD cipher and a constant IV == 0.
 The ciphertext is then converted to base64.

-This base64 blob (commonly referred to as *keyblob*) is then stored in
-the key file or in the repository config (keyfile and repokey modes
-respectively).
+This base64-encoded *borg key* is then stored in the key file or under the
+repository's ``keys/`` namespace (keyfile and repokey modes respectively), named
+by the sha256 of its content.

 The use of a constant IV is secure because an identical passphrase will
 result in a different derived KEK for every key encryption due to the salt.

+.. _borgcrypto_multiple_keys:
+
+Multiple borg keys
+~~~~~~~~~~~~~~~~~~
+
+A repository (or a client-side keyfile directory) may hold *multiple* borg keys,
+each encrypted with its own passphrase but all wrapping the **same** underlying
+key material. This lets several people access a shared repository with
+independent passphrases, without sharing one secret. Or you can add borg keys
+for redundant, more fault-tolerant storage.
+
+keyfile and repokey borg keys use the same format and the same sha256-content
+naming; borg locates a borg key independently of its key type byte and tries each
+available one against the supplied passphrase until one decrypts. A borg key may
+carry a label for management. The constant-IV argument above still holds, because
+each borg key has its own random argon2 salt and therefore a distinct derived KEK.
+

 .. seealso::
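
Reviewer note: the "try each borg key against the supplied passphrase" behaviour and the per-key salt argument from this hunk can be sketched with the standard library alone. Everything here is a stand-in and labeled as such: PBKDF2 replaces argon2, a toy XOR-plus-HMAC construction replaces chacha20-poly1305, and ``wrap``, ``unwrap`` and ``find_key`` are hypothetical names. It is not suitable for real use and is not borg's implementation.

```python
import hashlib
import hmac
import os


def derive_kek(passphrase: str, salt: bytes) -> bytes:
    # Stand-in KDF: the docs name argon2; PBKDF2 is used only because it is in the stdlib.
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 50_000, dklen=32)


def _subkey(kek: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC subkeys from the KEK.
    return hmac.new(kek, label, hashlib.sha256).digest()


def wrap(master: bytes, passphrase: str) -> dict:
    """Toy encrypt-then-MAC of a 32-byte master key (stand-in for the AEAD)."""
    if len(master) != 32:
        raise ValueError("this toy sketch only wraps 32-byte keys")
    salt = os.urandom(16)  # fresh random salt per borg key -> distinct KEK
    kek = derive_kek(passphrase, salt)
    ct = bytes(a ^ b for a, b in zip(master, _subkey(kek, b"enc")))
    tag = hmac.new(_subkey(kek, b"mac"), ct, hashlib.sha256).digest()
    return {"salt": salt, "ct": ct, "tag": tag}


def unwrap(blob: dict, passphrase: str):
    """Return the master key, or None if the passphrase does not fit this blob."""
    kek = derive_kek(passphrase, blob["salt"])
    expected = hmac.new(_subkey(kek, b"mac"), blob["ct"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, blob["tag"]):
        return None
    return bytes(a ^ b for a, b in zip(blob["ct"], _subkey(kek, b"enc")))


def find_key(blobs, passphrase: str):
    # Try each available borg key until one authenticates.
    for blob in blobs:
        master = unwrap(blob, passphrase)
        if master is not None:
            return master
    return None


master_key = os.urandom(32)
blobs = [wrap(master_key, "alice-passphrase"), wrap(master_key, "bob-passphrase")]
```

Both blobs wrap the same key material, yet each passphrase only opens its own blob. Wrapping twice with the same passphrase still produces different ciphertexts, because each wrap draws a new salt and therefore a new KEK, which is the reason the constant IV does not repeat under one key.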

@@ -239,13 +264,16 @@ on widely used libraries providing them:
 We think this is not an additional risk, since we don't ever
 use OpenSSL's networking, TLS or X.509 code, but only their
 primitives implemented in libcrypto.
-- SHA-256, SHA-512 and BLAKE2b from Python's hashlib_ standard library module are used.
+- SHA-256 and SHA-512 from Python's hashlib_ standard library module are used.
+- BLAKE3 is used via the blake3_ package (new repos, ``--id-hash blake3``).
+- BLAKE2b from Python's hashlib_ is only used to read legacy (borg < 2.0) repos.
 - HMAC and a constant-time comparison from Python's hmac_ standard library module are used.
 - argon2 is used via argon2-cffi.

 .. _Horton principle: https://en.wikipedia.org/wiki/Horton_Principle
 .. _length extension: https://en.wikipedia.org/wiki/Length_extension_attack
 .. _hashlib: https://docs.python.org/3/library/hashlib.html
+.. _blake3: https://pypi.org/project/blake3/
 .. _hmac: https://docs.python.org/3/library/hmac.html
 .. _os.urandom: https://docs.python.org/3/library/os.html#os.urandom
