Commit e222cf7

period yeet
1 parent d56dc78 commit e222cf7

4 files changed

Lines changed: 27 additions & 27 deletions

docs/chunking/fixed_size.md

Lines changed: 12 additions & 12 deletions
@@ -6,22 +6,22 @@ title: 'Fixed-Size Chunking'
 
 Fixed-size chunking is currently a proof of concept and is in alpha status.
 It is not recommended for production use.
-Use [CDC Chunking](chunking_cdc.md) instead unless you have a specific reason to use fixed-size chunking.
+Use [CDC Chunking](chunking_cdc.md) instead unless you have a specific reason to use fixed-size chunking
 
 Fixed-size chunking splits files at predictable byte-offset boundaries, with every chunk being exactly the configured size
-(the last chunk may be smaller if the file size is not a multiple of the chunk size).
+(the last chunk may be smaller if the file size is not a multiple of the chunk size)
 
 ## What Fixed-Size Chunking Is
 
 Fixed-size chunking is conceptually simple: the file is divided into equal-sized pieces from start to end.
-Each piece is hashed and stored independently, just like CDC chunks.
+Each piece is hashed and stored independently, just like CDC chunks
 
 Unlike CDC, chunk boundaries do not shift when content is inserted or deleted in the middle of the file.
 Any edit before the end of a chunk changes that chunk's hash entirely, and any insertion or deletion causes all subsequent chunks to shift,
-potentially invalidating a large number of previously stored chunks.
+potentially invalidating a large number of previously stored chunks
 
 This means fixed-size chunking is generally inferior to CDC for files with arbitrary edits.
-Its benefit is only realized in scenarios where the file's write pattern is well-aligned to chunk boundaries.
+Its benefit is only realized in scenarios where the file's write pattern is well-aligned to chunk boundaries
 
 For example, with `fixed_4k` applied to a Minecraft region file:
 
@@ -36,7 +36,7 @@ For example, with `fixed_4k` applied to a Minecraft region file:
 
 Each 4 KiB chunk corresponds to one internal page of the region file.
 When only a few game chunks change between backups, only the corresponding pages are dirtied,
-and the rest of the chunks are identical to those already stored.
+and the rest of the chunks are identical to those already stored
 
 ## Available Algorithms
 
@@ -50,25 +50,25 @@ and the rest of the chunks are identical to those already stored.
 
 The 4KiB chunk size aligns with the internal page structure of Minecraft's Anvil region files (`.mca`).
 In theory, modifying a small number of chunks in the game only dirties a limited number of 4 KiB pages,
-making `fixed_4k` capable of the finest-grained deduplication for region files.
+making `fixed_4k` capable of the finest-grained deduplication for region files
 
 However, `fixed_4k` has serious practical drawbacks:
 
 - extremely high metadata overhead: a 1 GiB file requires roughly 262 144 chunk records
 - poor I/O performance: each chunk requires a separate read-write cycle during backup
 
-Unless the file is very large and only a tiny number of pages change per backup, `fixed_4k` is unlikely to be worth the cost.
+Unless the file is very large and only a tiny number of pages change per backup, `fixed_4k` is unlikely to be worth the cost
 
 ### fixed_32k
 
-A middle-ground option. Metadata overhead is 32× lower than `fixed_4k` but granularity is also much coarser.
+A middle-ground option. Metadata overhead is 32× lower than `fixed_4k` but granularity is also much coarser
 
 ### fixed_128k
 
 The 128 KiB chunk size is well-suited for files that grow by appending data at the end.
-When new data is appended, only the trailing chunks change; all preceding chunks retain the same hash and are reused.
+When new data is appended, only the trailing chunks change; all preceding chunks retain the same hash and are reused
 
-This makes `fixed_128k` a reasonable alternative to CDC for pure append-write files.
+This makes `fixed_128k` a reasonable alternative to CDC for pure append-write files
 
 ## Poor Candidates
 
@@ -81,5 +81,5 @@ Fixed-size chunking is a poor choice for:
 ## No Extra Dependencies
 
 Fixed-size chunking has no additional Python dependency requirements.
-It is available as long as Prime Backup is installed.
+It is available as long as Prime Backup is installed

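Reviewer's note on the `fixed_size.md` text touched above: the behaviour it describes (chunk counts, reuse after an in-place overwrite or an append, and loss of reuse after a mid-file insertion) can be checked with a small sketch. This is not Prime Backup's implementation. The helper names `chunk_count`, `fixed_chunks`, and `reused_chunks` are hypothetical, and SHA-256 from the standard library stands in for the `blake3` chunk hash mentioned in `index.md`.

```python
import hashlib

CHUNK_4K = 4 * 1024  # chunk size assumed for `fixed_4k`


def chunk_count(file_size: int, chunk_size: int) -> int:
    """Number of fixed-size chunks, counting a shorter final chunk."""
    return -(-file_size // chunk_size)  # ceiling division


def fixed_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Split data at fixed byte offsets; only the last chunk may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def reused_chunks(old: bytes, new: bytes, chunk_size: int) -> int:
    """Count chunks of `new` whose hash is already present among chunks of `old`."""
    # SHA-256 is a stand-in here; the docs state that chunk hashes use blake3.
    stored = {hashlib.sha256(c).digest() for c in fixed_chunks(old, chunk_size)}
    return sum(
        hashlib.sha256(c).digest() in stored for c in fixed_chunks(new, chunk_size)
    )


# Sample data: 16 distinct 4 KiB chunks built from a running 4-byte counter.
base = b"".join(i.to_bytes(4, "big") for i in range(16 * 1024))

overwritten = base[:5000] + b"\xff" + base[5001:]  # in-place edit, same length
appended = base + b"x" * 100                       # pure append
inserted = base[:5000] + b"\xff" + base[5000:]     # one byte inserted mid-file

print(chunk_count(2**30, CHUNK_4K))                # chunk records for a 1 GiB file
print(reused_chunks(base, overwritten, CHUNK_4K))  # only the touched chunk is new
print(reused_chunks(base, appended, CHUNK_4K))     # all old chunks are reused
print(reused_chunks(base, inserted, CHUNK_4K))     # chunks after the insert shift
```

With this sample data, the overwrite reuses 15 of 16 chunks, the append reuses all 16 original chunks, and the one-byte insertion reuses only the chunk before the insertion point. By the same arithmetic a 1 GiB file needs 262 144 records at 4 KiB, 32 768 at 32 KiB, and 8 192 at 128 KiB, so in raw chunk count 32 KiB is 8× fewer than 4 KiB. The "32×" figure in the `fixed_32k` paragraph may therefore be worth double-checking, unless it accounts for something other than the number of chunk records.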
docs/chunking/index.md

Lines changed: 11 additions & 11 deletions
@@ -8,10 +8,10 @@ Split large files into smaller chunks for better deduplication across backups
 
 File chunking is a storage strategy where a large file is split into smaller pieces called chunks before being stored.
 Each chunk is hashed and deduplicated independently, so when only part of a large file changes between backups,
-only the modified chunks need to be written anew. The unchanged chunks are reused directly from existing storage.
+only the modified chunks need to be written anew. The unchanged chunks are reused directly from existing storage
 
 In Prime Backup, restoring a chunked file is transparent to users.
-The original file is reconstructed automatically when the backup is read or exported.
+The original file is reconstructed automatically when the backup is read or exported
 
 ## When It Is Applied
 
@@ -25,7 +25,7 @@ A rule matches when both conditions are true:
 - the file size is at least `file_size_threshold`
 - the file path relative to `source_root` matches the rule's `patterns`
 
-If no rule matches, the file is stored as a regular direct blob without chunking.
+If no rule matches, the file is stored as a regular direct blob without chunking
 
 The default configuration is:
 
@@ -45,11 +45,11 @@ The default configuration is:
 ```
 
 Changing these options only affects files newly stored in future backups.
-Existing direct blobs or chunked blobs will not be converted automatically.
+Existing direct blobs or chunked blobs will not be converted automatically
 
 ## How It Is Stored
 
-Prime Backup still creates one blob record for the whole file, but the blob uses the `chunked` storage method instead of `direct`.
+Prime Backup still creates one blob record for the whole file, but the blob uses the `chunked` storage method instead of `direct`
 
 The current implementation works in the following order:
 
@@ -78,13 +78,13 @@ so the implementation groups consecutive chunks into chunk groups and stores two
 +--------+--------+--------+--------+--------+--------+--------+--------+--------+
 ```
 
-This reduces metadata overhead without changing the logical model.
+This reduces metadata overhead without changing the logical model
 
-Chunk hashes and chunk group hashes always use `blake3`, while the whole-file blob hash still follows `backup.hash_method`.
+Chunk hashes and chunk group hashes always use `blake3`, while the whole-file blob hash still follows `backup.hash_method`
 
 ## Compression and Performance
 
-Chunking does not disable compression.
+Chunking does not disable compression
 
 For a chunked blob:
 
@@ -94,7 +94,7 @@ For a chunked blob:
 
 Compared with direct blob storage, chunked storage is slower on the first backup of a file,
 because Prime Backup needs extra work to cut the file, calculate hashes, and process each chunk.
-The benefit becomes apparent on subsequent backups where many chunks can be reused.
+The benefit becomes apparent on subsequent backups where many chunks can be reused
 
 ## Available Algorithms
 
@@ -114,7 +114,7 @@ See the detailed pages for each approach:
 ## Observation
 
 Prime Backup maintenance logic already understands chunked storage.
-You can inspect the effect with `!!pb database overview`, which includes a dedicated chunk statistics section.
+You can inspect the effect with `!!pb database overview`, which includes a dedicated chunk statistics section
 
 If Prime Backup finds that one chunked file produced many brand new chunks in a single backup, it will emit a warning in logs.
-That usually means the file is not a good chunking target, unless this is the first backup containing that file.
+That usually means the file is not a good chunking target, unless this is the first backup containing that file

docs/feature/database_inspect.md

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
 title: 'Database Internal Object Inspection'
 ---
 
-View and inspect object information in the database.
+View and inspect object information in the database
 
 ## Overview
 
@@ -87,7 +87,7 @@ View detailed information for a specific file in a fileset:
 !!pb database inspect file2 <fileset_id> <file_path>
 ```
 
-Displays the same content as above.
+Displays the same content as above
 
 ## Fileset Inspection
 
@@ -129,7 +129,7 @@ View complete information for a specific blob:
 !!pb database inspect blob <hash>
 ```
 
-The parameter `<hash>` can be a prefix of the complete hash string, as long as it uniquely identifies the object.
+The parameter `<hash>` can be a prefix of the complete hash string, as long as it uniquely identifies the object
 
 Example:

docs/feature/database_operation.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: 'Database Operations'
 ---
 
-Configuration migration and data modification operations.
+Configuration migration and data modification operations
 
 ## Hash Algorithm Migration
