Commit b4dfeff

Merge branch 'tb/incremental-midx-part-3.3' into seen
The repacking code has been refactored, compaction of MIDX layers has
been implemented, and an incremental strategy that does not require
all-into-one repacking has been introduced.

* tb/incremental-midx-part-3.3:
  repack: allow `--write-midx=incremental` without `--geometric`
  repack: introduce `--write-midx=incremental`
  repack: implement incremental MIDX repacking
  packfile: ensure `close_pack_revindex()` frees in-memory revindex
  builtin/repack.c: convert `--write-midx` to an `OPT_CALLBACK`
  repack-geometry: prepare for incremental MIDX repacking
  repack-midx: extract `repack_fill_midx_stdin_packs()`
  repack-midx: factor out `repack_prepare_midx_command()`
  midx: expose `midx_layer_contains_pack()`
  repack: track the ODB source via existing_packs
  midx: support custom `--base` for incremental MIDX writes
  midx: introduce `--checksum-only` for incremental MIDX writes
  midx: use `strvec` for `keep_hashes`
  strvec: introduce `strvec_init_alloc()`
  midx: use `string_list` for retained MIDX files
  midx-write: handle noop writes when converting incremental chains
2 parents 8d46ad4 + 1b17f64 commit b4dfeff

19 files changed

+1792
-156
lines changed

Documentation/config/repack.adoc

Lines changed: 18 additions & 0 deletions
@@ -46,3 +46,21 @@ repack.midxMustContainCruft::
 	`--write-midx`. When false, cruft packs are only included in the MIDX
 	when necessary (e.g., because they might be required to form a
 	reachability closure with MIDX bitmaps). Defaults to true.
+
+repack.midxSplitFactor::
+	The factor used in the geometric merging condition when
+	compacting incremental MIDX layers during `git repack` when
+	invoked with the `--write-midx=incremental` option.
++
+Adjacent layers are merged when the accumulated object count of the
+newer layer exceeds `1/<N>` of the object count of the next deeper
+layer. Defaults to 2.
+
+repack.midxNewLayerThreshold::
+	The minimum number of packs in the tip MIDX layer before those
+	packs are considered as candidates for geometric repacking
+	during `git repack --write-midx=incremental`.
++
+When the tip layer has fewer packs than this threshold, those packs are
+excluded from the geometric repack entirely, and are thus left
+unmodified. Defaults to 8.
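
Reader's note (illustrative, not part of the commit): the two conditions documented above can be written as small predicates. This is a minimal sketch based only on the wording of the documentation. The function and macro names are hypothetical, and the actual implementation in `git repack` may evaluate the conditions differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the values mirror the documented defaults. */
#define SKETCH_MIDX_SPLIT_FACTOR_DEFAULT 2
#define SKETCH_MIDX_NEW_LAYER_THRESHOLD_DEFAULT 8

/*
 * Returns 1 when the newer layer's accumulated object count exceeds
 * 1/split_factor of the next deeper layer's object count. The test is
 * written as a multiplication to avoid integer division; it assumes
 * the product does not overflow 64 bits.
 */
int layers_should_merge(uint64_t newer_objects, uint64_t deeper_objects,
			uint64_t split_factor)
{
	return newer_objects * split_factor > deeper_objects;
}

/*
 * Returns 1 when the tip layer holds at least `threshold` packs, i.e.
 * when its packs are candidates for the geometric repack.
 */
int tip_packs_are_candidates(size_t tip_pack_count, size_t threshold)
{
	return tip_pack_count >= threshold;
}
```

With the default factor of 2, a newer layer of 600 objects sitting above a deeper layer of 1000 objects satisfies the merge condition, while a newer layer of exactly 500 objects does not, because the documentation says "exceeds".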

Documentation/git-multi-pack-index.adoc

Lines changed: 17 additions & 2 deletions
@@ -11,9 +11,11 @@ SYNOPSIS
 [verse]
 'git multi-pack-index' [<options>] write [--preferred-pack=<pack>]
 	[--[no-]bitmap] [--[no-]incremental] [--[no-]stdin-packs]
-	[--refs-snapshot=<path>]
+	[--refs-snapshot=<path>] [--[no-]checksum-only]
+	[--base=<checksum>]
 'git multi-pack-index' [<options>] compact [--[no-]incremental]
-	[--[no-]bitmap] <from> <to>
+	[--[no-]bitmap] [--base=<checksum>] [--[no-]checksum-only]
+	<from> <to>
 'git multi-pack-index' [<options>] verify
 'git multi-pack-index' [<options>] expire
 'git multi-pack-index' [<options>] repack [--batch-size=<size>]
@@ -83,6 +85,13 @@ marker).
 		and packs not present in an existing MIDX layer.
 		Migrates non-incremental MIDXs to incremental ones when
 		necessary.
+
+	--base=<checksum>::
+		Specify the checksum of an existing MIDX layer to use
+		as the base when writing a new incremental layer.
+		The special value `none` indicates that the new layer
+		should have no base (i.e., it becomes a root layer).
+		Requires `--checksum-only`.
 --
 
 compact::
@@ -97,6 +106,12 @@ compact::
 
 	--[no-]bitmap::
 		Control whether or not a multi-pack bitmap is written.
+
+	--base=<checksum>::
+		Specify the checksum of an existing MIDX layer to use
+		as the base for the compacted result, instead of using
+		the immediate parent of `<from>`. The special value
+		`none` indicates that the result should have no base.
 --
 
 verify::
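
Reader's note (illustrative, not part of the commit): the documentation above distinguishes three cases for `--base`: not given, the special value `none`, and the checksum of an existing layer. A minimal sketch of that classification follows. The enum and function names are hypothetical, and the check that a checksum consists only of hexadecimal digits is an assumption of this sketch; the documentation does not say how the value is validated.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a --base argument. */
enum sketch_midx_base_kind {
	SKETCH_BASE_UNSPECIFIED, /* no --base given: keep the default base */
	SKETCH_BASE_NONE,        /* --base=none: the result has no base */
	SKETCH_BASE_CHECKSUM,    /* --base=<checksum>: use the named layer */
	SKETCH_BASE_INVALID      /* neither "none" nor a hex string */
};

enum sketch_midx_base_kind classify_midx_base(const char *arg)
{
	size_t i;

	if (!arg)
		return SKETCH_BASE_UNSPECIFIED;
	if (!strcmp(arg, "none"))
		return SKETCH_BASE_NONE;
	if (!*arg)
		return SKETCH_BASE_INVALID;
	/* Assumption: a checksum is spelled with hexadecimal digits only. */
	for (i = 0; arg[i]; i++)
		if (!isxdigit((unsigned char)arg[i]))
			return SKETCH_BASE_INVALID;
	return SKETCH_BASE_CHECKSUM;
}
```

Whether a well-formed checksum actually names a layer in the chain can only be decided against the repository, so this sketch stops at the syntactic level.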

Documentation/git-repack.adoc

Lines changed: 41 additions & 3 deletions
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m]
 	[--window=<n>] [--depth=<n>] [--threads=<n>] [--keep-pack=<pack-name>]
-	[--write-midx] [--name-hash-version=<n>] [--path-walk]
+	[--write-midx[=<mode>]] [--name-hash-version=<n>] [--path-walk]
 
 DESCRIPTION
 -----------
@@ -250,9 +250,47 @@ pack as the preferred pack for object selection by the MIDX (see
 	linkgit:git-multi-pack-index[1]).
 
 -m::
---write-midx::
+--write-midx[=<mode>]::
 	Write a multi-pack index (see linkgit:git-multi-pack-index[1])
-	containing the non-redundant packs.
+	containing the non-redundant packs. The following modes are
+	available:
++
+--
+	`default`;;
+		Write a single MIDX covering all packs. This is the
+		default when `--write-midx` is given without an
+		explicit mode.
+
+	`incremental`;;
+		Write an incremental MIDX chain instead of a single
+		flat MIDX.
++
+Without `--geometric`, a new MIDX layer is appended to the existing
+chain (or a new chain is started) containing whatever packs were written
+by the repack. Existing layers are preserved as-is.
++
+When combined with `--geometric`, the incremental mode maintains a chain
+of MIDX layers that is compacted over time using a geometric merging
+strategy. Each repack creates a new tip layer containing the newly
+written pack(s). Adjacent layers are then merged whenever the newer
+layer's object count exceeds `1/repack.midxSplitFactor` of the next
+deeper layer's count. Layers that do not meet this condition are
+retained as-is.
++
+The result is that newer (tip) layers tend to contain many small packs
+with relatively few objects, while older (deeper) layers contain fewer,
+larger packs covering more objects. Because compaction is driven by the
+tip of the chain, newer layers are also rewritten more frequently than
+older ones, which are only touched when enough objects have accumulated
+to justify merging into them. This keeps the total number of layers
+logarithmic relative to the total number of objects.
++
+Only packs in the tip MIDX layer are considered as candidates for the
+geometric repack; packs in deeper layers are left untouched. If the tip
+layer contains fewer packs than `repack.midxNewLayerThreshold`, those
+packs are excluded from the geometry entirely, and a new layer is
+created for any new pack(s) without disturbing the existing chain.
+--
 
 --name-hash-version=<n>::
 	Provide this argument to the underlying `git pack-objects` process.
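
Reader's note (illustrative, not part of the commit): the geometric merging strategy described above can be simulated on a plain array of per-layer object counts. The sketch below is one possible reading of the documentation: starting at the tip, it keeps folding the accumulated count into the next deeper layer for as long as the accumulated count exceeds `1/split_factor` of that layer. The function name and the array representation are hypothetical, and the real code operates on MIDX layers and packs rather than on counts.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * objects[0] is the deepest (root) layer and objects[nr - 1] is the
 * tip. Merges layers in place from the tip downwards and returns the
 * number of layers that remain. Assumes the counts are small enough
 * that the multiplication does not overflow 64 bits.
 */
size_t compact_midx_chain(uint64_t *objects, size_t nr, uint64_t split_factor)
{
	uint64_t accumulated;

	if (!nr)
		return 0;

	accumulated = objects[nr - 1];
	while (nr > 1 && accumulated * split_factor > objects[nr - 2]) {
		/* The newer layer is too large relative to its parent: merge. */
		accumulated += objects[nr - 2];
		nr--;
	}
	objects[nr - 1] = accumulated;
	return nr;
}
```

For example, with a factor of 2 the chain `{1000, 100, 60}` becomes `{1000, 160}`: the tip (60) exceeds half of 100 and is merged, but the merged layer (160) does not exceed half of 1000, so the root is left untouched. This matches the documented property that deeper layers are rewritten only once enough objects have accumulated above them.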

builtin/multi-pack-index.c

Lines changed: 44 additions & 4 deletions
@@ -16,11 +16,13 @@
 #define BUILTIN_MIDX_WRITE_USAGE \
 	N_("git multi-pack-index [<options>] write [--preferred-pack=<pack>]\n" \
 	   " [--[no-]bitmap] [--[no-]incremental] [--[no-]stdin-packs]\n" \
-	   " [--refs-snapshot=<path>]")
+	   " [--refs-snapshot=<path>] [--[no-]checksum-only]\n" \
+	   " [--base=<checksum>]")
 
 #define BUILTIN_MIDX_COMPACT_USAGE \
 	N_("git multi-pack-index [<options>] compact [--[no-]incremental]\n" \
-	   " [--[no-]bitmap] <from> <to>")
+	   " [--[no-]bitmap] [--base=<checksum>] [--[no-]checksum-only]\n" \
+	   " <from> <to>")
 
 #define BUILTIN_MIDX_VERIFY_USAGE \
 	N_("git multi-pack-index [<options>] verify")
@@ -63,6 +65,7 @@ static char const * const builtin_multi_pack_index_usage[] = {
 static struct opts_multi_pack_index {
 	char *object_dir;
 	const char *preferred_pack;
+	const char *incremental_base;
 	char *refs_snapshot;
 	unsigned long batch_size;
 	unsigned flags;
@@ -151,8 +154,13 @@ static int cmd_multi_pack_index_write(int argc, const char **argv,
 			   N_("pack for reuse when computing a multi-pack bitmap")),
 		OPT_BIT(0, "bitmap", &opts.flags, N_("write multi-pack bitmap"),
 			MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX),
+		OPT_STRING(0, "base", &opts.incremental_base, N_("checksum"),
+			   N_("base MIDX for incremental writes")),
 		OPT_BIT(0, "incremental", &opts.flags,
 			N_("write a new incremental MIDX"), MIDX_WRITE_INCREMENTAL),
+		OPT_BIT(0, "checksum-only", &opts.flags,
+			N_("write a MIDX layer without updating the MIDX chain"),
+			MIDX_WRITE_CHECKSUM_ONLY),
 		OPT_BOOL(0, "stdin-packs", &opts.stdin_packs,
 			 N_("write multi-pack index containing only given indexes")),
 		OPT_FILENAME(0, "refs-snapshot", &opts.refs_snapshot,
@@ -178,6 +186,22 @@ static int cmd_multi_pack_index_write(int argc, const char **argv,
 	if (argc)
 		usage_with_options(builtin_multi_pack_index_write_usage,
 				   options);
+
+	if (opts.flags & MIDX_WRITE_CHECKSUM_ONLY &&
+	    !(opts.flags & MIDX_WRITE_INCREMENTAL)) {
+		error(_("cannot use %s without %s"),
+		      "--checksum-only", "--incremental");
+		usage_with_options(builtin_multi_pack_index_write_usage,
+				   options);
+	}
+
+	if (opts.incremental_base &&
+	    !(opts.flags & MIDX_WRITE_CHECKSUM_ONLY)) {
+		error(_("cannot use --base without --checksum-only"));
+		usage_with_options(builtin_multi_pack_index_write_usage,
+				   options);
+	}
+
 	source = handle_object_dir_option(repo);
 
 	FREE_AND_NULL(options);
@@ -189,7 +213,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv,
 
 	ret = write_midx_file_only(source, &packs,
 				   opts.preferred_pack,
-				   opts.refs_snapshot, opts.flags);
+				   opts.refs_snapshot,
+				   opts.incremental_base, opts.flags);
 
 	string_list_clear(&packs, 0);
 	free(opts.refs_snapshot);
@@ -217,10 +242,15 @@ static int cmd_multi_pack_index_compact(int argc, const char **argv,
 
 	struct option *options;
 	static struct option builtin_multi_pack_index_compact_options[] = {
+		OPT_STRING(0, "base", &opts.incremental_base, N_("checksum"),
+			   N_("base MIDX for incremental writes")),
 		OPT_BIT(0, "bitmap", &opts.flags, N_("write multi-pack bitmap"),
 			MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX),
 		OPT_BIT(0, "incremental", &opts.flags,
 			N_("write a new incremental MIDX"), MIDX_WRITE_INCREMENTAL),
+		OPT_BIT(0, "checksum-only", &opts.flags,
+			N_("write a MIDX layer without updating the MIDX chain"),
+			MIDX_WRITE_CHECKSUM_ONLY),
 		OPT_END(),
 	};
 
@@ -239,6 +269,15 @@ static int cmd_multi_pack_index_compact(int argc, const char **argv,
 	if (argc != 2)
 		usage_with_options(builtin_multi_pack_index_compact_usage,
 				   options);
+
+	if (opts.flags & MIDX_WRITE_CHECKSUM_ONLY &&
+	    !(opts.flags & MIDX_WRITE_INCREMENTAL)) {
+		error(_("cannot use %s without %s"),
+		      "--checksum-only", "--incremental");
+		usage_with_options(builtin_multi_pack_index_compact_usage,
+				   options);
+	}
+
 	source = handle_object_dir_option(the_repository);
 
 	FREE_AND_NULL(options);
@@ -266,7 +305,8 @@ static int cmd_multi_pack_index_compact(int argc, const char **argv,
 		die(_("MIDX %s must be an ancestor of %s"), argv[0], argv[1]);
 	}
 
-	ret = write_midx_file_compact(source, from_midx, to_midx, opts.flags);
+	ret = write_midx_file_compact(source, from_midx, to_midx,
+				      opts.incremental_base, opts.flags);
 
 	return ret;
 }
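
Reader's note (illustrative, not part of the commit): the option dependencies that the `write` subcommand enforces in the hunks above (`--checksum-only` needs `--incremental`, and `--base` needs `--checksum-only`) can be isolated into a pure function, which makes the accepted combinations easy to enumerate. This is a sketch only. The function name and the `SKETCH_*` bit values are hypothetical stand-ins for the real `MIDX_WRITE_*` flags, whose definitions are not shown in this diff.

```c
#include <stddef.h>

/* Hypothetical bit values standing in for the real MIDX_WRITE_* flags. */
#define SKETCH_WRITE_INCREMENTAL   (1u << 0)
#define SKETCH_WRITE_CHECKSUM_ONLY (1u << 1)

/*
 * Returns NULL when the combination is accepted, or a message naming
 * the violated dependency. The checks run in the same order as in the
 * `write` subcommand shown in the diff.
 */
const char *check_midx_write_options(unsigned flags, const char *base)
{
	if ((flags & SKETCH_WRITE_CHECKSUM_ONLY) &&
	    !(flags & SKETCH_WRITE_INCREMENTAL))
		return "cannot use --checksum-only without --incremental";
	if (base && !(flags & SKETCH_WRITE_CHECKSUM_ONLY))
		return "cannot use --base without --checksum-only";
	return NULL;
}
```

Note that in the diff the `compact` subcommand performs only the first of these two checks, so this sketch models `write` specifically.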
