Skip to content

Commit 835e0aa

Browse files
pks-tgitster
authored andcommitted
builtin/pack-objects: reduce lock contention when writing packfile data
When running `git pack-objects --stdout` we feed the data through `hashfd_ext()` with a progress meter and a smaller-than-usual buffer length of 8kB so that we can track throughput more granularly. But as packfiles tend to be on the larger side, this small buffer size may cause a ton of write(3p) syscalls. Originally, the buffer we used in `hashfd()` was 8kB for all use cases. This was changed though in 2ca245f (csum-file.h: increase hashfile buffer size, 2021-05-18) because we noticed that the number of writes can have an impact on performance. So the buffer size was increased to 128kB, which improved performance a bit for some use cases. But the commit didn't touch the buffer size for `hashd_throughput()`. The reasoning here was that callers expect the progress indicator to update frequently, and a larger buffer size would of course reduce the update frequency especially on slow networks. While that is of course true, there was (and still is, even though it's now a call to `hashfd_ext()`) only a single caller of this function in git-pack-objects(1). This command is responsible for writing packfiles, and those packfiles are often on the bigger side. So arguably: - The user won't care about increments of 8kB when packfiles tend to be megabytes or even gigabytes in size. - Reducing the number of syscalls would be even more valuable here than it would be for multi-pack indices, which was the benchmark done in the mentioned commit, as MIDXs are typically significantly smaller than packfiles. - Nowadays, many internet connections should be able to transfer data at a rate significantly higher than 8kB per second. Update the buffer to instead have a size of `LARGE_PACKET_DATA_MAX - 1`, which translates to ~64kB. This limit was chosen because `git pack-objects --stdout` is most often used when sending packfiles via git-upload-pack(1), where packfile data is chunked into pktlines when using the sideband. Furthermore, most internet connections should have a bandwidth signifcantly higher than 64kB/s, so we'd still be able to observe progress updates at a rate of at least once per second. This change significantly reduces the number of write(3p) syscalls from 355,000 to 44,000 when packing the Linux repository. While this results in a small performance improvement on an otherwise-unused system, this improvement is mostly negligible. More importantly though, it will reduce lock contention in the kernel on an extremely busy system where we have many processes writing data at once. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 parent 2bf8f36 commit 835e0aa

File tree

1 file changed

+9
-5
lines changed

1 file changed

+9
-5
lines changed

builtin/pack-objects.c

Lines changed: 9 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -41,6 +41,7 @@
4141
#include "promisor-remote.h"
4242
#include "pack-mtimes.h"
4343
#include "parse-options.h"
44+
#include "pkt-line.h"
4445
#include "blob.h"
4546
#include "tree.h"
4647
#include "path-walk.h"
@@ -1333,14 +1334,17 @@ static void write_pack_file(void)
13331334

13341335
if (pack_to_stdout) {
13351336
/*
1336-
* Since we are expecting to report progress of the
1337-
* write into this hashfile, use a smaller buffer
1338-
* size so the progress indicators arrive at a more
1339-
* frequent rate.
1337+
* This command is most often invoked via
1338+
* git-upload-pack(1), which will typically chunk data
1339+
* into pktlines. As such, we use the maximum data
1340+
* length of them as buffer length.
1341+
*
1342+
* Note that we need to subtract one though to
1343+
* accomodate for the sideband byte.
13401344
*/
13411345
struct hashfd_options opts = {
13421346
.progress = progress_state,
1343-
.buffer_len = 8 * 1024,
1347+
.buffer_len = LARGE_PACKET_DATA_MAX - 1,
13441348
};
13451349
f = hashfd_ext(the_repository->hash_algo, 1,
13461350
"<stdout>", &opts);

0 commit comments

Comments
 (0)