From f4eff7116dabac7e4f5a6aad8a74feaebdcb8342 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 11 Feb 2026 13:44:59 +0100 Subject: [PATCH] builtin/pack-objects: don't fetch objects when merging packs The "--stdin-packs" option can be used to merge objects from multiple packfiles given via stdin into a new packfile. One big upside of this option is that we don't have to perform a complete rev walk to enumerate objects. Instead, we can simply enumerate all objects that are part of the specified packfiles, which can be significantly faster in very large repositories. There is one downside though: when we don't perform a rev walk we also don't have a good way to learn about the respective object's names. As a consequence, we cannot use the name hashes as a heuristic to get better delta selection. We try to offset this downside though by performing a localized rev walk: we queue all objects that we're about to repack as interesting, and all objects from excluded packfiles as uninteresting. We then perform a best-effort rev walk that allows us to fill in object names. There is one gotcha here though: when "--exclude-promisor-objects" has not been given we will perform backfill fetches for any promised objects that are missing. This used to not be an issue though as this option was mutually exclusive with "--stdin-packs". But that has changed recently, and starting with dcc9c7ef47 (builtin/repack: handle promisor packs with geometric repacking, 2026-01-05) we will now repack promisor packs during geometric compaction. The consequence is that a geometric repack may now perform a bunch of backfill fetches. We of course cannot pass "--exclude-promisor-objects" to fix this issue -- after all, the whole intent is to repack objects part of a promisor pack. But arguably we don't have to: the rev walk is intended as best effort, and we already configure it to ignore missing links to other objects. So we can adapt the walk to unconditionally disable fetching any missing objects. Do so and add a test that verifies we don't backfill any objects. Reported-by: Lukas Wanko Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/pack-objects.c | 10 ++++++++++ t/t5331-pack-objects-stdin.sh | 18 ++++++++++++++++++ 2 files changed, 28 insertions(+) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 5846b6a293..0a4a358096 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3927,8 +3927,16 @@ static void add_unreachable_loose_objects(struct rev_info *revs); static void read_stdin_packs(enum stdin_packs_mode mode, int rev_list_unpacked) { + int prev_fetch_if_missing = fetch_if_missing; struct rev_info revs; + /* + * The revision walk may hit objects that are promised, only. As the + * walk is best-effort though we don't want to perform backfill fetches + * for them. + */ + fetch_if_missing = 0; + repo_init_revisions(the_repository, &revs, NULL); /* * Use a revision walk to fill in the namehash of objects in the include @@ -3964,6 +3972,8 @@ static void read_stdin_packs(enum stdin_packs_mode mode, int rev_list_unpacked) stdin_packs_found_nr); trace2_data_intmax("pack-objects", the_repository, "stdin_packs_hints", stdin_packs_hints_nr); + + fetch_if_missing = prev_fetch_if_missing; } static void add_cruft_object_entry(const struct object_id *oid, enum object_type type, diff --git a/t/t5331-pack-objects-stdin.sh b/t/t5331-pack-objects-stdin.sh index cd949025b9..c3bbc76b0d 100755 --- a/t/t5331-pack-objects-stdin.sh +++ b/t/t5331-pack-objects-stdin.sh @@ -358,6 +358,24 @@ test_expect_success '--stdin-packs with promisors' ' ) ' +test_expect_success '--stdin-packs does not perform backfill fetch' ' + test_when_finished "rm -rf remote client" && + + git init remote && + test_commit_bulk -C remote 10 && + git -C remote config set --local uploadpack.allowfilter 1 && + git -C remote config set --local uploadpack.allowanysha1inwant 1 && + + git clone --filter=tree:0 "file://$(pwd)/remote" client && + ( + cd client && + ls .git/objects/pack/*.promisor | sed "s|.*/||; s/\.promisor$/.pack/" >packs && + test_line_count -gt 1 packs && + GIT_TRACE2_EVENT="$(pwd)/event.log" git pack-objects --stdin-packs pack