Commit Graph

24342 Commits

Author SHA1 Message Date
Derrick Stolee
2ef539bcee for-each-repo: work correctly in a worktree
When run in a worktree, the GIT_DIR directory is set in a different way
than in a typical repository. Show this by updating t0068 to include a
worktree and add a test that runs from that worktree. This requires
moving the repo.key config into a global config instead of the base test
repository's local config (demonstrating that it worked with
non-worktree Git repositories).

We need to be careful to unset the local Git environment variables and
let the child process rediscover them, while also reinstating those
variables in the parent process afterwards. Update run_command_on_repo()
to use the new sanitize_repo_env() helper method to erase these
environment variables.

During review of this bug fix, there were several incorrect patches
demonstrating different bad behaviors. Most of these are covered by
tests, when it is not too expensive to set it up. One case that would be
expensive to set up is the GIT_NO_REPLACE_OBJECTS environment variable,
but we trust that using sanitize_repo_env() will be sufficient to
capture these uncovered cases by using the common code for resetting
environment variables.

Reported-by: Matthew Gabeler-Lee <fastcat@gmail.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-03 10:20:00 -08:00
Derrick Stolee
c5e62e1aa0 for-each-repo: test outside of repo context
The 'git for-each-repo' tool is frequently run outside of a repo context
in the real world. For example, it powers background maintenance.
Despite this typical case, we have not been testing it without a local
repository.

Update t0068 to stop creating a test repo and to use global config
everywhere. This has some subtle changes to test across the file.

This was noticed because an earlier attempt to remove the_repository
from builtin/for-each-repo.c did not catch a segmentation fault since
the passed 'repo' is NULL. This use of the_repository will need to stay
until we have a better way to handle config queries outside of a repo
context. Similar use still exists in builtin/config.c for the same
reason.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-03 10:19:59 -08:00
Junio C Hamano
1f047a6fba Merge branch 'js/ci-leak-skip-svn'
Dscho observed that SVN tests are taking too much time in CI leak
checking tasks, but most time is spent not in our code but in libsvn
code (which happen to be written in Perl), whose leaks have little
value to discover for us.  Skip SVN, P4, and CVS tests in the leak
checking tasks.

* js/ci-leak-skip-svn:
  ci: skip CVS and P4 tests in leaks job, too
  ci(*-leaks): skip the git-svn tests to save time
2026-01-23 13:34:36 -08:00
Junio C Hamano
62627a3484 Merge branch 'ty/t1005-test-path-is-helpers'
Test clean-up.

* ty/t1005-test-path-is-helpers:
  t1005: modernize "! test -f" to "test_path_is_missing"
2026-01-23 13:34:36 -08:00
Junio C Hamano
cd8b8cba47 Merge branch 'rj/cygwin-test-fixes-for-2.53'
Test fixup.

* rj/cygwin-test-fixes-for-2.53:
  t0610-reftable-basics: mitigate a flaky test on cygwin
  t9700/test.pl: fix path type expectation on cygwin
2026-01-23 13:34:36 -08:00
Junio C Hamano
214cbb7b1d Merge branch 'rs/tree-wo-the-repository'
Remove implicit reliance on the_repository global in the APIs
around tree objects and make it explicit which repository to work
in.

* rs/tree-wo-the-repository:
  cocci: remove obsolete the_repository rules
  cocci: convert parse_tree functions to repo_ variants
  tree: stop using the_repository
  tree: use repo_parse_tree()
  path-walk: use repo_parse_tree_gently()
  pack-bitmap-write: use repo_parse_tree()
  delta-islands: use repo_parse_tree()
  bloom: use repo_parse_tree()
  add-interactive: use repo_parse_tree_indirect()
  tree: add repo_parse_tree*()
  environment: move access to core.maxTreeDepth into repo settings
2026-01-21 16:16:28 -08:00
Junio C Hamano
e13de9ed8e Merge branch 'tb/midx-write-corrupt-checksum-fix'
The logic that avoids reusing MIDX files with a wrong checksum was
broken, which has been corrected.

* tb/midx-write-corrupt-checksum-fix:
  midx-write.c: assume checksum-invalid MIDXs require an update
  t/t5319-multi-pack-index.sh: drop early 'test_done'
2026-01-21 16:16:27 -08:00
Junio C Hamano
070fa41675 Merge branch 'ps/geometric-repacking-with-promisor-remotes'
"git repack --geometric" did not work with promisor packs, which
has been corrected.

* ps/geometric-repacking-with-promisor-remotes:
  builtin/repack: handle promisor packs with geometric repacking
  repack-promisor: extract function to remove redundant packs
  repack-promisor: extract function to finalize repacking
  repack-geometry: extract function to compute repacking split
  builtin/pack-objects: exclude promisor objects with "--stdin-packs"
2026-01-21 16:16:27 -08:00
Junio C Hamano
bc5cbbe246 Merge branch 'ps/read-object-info-improvements'
The object-info API has been cleaned up.

* ps/read-object-info-improvements:
  packfile: drop repository parameter from `packed_object_info()`
  packfile: skip unpacking object header for disk size requests
  packfile: disentangle return value of `packed_object_info()`
  packfile: always populate pack-specific info when reading object info
  packfile: extend `is_delta` field to allow for "unknown" state
  packfile: always declare object info to be OI_PACKED
  object-file: always set OI_LOOSE when reading object info
2026-01-21 08:29:00 -08:00
Junio C Hamano
e01178bb1a Merge branch 'ps/t1410-cleanup'
Test clean-up.

* ps/t1410-cleanup:
  t1410: use test helpers in reflog rewind test
2026-01-21 08:28:58 -08:00
Junio C Hamano
dc861c97c3 Merge branch 'ps/ref-consistency-checks'
Update code paths that check data integrity around refs subsystem.
cf. <CAOLa=ZShPP3BPXa=YnC-vuX4zF=pUTFdUidZwOdna8bfVTNM9w@mail.gmail.com>

* ps/ref-consistency-checks:
  builtin/fsck: drop `fsck_head_link()`
  builtin/fsck: move generic HEAD check into `refs_fsck()`
  builtin/fsck: move generic object ID checks into `refs_fsck()`
  refs/reftable: introduce generic checks for refs
  refs/reftable: fix consistency checks with worktrees
  refs/reftable: extract function to retrieve backend for worktree
  refs/reftable: adapt includes to become consistent
  refs/files: introduce function to perform normal ref checks
  refs/files: extract generic symref target checks
  fsck: drop unused fields from `struct fsck_ref_report`
  refs/files: perform consistency checks for root refs
  refs/files: improve error handling when verifying symrefs
  refs/files: extract function to check single ref
  refs/files: remove useless indirection
  refs/files: remove `refs_check_dir` parameter
  refs/files: move fsck functions into global scope
  refs/files: simplify iterating through root refs
2026-01-21 08:28:58 -08:00
Junio C Hamano
047bd7dfe3 ci: skip CVS and P4 tests in leaks job, too
Looking at the CI logs, the p4 and cvs tests account for another 24
minutes of test time and they offer minimal value for quite a
similar reason as the previous step.

Let's introduce and use a mechanism to skip these tests to save
some resources.

Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-19 11:02:31 -08:00
Tian Yuchen
bc8556d066 t1005: modernize "! test -f" to "test_path_is_missing"
Replace instances of "! test -f <file>" with "test_path_is_missing <file>".
This macro provides better diagnostics when the test fails (it prints
"Path exists:" instead of silently failing).

Signed-off-by: Tian Yuchen <a3205153416@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-17 10:01:14 -08:00
Ramsay Jones
59da9f292a t0610-reftable-basics: mitigate a flaky test on cygwin
Test #29 ('ref transaction: corrupted tables cause failure') started to
fail intermittently for me (from v2.52.0-rc0) when running the testsuite
with '-j8'. (Also, having moved to a new laptop and windows 11, rather
than windows 10). If the test is run by hand, or without any parallelism,
then it passes without issue.

When the test fails (e.g. 1 out of 32 parallel runs) the cause is due to
a permission error while corrupting a table file:

  ./test-lib.sh: line 1010: .git/reftable/0x000000000001-0x000000000002-d89bb8ee.ref: Permission denied

This corruption is done in a shell loop, directly after a 'test_commit',
which uses an ': >"$f"' expression to truncate the file. Adding a sleep
of one second after the 'test_commit' and before the shell loop fixes
the test (it is not clear why). Replacing the redirection shell expression
with a 'test-tool truncate "$f" 0' invocation also provides a fix, which
could simply be another way to change the timing sufficiently to win the
race.

During a debug session, I tried looking at the strace output for the
shell redirection:

  $ rm /tmp/hello; echo hello >/tmp/hello; ls -l /tmp/hello
  -rw-r--r-- 1 ramsay None 6 Nov 10 17:25 /tmp/hello
  $

  $ strace -o zzz bash -c ': >/tmp/hello'
  $

Similarly, for the test-tool solution:

  $ strace -o xxx ./t/helper/test-tool truncate /tmp/hello 0
  $

When comparing the output, the differences seemed to be what you would
expect and, if anything, the shell redirect probably would have taken
longer than the test-tool solution (many fcntl() calls to dup the stdout
to the <fd>).  The call to the win32 api NtCreateFile() was identical,
apart from the first (FileHandle) parameter, of course.

In order to fix this flaky test on cygwin, despite not knowing why it
works, replace the shell redirection with the above 'test-tool truncate'
invocation.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-16 14:01:42 -08:00
Ramsay Jones
c381a21490 t9700/test.pl: fix path type expectation on cygwin
Commit 4ec7ac101b ("t9700: accommodate for Windows paths", 2025-12-17)
changed the type of the absolute path to the git directory from unix to
win32 for both GfW and cygwin. This fixed the test for GfW but causes
new failures on cygwin, since the test expectation is that it uses unix
paths on cygwin. In order to not break cygwin, disable the new code by
removing the "or $^O eq 'cygwin'" sub-expression from the conditional
part of the fix.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-16 13:59:57 -08:00
Junio C Hamano
a7bbe907e3 Merge branch 'kj/t7101-modernize'
Test update.

* kj/t7101-modernize:
  t7101: modernize test path checks
2026-01-16 12:40:27 -08:00
Junio C Hamano
5058dc1ec1 Merge branch 'ac/t1420-use-more-direct-check'
Test update.

* ac/t1420-use-more-direct-check:
  t1420: modernize the lost-found test
2026-01-16 12:40:27 -08:00
Junio C Hamano
8c5f0adf21 Merge branch 'jk/cat-file-avoid-bitmap-when-unneeded'
Fix for a performance regression in "git cat-file".

* jk/cat-file-avoid-bitmap-when-unneeded:
  cat-file: only use bitmaps when filtering
2026-01-16 12:40:27 -08:00
Junio C Hamano
18d7d02088 Merge branch 'jk/t-perf-fixes'
Perf-test fixes.

* jk/t-perf-fixes:
  t/perf/run: preserve GIT_PERF_* from environment
  t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY
2026-01-16 12:40:26 -08:00
Junio C Hamano
a3d1f391d3 Revert "Merge branch 'ar/run-command-hook'"
This reverts commit f406b89552,
reversing changes made to 1627809eef.

It seems to have caused a few regressions, two of the three known
ones we have proposed solutions for.  Let's give ourselves a bit
more room to maneuver during the pre-release freeze period and
restart once the 2.53 ships.
2026-01-15 13:02:38 -08:00
Junio C Hamano
87e278d837 Merge branch 'ps/clar-integers'
Import newer version of "clar", unit testing framework.

* ps/clar-integers:
  gitattributes: disable blank-at-eof errors for clar test expectations
  t/unit-tests: demonstrate use of integer comparison assertions
  t/unit-tests: update clar to 39f11fe
2026-01-15 07:12:41 -08:00
Junio C Hamano
e7b120c357 Merge branch 'kh/replay-invalid-onto-advance'
Improve the error message when a bad argument is given to the
`--onto` option of "git replay".  Test coverage of "git replay" has
been improved.

* kh/replay-invalid-onto-advance:
  t3650: add more regression tests for failure conditions
  replay: die if we cannot parse object
  replay: improve code comment and die message
  replay: die descriptively when invalid commit-ish is given
  replay: find *onto only after testing for ref name
  replay: remove dead code and rearrange
2026-01-15 07:12:41 -08:00
Junio C Hamano
24c43fb10b Merge branch 'ps/odb-misc-fixes'
Miscellaneous fixes on object database layer.

* ps/odb-misc-fixes:
  odb: properly close sources before freeing them
  builtin/gc: fix condition for whether to write commit graphs
2026-01-15 07:12:41 -08:00
Junio C Hamano
c0453835ab Merge branch 'pt/t7800-difftool-test-racefix'
Test fixup.

* pt/t7800-difftool-test-racefix:
  t7800: fix racy "difftool --dir-diff syncs worktree" test
2026-01-15 07:12:40 -08:00
Patrick Steinhardt
dcc9c7ef47 builtin/repack: handle promisor packs with geometric repacking
When performing a fetch with an object filter, we mark the resulting
packfile as a promisor pack. An object part of such a pack may miss any
of its referenced objects, and Git knows to handle this case by fetching
any such missing objects from the promisor remote.

The "promisor" property needs to be retained going forward. So every
time we pack a promisor object, the resulting pack must be marked as a
promisor pack. git-repack(1) does this already: when a repository has a
promisor remote, it knows to pass "--exclude-promisor-objects" to the
git-pack-objects(1) child process. Promisor packs are written separately
when doing an all-into-one repack via `repack_promisor_objects()`.

But we don't support promisor objects when doing a geometric repack yet.
Promisor packs do not get any special treatment there, as we simply
merge promisor and non-promisor packs. The resulting pack is not even
marked as a promisor pack, which essentially corrupts the repository.

This corruption couldn't happen in the real world though: we pass both
"--exclude-promisor-objects" and "--stdin-packs" to git-pack-objects(1)
if a repository has a promisor remote, but as those options are mutually
exclusive we always end up dying. And while we made those flags
compatible with one another in a preceding commit, we still end up dying
in case git-pack-objects(1) is asked to repack a promisor pack.

There's multiple ways to fix this:

  - We can exclude promisor packs from the geometric progression
    altogether. This would have the consequence that we never repack
    promisor packs at all. But in a partial clone it is quite likely
    that the user generates a bunch of promisor packs over time, as
    every backfill fetch would create another one. So this doesn't
    really feel like a sensible option.

  - We can adapt git-pack-objects(1) to support repacking promisor packs
    and include them in the normal geometric progression. But this would
    mean that the set of promisor objects expands over time as the packs
    are merged with normal packs.

  - We can use a separate geometric progression to repack promisor
    packs.

The first two options both have significant downsides, so they aren't
really feasible. But the third option fixes both of these downsides: we
make sure that promisor packs get merged, and at the same time we never
expand the set of promisor objects beyond the set of objects that are
already marked as promisor objects.

Implement this strategy so that geometric repacking works in partial
clones.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-14 06:29:24 -08:00
Patrick Steinhardt
0cd306ebc8 builtin/pack-objects: exclude promisor objects with "--stdin-packs"
It is currently not possible to combine "--exclude-promisor-objects"
with "--stdin-packs" because both flags want to set up a revision walk
to enumerate the objects to pack. In a subsequent commit though we want
to extend geometric repacks to support promisor objects, and for that we
need to handle the combination of both flags.

There are two cases we have to think about here:

  - "--stdin-packs" asks us to pack exactly the objects part of the
    specified packfiles. It is somewhat questionable what to do in the
    case where the user asks us to exclude promisor objects, but at the
    same time explicitly passes a promisor pack to us. For now, we
    simply abort the request as it is self-contradicting. As we have
    also been dying before this commit there is no regression here.

  - "--stdin-packs=follow" does the same as the first flag, but it also
    asks us to include all objects transitively reachable from any
    object in the packs we are about to repack. This is done by doing
    the revision walk mentioned further up. Luckily, fixing this case is
    trivial: we only need to modify the revision walk to also set the
    `exclude_promisor_objects` field.

Note that we do not support the "--exclude-promisor-objects-best-effort"
flag for now as we don't need it to support geometric repacking with
promisor objects.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-14 06:29:24 -08:00
Taylor Blau
38b72e5815 midx-write.c: assume checksum-invalid MIDXs require an update
In 6ce9d558ce (midx-write: skip rewriting MIDX with `--stdin-packs`
unless needed, 2025-12-10), the MIDX machinery learned how to optimize
out unnecessary writes with "--stdin-packs".

In order to do this, it compares the contents of the in-progress write
against a MIDX loaded directly from the object store. We load a separate
MIDX (as opposed to checking our update relative to "ctx.m") because the
MIDX code does not reuse an existing MIDX with --stdin-packs, and always
leaves "ctx.m" as NULL. See commit 0c5a62f14b (midx-write.c: do not
read existing MIDX with `packs_to_include`, 2024-06-11) for details on
why.

If "ctx.m" is non-NULL, however, it is guaranteed to be checksum-valid,
since we only assign "ctx.m" when "midx_checksum_valid()" returns true.
Since the same guard does not exist for the MIDX we pass to
"midx_needs_update()", we may ignore on-disk corruption when determining
whether or not we can optimize out the write.

Add a similar guard within "midx_needs_update()" to prevent such an
issue.

A more robust fix would involve revising 0c5a62f14b and teaching the
MIDX generation code how to reuse an existing MIDX even when invoked
with "--stdin-packs", such that we could avoid side-loading the MIDX
directly from the object store in order to call "midx_needs_update()".
For now, pursue the minimal fix.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-13 05:21:34 -08:00
Taylor Blau
e16ac6ca0d t/t5319-multi-pack-index.sh: drop early 'test_done'
In 6ce9d558ce (midx-write: skip rewriting MIDX with `--stdin-packs`
unless needed, 2025-12-10), an extra 'test_done' was added, causing the
test script to finish before having run all of its tests.

Dropping this extraneous 'test_done' exposes a bug from commit
6ce9d558ce that causes a subsequent test to fail. Mark that test with a
'test_expect_failure' for now, and the subsequent commit will explain
and fix the bug.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-13 05:21:34 -08:00
Patrick Steinhardt
9727336b31 builtin/fsck: move generic HEAD check into refs_fsck()
Move the check that detects "HEAD" refs that do not point at a branch
into `refs_fsck()`. This follows the same motivation as the preceding
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 06:55:41 -08:00
Patrick Steinhardt
46d611cada builtin/fsck: move generic object ID checks into refs_fsck()
While most of the logic that verifies the consistency of refs is
driven by `refs_fsck()`, we still have a small handful of checks in
`fsck_head_link()`. These checks don't use the git-fsck(1) reporting
infrastructure, and as such it's impossible to for example disable
some of those checks.

One such check detects refs that point to the all-zeroes object ID.
Extract this check into the generic `refs_fsck_ref()` function that is
used by both the "files" and "reftable" backends.

Note that this will cause us to not return an error code from
`fsck_head_link()` anymore in case this error was detected. This is fine
though: the only caller of this function does not check the error code
anyway. To demonstrate this, adapt the function to drop its return value
altogether. The function will be removed in a subsequent commit anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 06:55:41 -08:00
Patrick Steinhardt
06d6ead762 refs/reftable: introduce generic checks for refs
In a preceding commit we have extracted generic checks for both direct
and symbolic refs that apply for all backends. Wire up those checks for
the "reftable" backend.

Note that this is done by iterating through all refs manually with the
low-level reftable ref iterator. We explicitly don't want to use the
higher-level iterator that is exposed to users of the reftable backend
as that iterator may swallow for example broken refs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 06:55:41 -08:00
Patrick Steinhardt
9341740bea refs/reftable: fix consistency checks with worktrees
The ref consistency checks are driven via `cmd_refs_verify()`. That
function loops through all worktrees (including the main worktree) and
then checks the ref store for each of them individually. It follows that
the backend is expected to only verify refs that belong to the specified
worktree.

While the "files" backend handles this correctly, the "reftable" backend
doesn't. In fact, it completely ignores the passed worktree and instead
verifies refs of _all_ worktrees. The consequence is that we'll end up
every ref store N times, where N is the number of worktrees.

Or rather, that would be the case if we actually iterated through the
worktree reftable stacks correctly. But we use `strmap_for_each_entry()`
to iterate through the stacks, but the map is in fact not even properly
populated. So instead of checking stacks N^2 times, we actually only end
up checking the reftable stack of the main worktree.

Fix this bug by only verifying the stack of the passed-in worktree and
constructing the backends via `backend_for_worktree()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 06:55:41 -08:00
Patrick Steinhardt
7b8c36a2a7 refs/files: perform consistency checks for root refs
While the "files" backend already knows to perform consistency checks
for the "refs/" hierarchy, it doesn't verify any of its root refs. Plug
this omission.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 06:55:40 -08:00
Patrick Steinhardt
7a4bd1b836 packfile: always declare object info to be OI_PACKED
When reading object info via a packfile we yield one of two types:

  - The object can either be OI_PACKED, which is what a caller would
    typically expect.

  - Or it can be OI_DBCACHED if it is stored in the delta base cache.

The latter really is an implementation detail though, and callers
typically don't care at all about the difference. Furthermore, the
information whether or not it is part of the delta base cache can
already be derived via the `is_delta` field, so the fact that we discern
between OI_PACKED and OI_DBCACHED only further complicates the
interface.

There aren't all that many callers that care about the `whence` field in
the first place. In fact, there's only three:

  - `packfile_store_read_object_info()` checks for `whence == OI_PACKED`
    and then populates the packfile information of the object info
    structure. We now start to do this also for deltified objects, which
    gives its callers strictly more information.

  - `repack_local_links()` wants to determine whether the object is part
    of a promisor pack and checks for `whence == OI_PACKED`. If so, it
    verifies that the packfile is a promisor pack. It's arguably wrong
    to declare that an object is not part of a promisor pack only
    because it is stored in the delta base cache.

  - `is_not_in_promisor_pack_obj()` does the same, but checks that a
    specific object is _not_ part of a promisor pack. The same reasoning
    as above applies.

Drop the OI_DBCACHED enum completely. None of the callers seem to care
about the distinction.

Note that this also fixes a segfault introduced in 8c1b84bc97
(streaming: move logic to read packed objects streams into backend,
2025-11-23), which refactors how we stream packed objects. The intent is
to only read packed objects in case they are stored non-deltified as
we'd otherwise have to deflate them first. But the check for whether or
not the object is stored as a delta was unconditionally done via
`oi.u.packed.is_delta`, which is only valid in case `oi.whence` is
`OI_PACKED`. But under some circumstances we got `OI_DBCACHED` here,
which means that none of the `oi.u.packed` fields were initialized at
all. Consequently, we assumed the object was not stored as a delta, and
then try to read the object from `oi.u.packed.pack`, which is a `NULL`
pointer and thus causes a segfault.

Add a test case for this issue so that this cannot regress in the
future anymore.

Reported-by: Matt Smiley <msmiley@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 06:51:14 -08:00
Junio C Hamano
6506be0304 Merge branch 'ps/t1300-2021-use-test-path-is-helpers'
Test updates.

* ps/t1300-2021-use-test-path-is-helpers:
  t1300: use test helpers instead of `test` command
2026-01-12 05:19:52 -08:00
Junio C Hamano
3235ef374e Merge branch 'rs/commit-stack'
Code clean-up, unifying various hand-rolled "list of commit
objects" and use the commit_stack API.

* rs/commit-stack:
  commit-reach: use commit_stack
  commit-graph: use commit_stack
  commit: add commit_stack_grow()
  shallow: use commit_stack
  pack-bitmap-write: use commit_stack
  commit: add commit_stack_init()
  test-reach: use commit_stack
  remote: use commit_stack for src_commits
  remote: use commit_stack for sent_tips
  remote: use commit_stack for local_commits
  name-rev: use commit_stack
  midx: use commit_stack
  log: use commit_stack
  revision: export commit_stack
2026-01-12 05:19:52 -08:00
Junio C Hamano
0320bcd743 Merge branch 'sb/bundle-uri-without-uri'
Diagnose invalid bundle-URI that lack the URI entry, instead of
crashing.

* sb/bundle-uri-without-uri:
  bundle-uri: validate that bundle entries have a uri
2026-01-12 05:19:52 -08:00
Junio C Hamano
6a4b4e7880 Merge branch 'ja/doc-synopsis-style-more'
More doc style updates.

* ja/doc-synopsis-style-more:
  doc: convert git-remote to synopsis style
  doc: convert git stage to use synopsis block
  doc: convert git-status tables to AsciiDoc format
  doc: convert git-status to synopsis style
  doc: fix t0450-txt-doc-vs-help to select only first synopsis block
2026-01-12 05:19:52 -08:00
Pushkar Singh
220f888d7e t1410: use test helpers in reflog rewind test
Replace raw `test -f` and `! test -f` checks in the rewind test with
`test_path_is_file` and `test_path_is_missing`. This provides clearer
failure diagnostics and keeps the test consistent with the rest of
the test suite.

Signed-off-by: Pushkar Singh <pushkarkumarsingh1970@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-11 20:55:05 -08:00
René Scharfe
ec7a16b145 cocci: convert parse_tree functions to repo_ variants
Add and apply a semantic patch to convert calls to parse_tree() and
friends to the corresponding variant that takes a repository argument,
to allow the functions that implicitly use the_repository to be retired
once all potential in-flight topics are settled and converted as well.

The changes in .c files were generated by Coccinelle, but I fixed a
whitespace bug it would have introduced to builtin/commit.c.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 18:36:18 -08:00
K Jayatheerth
dbbf6a901b t7101: modernize test path checks
Replace old-style `test -[df]` and `! test -[df]` assertions with
the modern `test_path_is_file`, `test_path_is_dir`, and
`test_path_is_missing` helpers.

These helpers provide more informative error messages in case of
failure (e.g., "File 'foo' is missing" instead of just exit code 1).

While at it, fix a typo and an incorrect path
reference in one of the test descriptions.

Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:36:07 -08:00
Junio C Hamano
2db806d817 Merge branch 'en/ort-recursive-d-f-conflict-fix'
The ort merge machinery hit an assertion failure in a history with
criss-cross merges renamed a directory and a non-directory, which
has been corrected.

* en/ort-recursive-d-f-conflict-fix:
  merge-ort: fix corner case recursive submodule/directory conflict handling
2026-01-08 16:40:12 +09:00
Junio C Hamano
512351f2a8 Merge branch 'dd/t5403-modernise'
Test micro-clean-up.

* dd/t5403-modernise:
  t5403: use test_path_is_file instead of test -f
2026-01-08 16:40:12 +09:00
Junio C Hamano
c0754dc423 Merge branch 'ds/diff-lazy-fetch-with-name-only-fix'
Running "git diff" with "--name-only" and other options that allows
us not to look at the blob contents, while objects that are lazily
fetched from a promisor remote, caused use-after-free, which has
been corrected.

* ds/diff-lazy-fetch-with-name-only-fix:
  diff: avoid segfault with freed entries
2026-01-08 16:40:11 +09:00
Andrew Chitester
6c5c7e7071 t1420: modernize the lost-found test
This test indirectly checks that the lost-found folder has 2 files in it
and then checks that the expected two files exist. Make this more
deliberate by removing the old test -f and compare the actual ls of the
lost-found directory with the expected files.

Signed-off-by: Andrew Chitester <andchi@fastmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:19:41 +09:00
Patrick Steinhardt
b3449b1517 builtin/gc: fix condition for whether to write commit graphs
When performing auto-maintenance we check whether commit graphs need to
be generated by counting the number of commits that are reachable by any
reference, but not covered by a commit graph. This search is performed
by iterating through all references and then doing a depth-first search
until we have found enough commits that are not present in the commit
graph.

This logic has a memory leak though:

  Direct leak of 16 byte(s) in 1 object(s) allocated from:
      #0 0x55555562e433 in malloc (git+0xda433)
      #1 0x555555964322 in do_xmalloc ../wrapper.c:55:8
      #2 0x5555559642e6 in xmalloc ../wrapper.c:76:9
      #3 0x55555579bf29 in commit_list_append ../commit.c:1872:35
      #4 0x55555569f160 in dfs_on_ref ../builtin/gc.c:1165:4
      #5 0x5555558c33fd in do_for_each_ref_iterator ../refs/iterator.c:431:12
      #6 0x5555558af520 in do_for_each_ref ../refs.c:1828:9
      #7 0x5555558ac317 in refs_for_each_ref ../refs.c:1833:9
      #8 0x55555569e207 in should_write_commit_graph ../builtin/gc.c:1188:11
      #9 0x55555569c915 in maintenance_is_needed ../builtin/gc.c:3492:8
      #10 0x55555569b76a in cmd_maintenance ../builtin/gc.c:3542:9
      #11 0x55555575166a in run_builtin ../git.c:506:11
      #12 0x5555557502f0 in handle_builtin ../git.c:779:9
      #13 0x555555751127 in run_argv ../git.c:862:4
      #14 0x55555575007b in cmd_main ../git.c:984:19
      #15 0x5555557523aa in main ../common-main.c:9:11
      #16 0x7ffff7a2a4d7 in __libc_start_call_main (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a4d7) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #17 0x7ffff7a2a59a in __libc_start_main@GLIBC_2.2.5 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a59a) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #18 0x5555555f0934 in _start (git+0x9c934)

The root cause of this memory leak is our use of `commit_list_append()`.
This function expects as parameters the item to append and the _tail_ of
the list to append. This tail will then be overwritten with the new tail
of the list so that it can be used in subsequent calls. But we call it
with `commit_list_append(parent->item, &stack)`, so we end up losing
everything but the new item.

This issue only surfaces when counting merge commits. Next to being a
memory leak, it also shows that we're in fact miscounting as we only
respect children of the last parent. All previous parents are discarded,
so their children will be disregarded unless they are hit via another
reference.

While crafting a test case for the issue I was puzzled that I couldn't
establish the proper border at which the auto-condition would be
fulfilled. As it turns out, there's another bug: if an object is at the
tip of any reference we don't mark it as seen. Consequently, if it is
the tip of or reachable via another ref, we'd count that object multiple
times.

Fix both of these bugs so that we properly count objects without leaking
any memory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:16:50 +09:00
Jeff King
9e8b448dd8 cat-file: only use bitmaps when filtering
Commit 8002e8ee18 (builtin/cat-file: use bitmaps to efficiently filter
by object type, 2025-04-02) introduced a performance regression when we
are not filtering objects: it uses bitmaps even when they won't help,
incurring extra costs. For example, running the new perf tests from this
commit, which check the performance of listing objects by oid:

  $ export GIT_PERF_LARGE_REPO=/path/to/linux.git
  $ git -C "$GIT_PERF_LARGE_REPO" repack -adb
  $ GIT_SKIP_TESTS=p1006.1 ./run 8002e8ee18^ 8002e8ee18 p1006-cat-file.sh
  [...]
  Test                                  8002e8ee18^       8002e8ee18
  -------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.48(1.44+0.04)   6.39(6.35+0.04) +331.8%
  1006.3: list all objects (unsorted)   3.01(2.97+0.04)   3.40(3.29+0.10) +13.0%
  1006.4: list blobs                    4.85(4.67+0.17)   1.68(1.58+0.10) -65.4%

An invocation that filters, like listing all blobs (1006.4), does
benefit from using the bitmaps; it now doesn't have to check the type of
each object from the pack data, so the tradeoff is worth it.

But for listing all objects in sorted idx order (1006.2), we otherwise
would never open the bitmap nor the revindex file. Worse, our sorting
step gets much worse. Normally we append into an array in pack .idx
order, and the sort step is trivial. But with bitmaps, we get the
objects in pack order, which is apparently random with respect to oid,
and have to sort the whole thing. (Note that this freshly-packed state
represents the best case for .idx sorting; if we had two packs, then
we'd have their objects one after the other and qsort would have to
interleave them).

The unsorted test in 1006.3 is interesting: there we are going in pack
order, so we load the revindex for the pack anyway. And though we don't
sort the result, we do use an oidset to check for duplicates. So we can
see in the 8002e8ee18^ timings that those two things cost ~1.5s over the
sorted case (mostly the oidset hash cost). We also incur the extra cost
to open the bitmap file as of 8002e8ee18, which seems to be ~400ms.
(This would probably be faster with a bitmap lookup table, but writing
that out is not yet the default).

So we know that bitmaps help when there's filtering to be done, but
otherwise make things worse. Let's only use them when there's a filter.

The perf script shows that we've fixed the regressions without hurting
the bitmap case:

  Test                                  8002e8ee18^       8002e8ee18                HEAD
  --------------------------------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.56(1.53+0.03)   6.44(6.37+0.06) +312.8%   1.62(1.54+0.06) +3.8%
  1006.3: list all objects (unsorted)   3.04(2.98+0.06)   3.45(3.38+0.07) +13.5%    3.04(2.99+0.04) +0.0%
  1006.4: list blobs                    5.14(4.98+0.15)   1.76(1.68+0.06) -65.8%    1.73(1.64+0.09) -66.3%

Note that there's another related case: we might have a filter that
cannot be used with bitmaps. That check is handled already for us in
for_each_bitmapped_object(), though we'd still load the bitmap and
revindex files pointlessly in that case. I don't think it can happen in
practice for cat-file, though, since it allows only blob:none,
blob:limit, and object:type filters, all of which work with bitmaps.

It would be easy-ish to insert an extra check like:

  can_filter_bitmap(&opt->objects_filter);

into the conditional, but I didn't bother here. It would be redundant
with the call in for_each_bitmapped_object(), and the can_filter helper
function is static local in the bitmap code (so we'd have to make it
public).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:05:12 +09:00
Jeff King
79d301c767 t/perf/run: preserve GIT_PERF_* from environment
If you run:

  GIT_PERF_LARGE_REPO=/some/path ./p1006-cat-file.sh

it will use the repo in /some/path. But if you use the "run" helper
script to aggregate and compare results, like this:

  GIT_PERF_LARGE_REPO=/some/path ./run HEAD^ HEAD p1006-cat-file.sh

it will ignore that variable. This is because the presence of the
LARGE_REPO variable in GIT-BUILD-OPTIONS overrides what's in the
environment. This started with 4638e8806e (Makefile: use common template
for GIT-BUILD-OPTIONS, 2024-12-06), which now writes even empty
variables (though arguably it was wrong even before with a non-empty
value, as we generally prefer the environment to take precedence over
on-disk config).

We had the same problem in perf-lib.sh itself, and we hacked around it
with 32b74b9809 (perf: do allow `GIT_PERF_*` to be overridden again,
2025-04-04). That's what lets the direct invocation of "./p1006" work
above.

And in fact that was sufficient for "./run", too, until it started
loading GIT-BUILD-OPTIONS itself in 5756ccd181 (t/perf: fix benchmarks
with out-of-tree builds, 2025-04-28). Now it has the same problem: it
clobbers any incoming GIT_PERF options from the environment.

We can use the same hack here in the "run" script. It's quite ugly, but
it's just short enough that I don't think it's worth trying to factor it
out into a common shell library.

In the long run, we might consider teaching GIT-BUILD-OPTIONS to be more
gentle in overwriting existing entries. There are probably other
GIT_TEST_* variables which would need the same treatment. And if and
when we come up with a more complete solution, we can use it in both
spots.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 08:56:36 +09:00
Jeff King
aad1d1c0d5 t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY
Using the perf suite's "run" helper in a vanilla build fails like this:

  $ make && (cd t/perf && ./run p0000-perf-lib-sanity.sh)
  === Running 1 tests in this tree ===
  perf 1 - test_perf_default_repo works: 1 2 3 ok
  perf 2 - test_checkout_worktree works: 1 2 3 ok
  ok 3 - test_export works
  perf 4 - export a weird var: 1 2 3 ok
  perf 5 - éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś: 1 2 3 ok
  ok 6 - test_export works with weird vars
  perf 7 - important variables available in subshells: 1 2 3 ok
  perf 8 - test-lib-functions correctly loaded in subshells: 1 2 3 ok
  # passed all 8 test(s)
  1..8
  cannot open test-results/p0000-perf-lib-sanity.subtests: No such file or directory at ./aggregate.perl line 159.

It is trying to aggregate results written into t/perf/test-results, but
the p0000 script did not write anything there.

The "run" script looks in $TEST_OUTPUT_DIRECTORY/test-results, or if
that variable is not set, in test-results in the current working
directory (which should be t/perf itself). It pulls the value of
$TEST_OUTPUT_DIRECTORY from the GIT-BUILD-OPTIONS file.

But that doesn't quite match the setup in perf-lib.sh (which is what
scripts like p0000 use). There we do this at the top of the script:

  TEST_OUTPUT_DIRECTORY=$(pwd)

and then let test-lib.sh append "/test-results" to that. Historically,
that made the vanilla case work: we'd always use t/perf/test-results.
But when $TEST_OUTPUT_DIRECTORY was set, it would break.

Commit 5756ccd181 (t/perf: fix benchmarks with out-of-tree builds,
2025-04-28) fixed that second case by loading GIT-BUILD-OPTIONS
ourselves. But that broke the vanilla case!

Now our setting of $TEST_OUTPUT_DIRECTORY in perf-lib.sh is ignored,
because it is overwritten by GIT-BUILD-OPTIONS. And when test-lib.sh
sees that the output directory is empty, it defaults to t/test-results,
rather than t/perf/test-results.

Nobody seems to have noticed, probably for two reasons:

  1. It only matters if you're trying to aggregate results (like the
     "run" script does). Just running "./p0000-perf-lib-sanity.sh"
     manually still produces useful output; the stored result files are
     just in an unexpected place.

  2. There might be leftover files in t/perf/test-results from previous
     runs (before 5756ccd181). In particular, the ".subtests" files
     don't tend to change, and the lack of that file is what causes it
     to barf completely. So it's possible that the aggregation could
     have been showing stale results that did not match the run that
     just happened.

We can fix it by setting TEST_OUTPUT_DIRECTORY only after we've loaded
GIT-BUILD-OPTIONS, so that we override its value and not the other way
around. And we'll do so only when the variable is not set, which should
retain the fix for that case from 5756ccd181.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 08:56:36 +09:00
Junio C Hamano
f406b89552 Merge branch 'ar/run-command-hook'
Use hook API to replace ad-hoc invocation of hook scripts with the
run_command() API.

* ar/run-command-hook:
  receive-pack: convert receive hooks to hook API
  receive-pack: convert update hooks to new API
  hooks: allow callers to capture output
  run-command: allow capturing of collated output
  hook: allow overriding the ungroup option
  reference-transaction: use hook API instead of run-command
  transport: convert pre-push to hook API
  hook: convert 'post-rewrite' hook in sequencer.c to hook API
  hook: provide stdin via callback
  run-command: add stdin callback for parallelization
  run-command: add first helper for pp child states
2026-01-06 16:33:53 +09:00