79367 Commits

Author SHA1 Message Date
Junio C Hamano
affdbe41bd Merge branch 'lo/repo-struct-z'
"git repo struct" learned to take "-z" as a synonym to "--format=nul".

* lo/repo-struct-z:
  repo: add -z as an alias for --format=nul to git-repo-structure
  repo: use [--format=... | -z] instead of [-z] in git-repo-info synopsis
  repo: remove blank line from Documentation/git-repo.adoc
2025-12-14 17:04:37 +09:00
Junio C Hamano
2378ebcb58 Merge branch 'kh/advise-w-git-help-in-branch'
A help message from "git branch" now mentions "git help" instead of
"man" when suggesting to read some documentation.

* kh/advise-w-git-help-in-branch:
  branch: advise using git-help(1) instead of man(1)
2025-12-14 17:04:37 +09:00
Junio C Hamano
c382988d7b Merge branch 'je/doc-pull'
Doc fixup.

* je/doc-pull:
  doc: git-pull: fix 'git --rebase abort' typo
2025-12-14 17:04:37 +09:00
Junio C Hamano
25ce0883fe Merge branch 'tc/meson-cross-compile-fix'
Build fix.

* tc/meson-cross-compile-fix:
  meson: use is_cross_build() where possible
  meson: only detect ICONV_OMITS_BOM if possible
  meson: ignore subprojects/.wraplock
2025-12-14 17:04:37 +09:00
Junio C Hamano
21787077bf Merge branch 'js/last-modified-with-sparse-checkouts'
"git last-modified" used to mishandle "--" to mark the beginning of
pathspec, which has been corrected.

* js/last-modified-with-sparse-checkouts:
  last-modified: support sparse checkouts
2025-12-14 17:04:37 +09:00
Junio C Hamano
84ca5a2457 Merge branch 'rs/diff-index-find-copies-harder-optim'
Halve the memory consumed by artificial filepairs created during
"git diff --find-copies-harder", also making the operation run
faster.

* rs/diff-index-find-copies-harder-optim:
  diff-index: don't queue unchanged filepairs with diff_change()
2025-12-14 17:04:36 +09:00
Junio C Hamano
794c979889 Merge branch 'tc/last-modified-active-paths-optimization'
Recent optimization to "last-modified" command introduced use of
uninitialized block of memory, which has been corrected.

* tc/last-modified-active-paths-optimization:
  last-modified: fix use of uninitialized memory
2025-12-14 17:04:36 +09:00
Kristoffer Haugsbakk
9ba08b30a1 doc: replay: link section using markup
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-14 15:56:02 +09:00
Kristoffer Haugsbakk
03d7c9c457 replay: improve --contained and add to doc
There is no documentation for `--contained`.

Start by copying the text from `replay_options` in
`builtin/replay.c`. But some people think that the existing text is a
bit unclear; what does it mean for a branch to be contained
in a revision range? Let’s include the implied commits here:
the branches that point at commits in the range.

Also use “update” instead of “advance”. “Update” is the verb
commonly used in this context.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-14 15:56:02 +09:00
Kristoffer Haugsbakk
8467c95419 doc: replay: mention no output on conflicts
Some commands will produce output on stderr if there are conflicts, but
git-replay(1) is completely silent. Explicitly spell that out.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-14 15:56:01 +09:00
René Scharfe
007b8994d4 t4014: support Git version strings with spaces
git --version reports its version with the prefix "git version ".
Remove precisely this string instead of everything up to and including
the rightmost space to avoid butchering version strings that contain
spaces.  This helps Apple's release of Git, which reports its version
like this: "git version 2.50.1 (Apple Git-155)".
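
The test itself is a shell script; as a rough illustration of the same idea in C, the sketch below removes exactly the prefix "git version " rather than cutting at the rightmost space. The helper name `strip_version_prefix()` is hypothetical and not part of Git.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of Git): return the part of `line` that
 * follows the exact prefix "git version ", or NULL if the prefix is absent.
 * Everything after the prefix is kept, including any spaces.
 */
static const char *strip_version_prefix(const char *line)
{
	static const char prefix[] = "git version ";
	size_t len = sizeof(prefix) - 1; /* length without the trailing NUL */

	if (strncmp(line, prefix, len))
		return NULL;
	return line + len;
}
```

Called with "git version 2.50.1 (Apple Git-155)", the helper keeps the parenthesized part, which a cut at the rightmost space would have thrown away.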

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-14 15:50:44 +09:00
Junio C Hamano
8ea9492cf3 cocci: use MEMZERO_ARRAY() a bit more
Existing code in files that have been fairly stable triggers the
"make coccicheck" suggestions due to the new check.

Rewrite them to use MEMZERO_ARRAY().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-13 10:47:59 +09:00
Junio C Hamano
d2e4099968 coccicheck: emit the contents of cocci patch
Telling the user "you got some error messages" without showing what
the errors are is almost useless in a CI environment, as the errors
cannot be examined without downloading build artifacts.

Arrange it to spew out the output when it fails.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-13 10:47:59 +09:00
Junio C Hamano
6362c9ce5e Merge branch 'tc/memzero-array' into jc/memzero-array
* tc/memzero-array:
  contrib/coccinelle: pass include paths to spatch(1)
  git-compat-util: introduce MEMZERO_ARRAY() macro
  last-modified: fix use of uninitialized memory
2025-12-13 10:39:23 +09:00
Derrick Stolee
e1588c270d scalar: alphabetize and simplify config
The config values set by Scalar went through an audit in the previous
changes, so now reorganize the settings and simplify their purpose.

First, alphabetize the config options, except put the platform-specific
options at the end. This groups two Windows-specific settings and only
one non-Windows setting.

Also, this removes the 'overwrite_on_reconfigure' setting for many of
these options. That setting made nearly all of these options "required"
for scalar enlistments, restricting use for users. Now nearly all of
these options have that setting removed.

However, there is one setting that still has this, which is
index.skipHash, which was previously being set to _false_ when we
actually prefer the value of true. Keep the overwrite here to help
Scalar users upgrade to the new version. We may remove that overwrite in
the future once we believe that most of the users who have the false
value have upgraded to a version that overwrites that to 'true'.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-13 08:43:28 +09:00
Derrick Stolee
be667e40cb scalar: remove stale config values
These config values were added in the original Scalar contribution,
d0feac4e8c (scalar: 'register' sets recommended config and starts
maintenance, 2021-12-03), but were never fully checked for validity in
the upstream Git project. At the time, Scalar was only intended for the
contrib/ directory so did not have as rigorous of an investigation.

Each config option has its own justification for removal:

* core.preloadIndex: This value is true by default, now. Removing this
  requires some changes to the tests that checked this config value.
  Use gui.gcwarning=false there instead.

* core.fscache: This config does not exist in the core Git project, but
  is instead a config option for a Git for Windows feature.

* core.multiPackIndex: This config value is now enabled by default, so
  does not need to be called out specifically. It was originally
  included to make sure the background maintenance that created
  multi-pack-indexes would result in the expected performance
  improvements.

* credential.validate: This option is not something specific to Git but
  instead an older version of Git Credential Manager for Windows. That
  software was replaced several years ago by the cross-platform Git
  Credential Manager so this option is no longer needed to help users who
  were on that older software.

* pack.useSparse=true: This value is now Git's default as of de3a864114
  (config: set pack.useSparse=true by default, 2020-03-20) so we don't
  need it set by Scalar.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-13 08:43:28 +09:00
Derrick Stolee
05f28e4b3c scalar: use index.skipHash=true for performance
The index.skipHash config option has been set to 'false' by Scalar since
4933152cbb (scalar: enable path-walk during push via config, 2025-05-16)
but that commit message is trying to communicate the exact opposite:
that the 'true' value is what we want instead. This means that we've
been disabling this performance benefit for Scalar repos
unintentionally.

Fix this issue before we add justification for the config options set in
this list.

Oddly, enabling index.skipHash causes a test issue during 'test_commit'
in one of the Scalar tests when GIT_TEST_SPLIT_INDEX is enabled (as
caught by the linux-test-vars build). I'm fixing the test by disabling
the environment variable, but the issue should be resolved in a series
focused on the split index.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-13 08:43:27 +09:00
Derrick Stolee
48695fcde5 scalar: annotate config file with "set by scalar"
A repo may have config options set by 'scalar clone' or 'scalar
register' and then updated by 'scalar reconfigure'. It can be helpful to
point out which of those options were set by the latest scalar
recommendations.

Add "# set by scalar" to the end of each config option to assist users
in identifying why these config options were set in their repo. Use a new
helper method to simplify the two callsites.

Co-authored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-13 08:43:27 +09:00
K Jayatheerth
bab391761d pull: move options[] array into function scope
Unless there are good reasons, it is customary to have the options[]
array used with the parse-options API declared in function scope rather
than at file scope.

Move builtin/pull.c:cmd_pull()’s options[] array into the function to
match that convention.

Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-12 22:08:02 +09:00
Junio C Hamano
4d75f2aea7 FLEX_ARRAY: require platforms to support the C99 syntax
Before C99 syntax to express that the final member in a struct is an
array of unknown number of elements, i.e.,

	struct {
		...
		T flexible_array[];
	};

came along, GNU introduced their own extension to declare such a
member with 0 size, i.e.,

		T flexible_array[0];

and the compilers that did not understand even that were given a way
to emulate it by wasting one element, i.e.,

		T flexible_array[1];

As we are using more and more C99 language features, let's see if
the platforms that still need to resort to the historical forms of
flexible array member support are still there, by forcing all the
flex array definitions to use the C99 syntax and see if anybody
screams (in which case reverting the changes is rather easy).
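
As a minimal sketch of what the C99 form looks like in use (the struct and helper names below are hypothetical, not taken from Git), the flexible array member is declared last and the extra storage is added to the allocation size:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example struct; the flexible array member must come last. */
struct name_entry {
	size_t len;
	char name[]; /* C99 flexible array member, contributes no size itself */
};

/* Allocate the header plus room for the string and its trailing NUL. */
static struct name_entry *name_entry_new(const char *name)
{
	size_t len = strlen(name);
	struct name_entry *e = malloc(sizeof(*e) + len + 1);

	if (!e)
		return NULL;
	e->len = len;
	memcpy(e->name, name, len + 1);
	return e;
}
```

With the historical `[1]` emulation the same allocation would waste one element, which is the cost the C99 syntax avoids.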

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-12 22:05:19 +09:00
René Scharfe
a4a77e41fa replay: move onto NULL check before first use
cmd_replay() aborts if the pointer "onto" is NULL after argument
parsing, e.g. when specifying a non-existing commit with --onto.
15cd4ef1f4 (replay: make atomic ref updates the default behavior,
2025-11-06) added code that dereferences this pointer before the check.
Switch their places to avoid a segmentation fault.

Reported-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-12 12:41:26 +09:00
Junio C Hamano
8cb4a11438 Merge branch 'sa/replay-atomic-ref-updates' into rs/replay-wrong-onto-fix
* sa/replay-atomic-ref-updates:
  replay: add replay.refAction config option
  replay: make atomic ref updates the default behavior
  replay: use die_for_incompatible_opt2() for option validation
2025-12-12 12:41:17 +09:00
Junio C Hamano
d4b732899e Makefile: help macOS novices by mentioning MacPorts
The DarwinPorts project has been known as MacPorts since Aug 2006.
To those who are not intimately familiar with the open source
ecosystem around macOS from olden days, the name DarwinPorts may not
ring a bell, even when they are using MacPorts.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-12 11:19:43 +09:00
Patrick Steinhardt
221a877d47 odb: write alternates via sources
Refactor writing of alternates so that the actual business logic is
structured around the object database source we want to write the
alternate to. Same as with the preceding commit, this will eventually
allow us to have different logic for writing alternates depending on the
backend used.

Note that after the refactoring we start to call
`odb_add_alternate_recursively()` unconditionally. This is fine though
as we know to skip adding sources that are tracked already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 18:39:37 +09:00
Patrick Steinhardt
f7dbd9fb2e odb: read alternates via sources
Adapt how we read alternates so that the interface is structured around
the object database source we're reading from. This will eventually
allow us to abstract away this behaviour with pluggable object databases
so that every format can have its own mechanism for listing alternates.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 18:39:37 +09:00
Patrick Steinhardt
3f42555322 odb: drop forward declaration of read_info_alternates()
Now that we have removed the mutual recursion in the preceding commit
it is not necessary anymore to have a forward declaration of the
`read_info_alternates()` function. Move the function and its
dependencies further up so that we can remove it.

Note that this commit also removes the function documentation of
`read_info_alternates()`. It's unclear what it's documenting, but it for
sure isn't documenting the modern behaviour of the function anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 18:39:37 +09:00
Patrick Steinhardt
430e0e0f2e odb: remove mutual recursion when parsing alternates
When adding an alternative object database source we not only have to
consider the added source itself, but we also have to add _its_ sources
to our database. We implement this via mutual recursion:

  1. We first call `link_alt_odb_entries()`.

  2. `link_alt_odb_entries()` calls `parse_alternates()`.

  3. We then add each alternate via `odb_add_alternate_recursively()`.

  4. `odb_add_alternate_recursively()` calls `link_alt_odb_entries()`
     again.

This flow is somewhat hard to follow, but more importantly it means that
parsing of alternates is somewhat tied to the recursive behaviour.

Refactor the function to remove the mutual recursion between adding
sources and parsing alternates. The parsing step thus becomes completely
oblivious to the fact that there is recursive behaviour going on at all.
The recursion is handled by `odb_add_alternate_recursively()` instead,
which now recurses with itself.
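
The shape after the refactoring can be sketched roughly as below. This is a toy model under stated assumptions: `struct fake_dir`, `struct fake_odb` and both functions are hypothetical in-memory stand-ins, not Git's actual code, and the "alternates file" is just a newline-separated string.

```c
#include <string.h>

#define MAX_SOURCES 16

/* Hypothetical stand-in for an object directory and its alternates list. */
struct fake_dir {
	const char *path;
	const char *alternates; /* newline-separated paths */
};

struct fake_odb {
	const struct fake_dir *dirs; /* all directories that "exist" */
	size_t nr_dirs;
	const char *sources[MAX_SOURCES]; /* sources linked so far */
	size_t nr_sources;
};

static const struct fake_dir *find_dir(const struct fake_odb *odb,
				       const char *path, size_t len)
{
	for (size_t i = 0; i < odb->nr_dirs; i++)
		if (strlen(odb->dirs[i].path) == len &&
		    !memcmp(odb->dirs[i].path, path, len))
			return &odb->dirs[i];
	return NULL;
}

/* Parsing only: split the list into entries; no knowledge of recursion. */
static size_t parse_alternates(const char *list, const char **out,
			       size_t *out_len, size_t max)
{
	size_t nr = 0;

	while (*list && nr < max) {
		size_t len = strcspn(list, "\n");
		if (len) {
			out[nr] = list;
			out_len[nr] = len;
			nr++;
		}
		list += len;
		if (*list == '\n')
			list++;
	}
	return nr;
}

/* Adding a source recurses with itself for the alternates of that source. */
static void add_alternate_recursively(struct fake_odb *odb,
				      const char *path, size_t len)
{
	const struct fake_dir *dir = find_dir(odb, path, len);
	const char *entries[MAX_SOURCES];
	size_t lens[MAX_SOURCES], nr;

	if (!dir)
		return; /* not usable */
	for (size_t i = 0; i < odb->nr_sources; i++)
		if (odb->sources[i] == dir->path)
			return; /* already tracked, also stops cycles */
	if (odb->nr_sources == MAX_SOURCES)
		return;
	odb->sources[odb->nr_sources++] = dir->path;

	nr = parse_alternates(dir->alternates, entries, lens, MAX_SOURCES);
	for (size_t i = 0; i < nr; i++)
		add_alternate_recursively(odb, entries[i], lens[i]);
}
```

In this model `parse_alternates()` could be moved elsewhere without touching the recursion, which is the property the refactoring is after.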

This refactoring allows us to move parsing of alternates into object
database sources in a subsequent step.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 18:39:36 +09:00
Patrick Steinhardt
dccfb39cdb odb: stop splitting alternate in odb_add_to_alternates_file()
When calling `odb_add_to_alternates_file()` we know to add the newly
added source to the object database in case we have already loaded
alternates. This is done so that we can make its objects accessible
immediately without having to fully reload all alternates.

The way we do this though is to call `link_alt_odb_entries()`, which
adds _multiple_ sources to the object database source in case we have
newline-separated entries. This behaviour is not documented in the
function documentation of `odb_add_to_alternates_file()`, and all
callers only ever pass a single directory to it. It's thus entirely
surprising and a conceptual mismatch.

Fix this issue by directly calling `odb_add_alternate_recursively()`
instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 18:39:36 +09:00
Patrick Steinhardt
d17673ef42 odb: move computation of normalized objdir into alt_odb_usable()
The function `alt_odb_usable()` receives as input the object database,
the path it's supposed to determine usability for as well as the
normalized path of the main object directory of the repository. The last
part is derived by the function's caller from the object database. As we
already pass the object database to `alt_odb_usable()` it is redundant
information.

Drop the extra parameter and compute the normalized object directory in
the function itself.

While at it, rename the function to `odb_is_source_usable()` to align it
with modern terminology.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 18:39:35 +09:00
Patrick Steinhardt
84cec5276e odb: resolve relative alternative paths when parsing
Parsing alternates and resolving potential relative paths is currently
handled in two separate steps. This has the effect that the logic to
retrieve alternates is not entirely self-contained. We want it to be
just that though so that we can eventually move the logic to list
alternates into the `struct odb_source`.

Move the logic to resolve relative alternative paths into
`parse_alternates()`. Besides bringing us a step closer towards the
above goal, it also neatly separates concerns of generating the list of
alternatives and linking them into the object database.

Note that we ignore any errors when the relative path cannot be
resolved. This isn't really a change in behaviour though: if the path
cannot be resolved to a directory then `alt_odb_usable()` still knows to
bail out.

While at it, rename the function to `odb_add_alternate_recursively()` to
more clearly indicate what its intent is and to align it with modern
terminology.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 18:39:35 +09:00
Patrick Steinhardt
1660496fc4 odb: refactor parsing of alternates to be self-contained
Parsing of the alternates file and environment variable is currently
split up across multiple different functions and is entangled with
`link_alt_odb_entries()`, which is responsible for linking the parsed
object database sources. This results in two downsides:

  - We have mutual recursion between parsing alternates and linking them
    into the object database. This is because we also parse alternates
    that the newly added sources may have.

  - We mix up the actual logic to parse the data and to link them into
    place.

Refactor the logic so that parsing of the alternates file is entirely
self-contained. Note that this doesn't yet fix the above two issues, but
it is a necessary step to get there.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 18:39:34 +09:00
Toon Claes
467860bc0b contrib/coccinelle: pass include paths to spatch(1)
In the previous commit a new coccinelle rule is added. But neither
`make coccicheck` nor `meson compile coccicheck` detected a case in
builtin/last-modified.c.

This case involves the field `scratch` in `struct last_modified`. This
field is of type `struct bitmap` and that struct has a member
`eword_t *words`. Both are defined in `ewah/ewok.h`. Now, while
builtin/last-modified.c does include that header (with the subdir in the
#include directive), it seems coccinelle does not process it. So it's
unaware of the type of `words` in the bitmap, and it doesn't recognize
the rule from previous commit that uses:

    type T;
    T *ptr;

Fix coccicheck by passing all possible include paths inside the Git
project so spatch(1) can find the headers and can determine the types.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 14:44:43 +09:00
Toon Claes
a67b902c94 git-compat-util: introduce MEMZERO_ARRAY() macro
Introduce a new macro MEMZERO_ARRAY() that zeroes the memory allocated
by ALLOC_ARRAY() and friends. And add coccinelle rule to enforce the use
of this macro.
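
A minimal sketch of what such a macro can look like follows. The name `MEMZERO_ARRAY_SKETCH` and the helper are hypothetical; Git's real definition may differ, for example by guarding the size multiplication against overflow.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Sketch only, not Git's definition: zero `n` elements of the array that
 * `x` points to, deriving the element size from the pointer type so the
 * caller cannot pass a mismatched sizeof.
 */
#define MEMZERO_ARRAY_SKETCH(x, n) memset((x), 0, sizeof(*(x)) * (n))

/* Hypothetical caller: allocate an array of counters and clear it. */
static int *new_zeroed_counts(size_t n)
{
	int *counts = malloc(sizeof(*counts) * n);

	if (counts)
		MEMZERO_ARRAY_SKETCH(counts, n);
	return counts;
}
```

Deriving the size from `*(x)` is what lets a coccinelle rule rewrite open-coded `memset()` calls mechanically.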

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 14:44:43 +09:00
Junio C Hamano
af0ed97e10 Merge branch 'tc/last-modified-active-paths-optimization' into tc/memzero-array
* tc/last-modified-active-paths-optimization:
  last-modified: fix use of uninitialized memory
2025-12-11 14:44:28 +09:00
Patrick Steinhardt
6ce9d558ce midx-write: skip rewriting MIDX with --stdin-packs unless needed
In `write_midx_internal()` we know to skip rewriting the multi-pack
index in case the existing one already covers all packs. This logic does
not know to handle `git multi-pack-index write --stdin-packs` though, so
we end up always rewriting the MIDX in this case even if the MIDX would
not change.

With our default maintenance strategy this isn't really much of a
problem, as git-gc(1) does not use the "--stdin-packs" option. But that
is changing with geometric repacking, where "--stdin-packs" is used to
explicitly select the packfiles part of the geometric sequence.

This issue can be demonstrated trivially with a benchmark in the Git
repository: executing `git repack --geometric=2 --write-midx -d` in the
Git repository takes more than 3 seconds only to end up with the same
multi-pack index as we already had before.

The logic that decides if we need to rewrite the MIDX only checks
whether the number of packfiles covered will change. That check is of
course too lenient for "--stdin-packs", as it could happen that we want
to cover a different-but-same-size set of packfiles. But there is no
inherent reason why we cannot handle "--stdin-packs".

Improve the logic to not only check for the number of packs, but to also
verify that we are asked to generate a MIDX for the _same_ packs. This
allows us to also skip no-op rewrites for "--stdin-packs".
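
The stricter check can be sketched as below, assuming pack names are plain strings without duplicates. `same_pack_set()` is a hypothetical helper for illustration, not the function used in midx-write.c.

```c
#include <stdlib.h>
#include <string.h>

static int cmp_names(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Hypothetical helper: return 1 if both lists name the same packs,
 * regardless of order. Comparing the counts alone is not enough, as two
 * different sets can have the same size. Both arrays are sorted in place.
 */
static int same_pack_set(const char **have, size_t nr_have,
			 const char **want, size_t nr_want)
{
	if (nr_have != nr_want)
		return 0;
	qsort(have, nr_have, sizeof(*have), cmp_names);
	qsort(want, nr_want, sizeof(*want), cmp_names);
	for (size_t i = 0; i < nr_have; i++)
		if (strcmp(have[i], want[i]))
			return 0;
	return 1;
}
```

Only when this returns 1 would the rewrite be skipped as a no-op.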

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 12:09:59 +09:00
Patrick Steinhardt
b3bab9d272 midx-write: extract function to test whether MIDX needs updating
In `write_midx_internal()` we know to skip writing the new multi-pack
index in case it would be the same as the existing one. This logic does
not handle the `--stdin-packs` option yet though, so we end up always
rewriting the MIDX if that option is passed to us.

Extract the logic to decide whether or not to rewrite the MIDX into a
separate function. This will allow us to extend that feature in the next
commit to address the above issue.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 12:09:58 +09:00
Patrick Steinhardt
665d19ec7b midx: fix BUG() when getting preferred pack without a reverse index
The function `midx_preferred_pack()` returns the preferred pack for a
given multi-pack index. To compute the preferred pack we:

  1. Take the first position indexed by the MIDX in pseudo-pack order.

  2. Convert this pseudo-pack position into the MIDX position.

  3. We then look up the pack that corresponds to this MIDX position.

This reliably returns the preferred pack given that all of its contained
objects will be up front in pseudo-pack order.

The second step that turns the pseudo-pack order into MIDX order
requires the reverse index though, which may not exist for example when
the MIDX does not have a bitmap. And in that case one may easily hit a
bug:

    BUG: ../pack-revindex.c:491: pack_pos_to_midx: reverse index not yet loaded

In theory, `midx_preferred_pack()` already knows to handle the case
where no reverse index exists, as it calls `load_midx_revindex()` before
calling into `pack_pos_to_midx()`. But we only check for negative
return values there, even though the function returns a positive error
code in case the reverse index does not exist.

Fix the issue by testing for a non-zero return value instead, same as
all the other callers of this function already do. While at it, document
the return value of `load_midx_revindex()`.
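
The difference between the two checks can be illustrated with a toy model. The enum values and both functions below are hypothetical; the only thing taken from the description above is that the loader reports a missing reverse index with a positive value.

```c
/* Hypothetical return values of the loader in this sketch. */
enum revindex_status {
	REVINDEX_ERROR = -1,  /* loading failed */
	REVINDEX_OK = 0,      /* reverse index is available */
	REVINDEX_MISSING = 1  /* no reverse index exists */
};

/* Stand-in for the loader; simply reports the state it is given. */
static int load_revindex_sketch(enum revindex_status state)
{
	return state;
}

/*
 * Returns 0 and sets *pack_id on success, -1 if no preferred pack can be
 * computed. A "< 0" test here would let REVINDEX_MISSING slip through and
 * continue without a reverse index.
 */
static int preferred_pack_sketch(enum revindex_status state, int *pack_id)
{
	if (load_revindex_sketch(state)) /* non-zero: error or missing */
		return -1;
	*pack_id = 0; /* placeholder for the real lookup */
	return 0;
}
```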

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 12:09:58 +09:00
Karthik Nayak
b7b17ec8a6 fetch: fix failed batched updates skipping operations
Fix a regression introduced with batched updates in 0e358de64a (fetch:
use batched reference updates, 2025-05-19) when fetching references. In
the `do_fetch()` function, we jump to cleanup if committing the
transaction fails, regardless of whether using batched or atomic
updates. This skips three subsequent operations:

  - Update 'FETCH_HEAD' as part of `commit_fetch_head()`.

  - Add upstream tracking information via `set_upstream()`.

  - Setting remote 'HEAD' values when `do_set_head` is true.

For atomic updates, this is expected behavior. For batched updates,
we want to continue with these operations even if some refs fail to
update.

Skipping `commit_fetch_head()` isn't actually a regression because
'FETCH_HEAD' is already updated via `append_fetch_head()` when not
using '--atomic'. However, we add a test to validate this behavior.

Skipping the other two operations (upstream tracking and remote HEAD)
is a regression. Fix this by only jumping to cleanup when using
'--atomic', allowing batched updates to continue with post-fetch
operations. Add tests to prevent future regressions.
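
The resulting control flow can be sketched as follows. This is a simplified model, not the code in builtin/fetch.c: `do_fetch_sketch()` and its parameters are hypothetical, and the three post-fetch operations are collapsed into one flag.

```c
/*
 * Simplified model of the flow: `commit_failed` says whether committing
 * the transaction reported an error, `*post_ops_run` records whether the
 * post-fetch operations (FETCH_HEAD, upstream tracking, remote HEAD) ran.
 */
static int do_fetch_sketch(int atomic, int commit_failed, int *post_ops_run)
{
	int retcode = 0;

	*post_ops_run = 0;
	if (commit_failed) {
		retcode = 1;
		if (atomic)
			goto cleanup; /* all-or-nothing: skip the rest */
	}

	/* Batched updates continue here even if some refs failed. */
	*post_ops_run = 1;

cleanup:
	return retcode;
}
```

The error code is still reported in both modes; only the atomic case skips the remaining work.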

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-10 20:59:58 +09:00
Karthik Nayak
8ff2eef8ad fetch: fix non-conflicting tags not being committed
The commit 0e358de64a (fetch: use batched reference updates, 2025-05-19)
updated the 'git-fetch(1)' command to use batched updates. This batches
updates to gain performance improvements. When fetching references, each
update is added to the transaction. Finally, when committing, individual
updates are allowed to fail with reason, while the transaction itself
succeeds.

One scenario which was missed here was fetching tags. When fetching
conflicting tags, the `fetch_and_consume_refs()` function returns '1',
which skipped committing the transaction and directly jumped to the
cleanup section. This means that no updates were applied. This also
extends to backfilling tags which is done when fetching specific
refspecs which contain tags in their history.

Fix this by committing the transaction when we have an error code and
not using an atomic transaction. This ensures other references are
applied even when some updates fail.

The cleanup section is reached with `retcode` set in several scenarios:

   - `truncate_fetch_head()`, `open_fetch_head()` and `prune_refs()` set
     `retcode` before the transaction is created, so no commit is
     attempted.

   - `fetch_and_consume_refs()` and `backfill_tags()` are the primary
     cases this fix targets, both setting a positive `retcode` to
     trigger the committing of the transaction.

This simplifies error handling and ensures future modifications to
`do_fetch()` don't need special handling for batched updates.

Add tests to check for this regression. While here, add a missing
cleanup from previous test.

Reported-by: David Bohman <debohman@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-10 20:59:58 +09:00
Aaron Plattner
3f5d1749e7 packfile: skip hash checks in add_promisor_object()
When is_promisor_object() is called for the first time, it lazily
initializes a set of all promisor objects by iterating through all
objects in promisor packs. For each object, add_promisor_object() calls
parse_object(), which decompresses and hashes the entire object.

For repositories with large pack files, this can take an extremely long
time. For example, on a production repository with a 176 GB promisor
pack:

 $ time ~/git/git/git-rev-list --objects --all --exclude-promisor-objects --quiet
 ________________________________________________________
 Executed in   76.10 mins    fish           external
    usr time   72.10 mins    1.83 millis   72.10 mins
    sys time    3.56 mins    0.17 millis    3.56 mins

add_promisor_object() just wants to construct the set of all promisor
objects, so it doesn't really need to verify the hash of every object.
Set PARSE_OBJECT_SKIP_HASH_CHECK to skip the hash check. This has the
side effect of skipping decompression of blob objects completely, saving
a significant amount of time:

 $ time ~/git/git/git-rev-list --objects --all --exclude-promisor-objects --quiet
 ________________________________________________________
 Executed in  124.70 secs    fish           external
    usr time   46.94 secs    0.00 millis   46.94 secs
    sys time   43.11 secs    1.03 millis   43.11 secs

Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-09 18:12:25 +09:00
Aaron Plattner
3c7c41d6b7 object: apply skip_hash and discard_tree optimizations to unknown blobs too
parse_object_with_flags() has an optimization to skip parsing blobs if
PARSE_OBJECT_SKIP_HASH_CHECK is set and the object hasn't been seen
before or might be a blob but hasn't been parsed yet. The latter can
happen, for example, if add_tree_entries() walks a path that references
a blob object that hasn't been seen before: lookup_blob() marks the
referenced oid as being a blob, but does not provide any additional
information about it until it is parsed.

It's possible for an object to be created without even a type, such as
when prepare_revision_walk() uses mark_uninteresting() to mark all
promisor objects as uninteresting. These objects have obj->parsed ==
false and obj->type == OBJ_NONE.

The skip_hash optimization does not consider this kind of object, so
parse_object_with_flags() proceeds to fully parse the object to
determine its type.

Improve the optimization by applying it to OBJ_NONE objects as well as
OBJ_BLOB ones. Apply a similar fix for trees.

Fixes: 8db2dad7a0 ("parse_object(): check on-disk type of suspected blob")
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-09 18:12:24 +09:00
Junio C Hamano
e85ae279b0 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-09 07:54:56 +09:00
Junio C Hamano
bbefa15ff5 Merge branch 'en/replay-doc-revision-range'
The use of "revision" (a connected set of commits) has been
clarified in the "git replay" documentation.

* en/replay-doc-revision-range:
  Documentation/git-replay.adoc: fix errors around revision range
2025-12-09 07:54:56 +09:00
Junio C Hamano
7fc0b33b5d Merge branch 'yc/xdiff-patience-optim'
The way patience diff finds LCS has been optimized.

* yc/xdiff-patience-optim:
  xdiff: optimize patience diff's LCS search
2025-12-09 07:54:55 +09:00
Junio C Hamano
fe0e6ffa19 Merge branch 'bc/zsh-testsuite'
A few tests have been updated to work under the shell compatible
mode of zsh.

* bc/zsh-testsuite:
  t5564: fix test hang under zsh's sh mode
  t0614: use numerical comparison with test_line_count
2025-12-09 07:54:54 +09:00
Junio C Hamano
c64b234a0b Merge branch 'pw/replay-exclude-gpgsig-fix'
"git replay" forgot to omit the "gpgsig-sha256" extended header
from the resulting commit the same way it omits "gpgsig", which has
been corrected.

* pw/replay-exclude-gpgsig-fix:
  replay: do not copy "gpgsig-sha256" header
2025-12-09 07:54:54 +09:00
Matthew Hughes
d4bc39a4d9 config: document 'gui.GCWarning'
While investigating the config options set by 'scalar' I noticed this
one wasn't documented.

Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-09 07:38:56 +09:00
Kristoffer Haugsbakk
41d425008a doc: send-email: fix broken list continuation
The list continuation has to be “immediately adjacent to the block
being attached”.[1]

[1]: https://web.archive.org/web/20251208172615/https://docs.asciidoctor.org/asciidoc/latest/lists/continuation/

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-09 07:27:13 +09:00
Junio C Hamano
48176f953f connect: plug protocol capability leak
When pushing to a set of remotes using a nickname for the group, the
client initializes the connection to each remote, talks to the
remote, reads and parses the capabilities line, and holds the
capabilities in a file-scope static variable server_capabilities_v1.

There are a few other such file-scope static variables, and these
connections cannot be parallelized until they are refactored to a
structure that keeps track of active connections.

Which is *not* the theme of this patch ;-)

For a single connection, the server_capabilities_v1 variable is
initialized to NULL (at the program initialization), populated when
we talk to the other side, used to look up capabilities of the other
side possibly multiple times, and the memory is held by the variable
until program exit, without leaking.  When talking to multiple remotes,
however, the server capabilities from the second connection overwrite
the ones from the first connection without freeing them, which leaks.

    ==1080970==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 421 byte(s) in 2 object(s) allocated from:
	#0 0x5615305f849e in strdup (/home/gitster/g/git-jch/bin/bin/git+0x2b349e) (BuildId: 54d149994c9e85374831958f694bd0aa3b8b1e26)
	#1 0x561530e76cc4 in xstrdup /home/gitster/w/build/wrapper.c:43:14
	#2 0x5615309cd7fa in process_capabilities /home/gitster/w/build/connect.c:243:27
	#3 0x5615309cd502 in get_remote_heads /home/gitster/w/build/connect.c:366:4
	#4 0x561530e2cb0b in handshake /home/gitster/w/build/transport.c:372:3
	#5 0x561530e29ed7 in get_refs_via_connect /home/gitster/w/build/transport.c:398:9
	#6 0x561530e26464 in transport_push /home/gitster/w/build/transport.c:1421:16
	#7 0x561530800bec in push_with_options /home/gitster/w/build/builtin/push.c:387:8
	#8 0x5615307ffb99 in do_push /home/gitster/w/build/builtin/push.c:442:7
	#9 0x5615307fe926 in cmd_push /home/gitster/w/build/builtin/push.c:664:7
	#10 0x56153065673f in run_builtin /home/gitster/w/build/git.c:506:11
	#11 0x56153065342f in handle_builtin /home/gitster/w/build/git.c:779:9
	#12 0x561530655b89 in run_argv /home/gitster/w/build/git.c:862:4
	#13 0x561530652cba in cmd_main /home/gitster/w/build/git.c:984:19
	#14 0x5615308dda0a in main /home/gitster/w/build/common-main.c:9:11
	#15 0x7f051651bca7 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

    SUMMARY: AddressSanitizer: 421 byte(s) leaked in 2 allocation(s).

Free the capabilities data for the previous server before overwriting
it with the next server to plug this leak.
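
The free-before-overwrite pattern described above can be sketched as
follows. This is a minimal illustration, not the actual change to
connect.c: only the variable name server_capabilities_v1 comes from
this message, while remember_capabilities() and dup_string() are
hypothetical helpers invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the file-scope static named in the commit message. */
static char *server_capabilities_v1;

/* Hypothetical helper: portable equivalent of strdup(). */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Hypothetical helper: remember the capabilities of the remote we are
 * talking to now.  Freeing the previous value first means a second
 * connection no longer leaks the string kept for the first one.
 */
static void remember_capabilities(const char *caps)
{
	assert(caps != NULL);
	/* free(NULL) is a no-op, so the first connection needs no special case. */
	free(server_capabilities_v1);
	server_capabilities_v1 = dup_string(caps);
}
```

Calling remember_capabilities() once per remote keeps at most one
copy alive at any time; the last copy is still held until program
exit, as in the single-connection case.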

Without the free, the added test fails with SANITIZE=leak; I
somehow couldn't get it to fail reliably with SANITIZE=leak,address,
though.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-09 07:11:42 +09:00
Kristoffer Haugsbakk
8cbbdc92f7 doc: join default pre-commit paragraphs
Join two paragraphs that start with the standard “The default <hook>,
when enabled” into one and put it at the end of the “pre-commit”
section.

The trailing whitespace paragraph was added in the first commit for the
doc, in 6d35cc76 (Document hooks., 2005-09-02). Then 3e14dd2c (mention
use of "hooks.allownonascii" in "man githooks", 2019-02-20) updated the
“pre-commit” section to mention the non-ASCII check that was added in
d00e364d.[1] But this paragraph was added one-past the original
“default” paragraph, after the env. variable paragraph, and starts
exactly the same. That causes the flow of this section to feel
off (paragraphs in order):

1. Invoked by <cmd> and what parameters it takes
2. The default 'pre-commit' hook catches introduction of trailing
   whitespace
3. `GIT_EDITOR=:`
4. The default 'pre-commit' hook catches introduction of non-ASCII
   filenames

Let’s instead join these two paragraphs and explain the whole behavior of
the default script.

† 1: Extend sample pre-commit hook to check for non ascii filenames,
     2009-05-19

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-08 22:20:14 +09:00