Commit Graph

12989 Commits

Author SHA1 Message Date
Christian Couder
2f8fd208c3 gpg-interface: refactor 'enum sign_mode' parsing
The definition of 'enum sign_mode' as well as its parsing code are in
"builtin/fast-export.c". This was fine because `git fast-export` was the
only command with '--signed-tags=<mode>' or '--signed-commits=<mode>'
options.

In a following commit, we are going to add a similar option to `git
fast-import`, which will be simpler, easier and cleaner if we can reuse
the 'enum sign_mode' defintion and parsing code.

So let's move that definition and parsing code from
"builtin/fast-export.c" to "gpg-interface.{c,h}".

While at it, let's fix a small indentation issue with the arguments of
parse_opt_sign_mode().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17 11:18:28 -07:00
Junio C Hamano
95a8428323 Merge branch 'tc/last-modified'
A new command "git last-modified" has been added to show the closest
ancestor commit that touched each path.

* tc/last-modified:
  last-modified: use Bloom filters when available
  t/perf: add last-modified perf script
  last-modified: new subcommand to show when files were last modified
2025-09-08 14:54:35 -07:00
Junio C Hamano
576e0b6eb3 Merge branch 'ds/ls-files-lazy-unsparse'
"git ls-files <pathspec>..." should not necessarily have to expand
the index fully if a sparsified directory is excluded by the
pathspec; the code is taught to expand the index on demand to avoid
this.

* ds/ls-files-lazy-unsparse:
  ls-files: conditionally leave index sparse
2025-09-08 14:54:35 -07:00
Junio C Hamano
c64ec662d0 Merge branch 'jk/describe-blob'
"git describe <blob>" misbehaves and/or crashes in some corner
cases, which has been taught to exit with failure gracefully.

* jk/describe-blob:
  describe: pass commit to describe_commit()
  describe: handle blob traversal with no commits
  describe: catch unborn branch in describe_blob()
  describe: error if blob not found
  describe: pass oid struct by const pointer
2025-08-29 09:44:38 -07:00
Toon Claes
8d9a7cdfda last-modified: use Bloom filters when available
Our 'git last-modified' performs a revision walk, and computes a diff at
each point in the walk to figure out whether a given revision changed
any of the paths it considers interesting.

When changed-path Bloom filters are available, we can avoid computing
many such diffs. Before computing a diff, we first check if any of the
remaining paths of interest were possibly changed at a given commit by
consulting its Bloom filter. If any of them are, we are resigned to
compute the diff.

If none of those queries returned "maybe", we know that the given commit
doesn't contain any changed paths which are interesting to us. So, we
can avoid computing it in this case.

Comparing the perf test results on git.git:

    Test                                        HEAD~             HEAD
    ------------------------------------------------------------------------------------
    8020.1: top-level last-modified             4.49(4.34+0.11)   2.22(2.05+0.09) -50.6%
    8020.2: top-level recursive last-modified   5.64(5.45+0.11)   5.62(5.30+0.11) -0.4%
    8020.3: subdir last-modified                0.11(0.06+0.04)   0.07(0.03+0.04) -36.4%

Based-on-patch-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-28 16:44:58 -07:00
Toon Claes
32f74582bc last-modified: new subcommand to show when files were last modified
Similar to git-blame(1), introduce a new subcommand
git-last-modified(1). This command shows the most recent modification to
paths in a tree. It does so by expanding the tree at a given commit,
taking note of the current state of each path, and then walking
backwards through history looking for commits where each path changed
into its final commit ID.

Based-on-patch-by: Jeff King <peff@peff.net>
Improved-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-28 16:44:58 -07:00
Derrick Stolee
681f26bccc ls-files: conditionally leave index sparse
When running 'git ls-files' with a pathspec, the index entries get
filtered according to that pathspec before iterating over them in
show_files().  In 78087097b8 (ls-files: add --sparse option,
2021-12-22), this iteration was prefixed with a check for the '--sparse'
option which allows the command to output directory entries; this
created a pre-loop call to ensure_full_index().

However, when a user runs 'git ls-files' where the pathspec matches
directories that are recursively matched in the sparse-checkout, there
are not any sparse directories that match the pathspec so they would not
be written to the output. The expansion in this case is just a
performance drop for no behavior difference.

Replace this global check to expand the index with a check inside the
loop for a matched sparse directory. If we see one, then expand the
index and continue from the current location. This is safe since the
previous entries in the index did not have any sparse directories and
thus would remain stable in this expansion.

A test in t1092 confirms that this changes the behavior.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-28 08:02:37 -07:00
Junio C Hamano
ebb45da976 Merge branch 'lo/repo-info'
A new subcommand "git repo" gives users a way to grab various
repository characteristics.

* lo/repo-info:
  repo: add the --format flag
  repo: add the field layout.shallow
  repo: add the field layout.bare
  repo: add the field references.format
  repo: declare the repo command
2025-08-25 14:22:04 -07:00
Junio C Hamano
eed447dd95 Merge branch 'ps/commit-graph-wo-globals'
Remove dependency on the_repository and other globals from the
commit-graph code, and other changes unrelated to de-globaling.

* ps/commit-graph-wo-globals:
  commit-graph: stop passing in redundant repository
  commit-graph: stop using `the_repository`
  commit-graph: stop using `the_hash_algo`
  commit-graph: refactor `parse_commit_graph()` to take a repository
  commit-graph: store the hash algorithm instead of its length
  commit-graph: stop using `the_hash_algo` via macros
2025-08-25 14:22:03 -07:00
Junio C Hamano
a3c6459ab6 Merge branch 'dk/help-all'
"git cmd --help-all" now works outside repositories.

* dk/help-all:
  builtin: also setup gently for --help-all
  parse-options: refactor flags for usage_with_options_internal
2025-08-25 14:22:00 -07:00
Junio C Hamano
9d6e319ec5 Merge branch 'ac/deglobal-fmt-merge-log-config'
Code clean-up.

* ac/deglobal-fmt-merge-log-config:
  builtin/fmt-merge-msg: stop depending on 'the_repository'
  environment: remove the global variable 'merge_log_config'
2025-08-22 13:13:21 -07:00
Junio C Hamano
72e4eb56f0 Merge branch 'jc/diff-no-index-in-subdir'
"git diff --no-index" run inside a subdirectory under control of a
Git repository operated at the top of the working tree and stripped
the prefix from the output, and oddballs like "-" (stdin) did not
work correctly because of it.  Correct the set-up by undoing what
the set-up sequence did to cwd and prefix.

* jc/diff-no-index-in-subdir:
  diff: --no-index should ignore the worktree
2025-08-22 13:13:21 -07:00
Junio C Hamano
c72d5bbf49 Merge branch 'ms/refs-list'
The "list" subcommand of "git refs" acts as a front-end for
"git for-each-ref".

* ms/refs-list:
  t: add test for git refs list subcommand
  t6300: refactor tests to be shareable
  builtin/refs: add list subcommand
  builtin/for-each-ref: factor out core logic into a helper
  builtin/for-each-ref: align usage string with the man page
  doc: factor out common option
2025-08-22 13:13:20 -07:00
Junio C Hamano
54fef16542 Merge branch 'jc/strbuf-split'
Arrays of strbuf is often a wrong data structure to use, and
strbuf_split*() family of functions that create them often have
better alternatives.

Update several code paths and replace strbuf_split*().

* jc/strbuf-split:
  trace2: do not use strbuf_split*()
  trace2: trim_trailing_newline followed by trim is a no-op
  sub-process: do not use strbuf_split*()
  environment: do not use strbuf_split*()
  config: do not use strbuf_split()
  notes: do not use strbuf_split*()
  merge-tree: do not use strbuf_split*()
  clean: do not use strbuf_split*() [part 2]
  clean: do not pass the whole structure when it is not necessary
  clean: do not use strbuf_split*() [part 1]
  clean: do not pass strbuf by value
  wt-status: avoid strbuf_split*()
2025-08-21 13:47:00 -07:00
Junio C Hamano
971ba42dd4 Merge branch 'jc/string-list-split'
string_list_split*() family of functions have been extended to
simplify common use cases.

* jc/string-list-split:
  string-list: split-then-remove-empty can be done while splitting
  string-list: optionally omit empty string pieces in string_list_split*()
  diff: simplify parsing of diff.colormovedws
  string-list: optionally trim string pieces split by string_list_split*()
  string-list: unify string_list_split* functions
  string-list: align string_list_split() with its _in_place() counterpart
  string-list: report programming error with BUG
2025-08-21 13:46:59 -07:00
Junio C Hamano
5a404a70c7 Merge branch 'rs/describe-with-prio-queue'
"git describe" has been optimized by using better data structure.

* rs/describe-with-prio-queue:
  describe: use prio_queue_replace()
  describe: use prio_queue
2025-08-21 13:46:59 -07:00
Junio C Hamano
9a85fa8406 Merge branch 'ps/remote-rename-fix'
"git remote rename origin upstream" failed to move origin/HEAD to
upstream/HEAD when origin/HEAD is unborn and performed other
renames extremely inefficiently, which has been corrected.

* ps/remote-rename-fix:
  builtin/remote: only iterate through refs that are to be renamed
  builtin/remote: rework how remote refs get renamed
  builtin/remote: determine whether refs need renaming early on
  builtin/remote: fix sign comparison warnings
  refs: simplify logic when migrating reflog entries
  refs: pass refname when invoking reflog entry callback
2025-08-21 13:46:58 -07:00
Junio C Hamano
c3c8b6910a Merge branch 'ps/reflog-migrate-fixes'
"git refs migrate" to migrate the reflog entries from a refs
backend to another had a handful of bugs squashed.

* ps/reflog-migrate-fixes:
  refs: fix invalid old object IDs when migrating reflogs
  refs: stop unsetting REF_HAVE_OLD for log-only updates
  refs/files: detect race when generating reflog entry for HEAD
  refs: fix identity for migrated reflogs
  ident: fix type of string length parameter
  builtin/reflog: implement subcommand to write new entries
  refs: export `ref_transaction_update_reflog()`
  builtin/reflog: improve grouping of subcommands
  Documentation/git-reflog: convert to use synopsis type
2025-08-21 13:46:57 -07:00
Jeff King
7c10e48e81 describe: pass commit to describe_commit()
There's a call in describe_commit() to lookup_commit_reference(), but we
don't check the return value. If it returns NULL, we'll segfault as we
immediately dereference the result.

In practice this can never happen, since all callers pass an oid which
came from a "struct commit" already. So we can make this more obvious
by just taking that commit struct in the first place.

Reported-by: Cheng <prophecheng@stu.pku.edu.cn>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-20 09:08:57 -07:00
Jeff King
8cfd4ac215 describe: handle blob traversal with no commits
When describing a blob, we traverse from HEAD, remembering each commit
we saw, and then checking each blob to report the containing commit.
But if we haven't seen any commits at all, we'll segfault (we store the
"current" commit as an oid initialized to the null oid, causing
lookup_commit_reference() to return NULL).

This shouldn't be able to happen normally. We always start our traversal
at HEAD, which must be a commit (a property which is enforced by the
refs code). But you can trigger the segfault like this:

  blob=$(echo foo | git hash-object -w --stdin)
  echo $blob >.git/HEAD
  git describe $blob

We can instead catch this case and return an empty result, which hits
the usual "we didn't find $blob while traversing HEAD" error.

This is a minor lie in that we did "find" the blob. And this even hints
at a bigger problem in this code: what if the traversal pointed to the
blob as _not_ part of a commit at all, but we had previously filled in
the recorded "current commit"? One could imagine this happening due to a
tag pointing directly to the blob in question.

But that can't happen, because we only traverse from HEAD, never from
any other refs. And the intent of the blob-describing code is to find
blobs within commits.

So I think this matches the original intent as closely as we can (and
again, this segfault cannot be triggered without corrupting your
repository!).

The test here does not use the formula above, which works only for the
files backend (and not reftables). Instead we use another loophole to
create the bogus state using only Git commands. See the comment in the
test for details.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-20 09:06:02 -07:00
Jeff King
c6478715a5 describe: catch unborn branch in describe_blob()
When describing a blob, we search for it by traversing from HEAD. We do
this by feeding the name HEAD to setup_revisions(). But if we are on an
unborn branch, this will fail with a confusing message:

  $ git describe $blob
  fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

It is OK for this to be an error (we cannot find $blob in an empty
traversal, so we'd eventually complain about that). But the error
message could be more helpful.

Let's resolve HEAD ourselves and pass the resolved object id to
setup_revisions(). If resolving fails, then we can print a more useful
message.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18 14:23:43 -07:00
Jeff King
db2664b6f7 describe: error if blob not found
If describe_blob() does not find the blob in question, it returns an
empty strbuf, and we print an empty line. This differs from
describe_commit(), which always either returns an answer or calls die()
itself. As the blob function was bolted onto the command afterwards, I
think its behavior is not intentional, and it is just a bug that it does
not report an error.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18 14:23:43 -07:00
Jeff King
e715f77682 describe: pass oid struct by const pointer
We pass a "struct object_id" to describe_blob() by value. This isn't
wrong, as an oid is composed only of copy-able values. But it's unusual;
typically we pass structs by const pointer, including object_ids. Let's
do so.

It similarly makes sense for us to hold that pointer in the callback
data (rather than yet another copy of the oid).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18 14:23:42 -07:00
Lucas Seiki Oshiro
a81224d128 repo: add the --format flag
Add the --format flag to git-repo-info. By using this flag, the users
can choose the format for obtaining the data they requested.

Given that this command can be used for generating input for other
applications and for being read by end users, it requires at least two
formats: one for being read by humans and other for being read by
machines. Some other Git commands also have two output formats, notably
git-config which was the inspiration for the two formats that were
chosen here:

- keyvalue, where the retrieved data is printed one per line, using =
  for delimiting the key and the value. This is the default format,
  targeted for end users.
- nul, where the retrieved data is separated by NUL characters, using
  the newline character for delimiting the key and the value. This
  format is targeted for being read by machines.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:41 -07:00
Lucas Seiki Oshiro
e52cd654c9 repo: add the field layout.shallow
This commit is part of the series that introduces the new subcommand
git-repo-info.

The flag `--is-shallow-repository` from git-rev-parse is used for
retrieving whether the repository is shallow. This way, it is used for
querying repository metadata, fitting in the purpose of git-repo-info.

Then, add a new field `layout.shallow` to the git-repo-info subcommand
containing that information.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:40 -07:00
Lucas Seiki Oshiro
acf2669b54 repo: add the field layout.bare
This commit is part of the series that introduces the new subcommand
git-repo-info.

The flag --is-bare-repository from git-rev-parse is used for retrieving
whether the current repository is bare. This way, it is used for
querying repository metadata, fitting in the purpose of git-repo-info.

Then, add a new field layout.bare to the git-repo-info subcommand
containing that information.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:40 -07:00
Lucas Seiki Oshiro
9adb8a7fd1 repo: add the field references.format
This commit is part of the series that introduces the new subcommand
git-repo-info.

The flag `--show-ref-format` from git-rev-parse is used for retrieving
the reference format (i.e. `files` or `reftable`). This way, it is
used for querying repository metadata, fitting in the purpose of
git-repo-info.

Add a new field `references.format` to the repo-info subcommand
containing that information.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:40 -07:00
Lucas Seiki Oshiro
ab94bb8000 repo: declare the repo command
Currently, `git rev-parse` covers a wide range of functionality not
directly related to parsing revisions, as its name suggests. Over time,
many features like parsing datestrings, options, paths, and others
were added to it because there wasn't a more appropriate command
to place them.

Create a new Git command called `repo`. `git repo` will be the main
command for obtaining the information about a repository (such as
metadata and metrics).

Also declare a subcommand for `repo` called `info`. `git repo info`
will bring the functionality of retrieving repository-related
information currently returned by `rev-parse`.

Add the required documentation and build changes to enable usage of
this subcommand.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:39 -07:00
Patrick Steinhardt
7be9e410b2 commit-graph: stop passing in redundant repository
Many of the commit-graph related functions take in both a repository and
the object database source (directly or via `struct commit_graph`) for
which we are supposed to load such a commit-graph. In the best case this
information is simply redundant as the source already contains a
reference to its owning object database, which in turn has a reference
to its repository. In the worst case this information could even
mismatch when passing in a source that doesn't belong to the same
repository.

Refactor the code so that we only pass in the object database source in
those cases.

There is one exception though, namely `load_commit_graph_chain_fd_st()`,
which is responsible for loading a commit-graph chain. It is expected
that parts of the commit-graph chain aren't located in the same object
source as the chain file itself, but in a different one. Consequently,
this function doesn't work on the source level but on the database level
instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-15 09:34:48 -07:00
Patrick Steinhardt
ddacfc7466 commit-graph: stop using the_repository
There's still a bunch of uses of `the_repository` in "commit-graph.c",
which we want to stop using due to it being a global variable. Refactor
the code to stop using `the_repository` in favor of the repository
provided via the calling context.

This allows us to drop the `USE_THE_REPOSITORY_VARIABLE` macro.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-15 09:34:48 -07:00
Patrick Steinhardt
89cc9b9adf commit-graph: stop using the_hash_algo
Stop using `the_hash_algo` as it implicitly relies on `the_repository`.
Instead, we either use the hash algo provided via the context or, if
there is no such hash algo, we use `the_repository` explicitly. Such
uses will be removed in subsequent commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-15 09:34:47 -07:00
Junio C Hamano
22dd6abc32 Merge branch 'rs/merge-compact-summary'
Hotfix.

* rs/merge-compact-summary:
  merge: don't document non-existing --compact-summary argument
2025-08-11 21:30:16 -07:00
Junio C Hamano
10fa89aadc Merge branch 'rs/for-each-ref-start-after-marker-fix'
Hotfix.

* rs/for-each-ref-start-after-marker-fix:
  for-each-ref: call --start-after argument "marker"
2025-08-11 21:30:15 -07:00
Ayush Chandekar
22d421fed9 builtin/fmt-merge-msg: stop depending on 'the_repository'
Refactor builtin/fmt-merge-msg.c to remove the dependancy on the global
'the_repository'. Remove the 'UNUSED' macro from the 'struct repository'
parameter and replace 'git_config()' with 'repo_config()' so that
configuration is read from the passed repository. Also, add a test to
make sure that "git fmt-merge-msg -h" can be called outside a
repository.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11 09:19:40 -07:00
Ayush Chandekar
9a49aef8dc environment: remove the global variable 'merge_log_config'
The global variable 'merge_log_config', set via the "merge.log" or
"merge.summary" settings, is only used in 'cmd_fmt_merge_msg()' and
'cmd_merge()' to adjust the 'shortlog_len' variable.

Remove 'merge_log_config' globally and localize it in
'cmd_fmt_merge_msg()' and 'cmd_merge()'. Set its value by passing it in
'fmt_merge_msg_config()' by passing its pointer to the function via the
callback parameter.

This change is part of an ongoing effort to eliminate global variables,
improve modularity and help libify the codebase.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11 09:16:55 -07:00
Junio C Hamano
e1d3d61a45 diff: --no-index should ignore the worktree
The act of giving "--no-index" tells Git to pretend that the current
directory is not under control of any Git index or repository, so
even when you happen to be in a Git controlled working tree, where
in that working tree should not matter.

But the start-up sequence tries to discover the top of the working
tree and chdir(2)'s there, even before Git passes control to the
subcommand being run.  When diff_no_index() starts running, it
starts at a wrong (from the end-user's point of view who thinks
"git diff --no-index" is merely a better version of GNU diff)
directory, and the original directory the user started the command
is at "prefix".

Because the paths given from argv[] have already been adjusted to
account for this path shuffling by prepending the prefix, and
showing the resulting path by stripping the prefix, the effect of
these nonsense operations (nonsense in the context of "--no-index",
that is) is usually not observable.

Except for special cases like "-", where it is not preprocessed by
prepending the prefix.

Instead of papering over by adding more special cases only to cater
to the no-index codepath in the generic code, drive the diff
machinery more faithfully to what is going on.  If the user started
"git diff --no-index" in directory X/Y/Z in a working tree
controlled by Git, and the start up sequence of Git chdir(2)'ed up
to directory X and left Y/Z in the prefix, revert the effect of the
start up sequence by chdir'ing back to Y/Z and emptying the prefix.

Reported-by: Gregoire Geis <opensource@gregoirege.is>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-09 17:22:01 -07:00
René Scharfe
ad459fd44c merge: don't document non-existing --compact-summary argument
3a54f5bd5d (merge/pull: add the "--compact-summary" option, 2025-06-12)
added the option --compact-summary to both merge and pull.  It takes no
no argument, but for merge it got an argument help string.  Remove it,
since it is unnecessary.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-09 17:11:19 -07:00
René Scharfe
51d9ed581f for-each-ref: call --start-after argument "marker"
dabecb9db2 (for-each-ref: introduce a '--start-after' option,
2025-07-15) added the option --start-after and referred to its argument
as "marker" in documentation and usage string, but not in the option's
short help.  Use "marker" there as well for consistency and brevity.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-09 17:10:39 -07:00
D. Ben Knoble
129b3632f3 builtin: also setup gently for --help-all
Git experts often check the help summary of a command to make sure they
spell options right when suggesting advice to colleagues. Further, they
might check hidden options when responding to queries about deprecated
options like git-rebase(1)'s "preserve merges" option. But some commands
don't support "--help-all" outside of a git directory. Running (for
example)

    git rebase --help-all

outside a directory fails in "setup_git_directory", erroring with the
localized form of

    fatal: not a git repository (or any of the parent directories): .git

Like 99caeed05d (Let 'git <command> -h' show usage without a git dir,
2009-11-09), we want to show the "--help-all" output even without a git
dir. Make "--help-all" where we expect "-h" to mean
"setup_git_directory_gently", and interpose early in the natural place
("show_usage_with_options_if_asked").

Do the same for usage callers with show_usage_if_asked.

The exception is merge-recursive, whose help block doesn't use newer
APIs.

Best-viewed-with: --ignore-space-change
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-08 11:13:12 -07:00
Junio C Hamano
aa4fb2485c Merge branch 'dl/squelch-maybe-uninitialized'
Squelch false-positive compiler warning.

* dl/squelch-maybe-uninitialized:
  t/unit-tests/clar: fix -Wmaybe-uninitialized with -Og
  remote: bail early from set_head() if missing remote name
2025-08-07 08:14:38 -07:00
Junio C Hamano
0349fa013e Merge branch 'jk/revert-squelch-compiler-warning'
Squelch false-positive compiler warning.

* jk/revert-squelch-compiler-warning:
  revert: initialize const value
2025-08-07 08:14:37 -07:00
Patrick Steinhardt
16c4fa26b9 builtin/remote: only iterate through refs that are to be renamed
When renaming a remote we also need to rename all references
accordingly. But while we only need to rename references that are
contained in the "refs/remotes/$OLDNAME/" namespace, we end up using
`refs_for_each_rawref()` that iterates through _all_ references. We know
to exit early in the callback in case we see an irrelevant reference,
but ultimately this is still a waste of compute as we knowingly iterate
through references that we won't ever care about.

Improve this by using `refs_for_each_rawref_in()`, which knows to only
iterate through (potentially broken) references in a given prefix.

The following benchmark renames a remote with a single reference in a
repository that has 100k unrelated references. This shows a sizeable
improvement with the "files" backend:

    Benchmark 1: rename remote (refformat = files, revision = HEAD~)
      Time (mean ± σ):      42.6 ms ±   0.9 ms    [User: 29.1 ms, System: 8.4 ms]
      Range (min … max):    40.1 ms …  43.3 ms    10 runs

    Benchmark 2: rename remote (refformat = files, revision = HEAD)
      Time (mean ± σ):      31.7 ms ±   4.0 ms    [User: 19.6 ms, System: 6.9 ms]
      Range (min … max):    27.1 ms …  36.0 ms    10 runs

    Summary
      rename remote (refformat = files, revision = HEAD) ran
        1.35 ± 0.17 times faster than rename remote (refformat = files, revision = HEAD~)

The "reftable" backend shows roughly the same absolute improvement, but
given that it's already significantly faster than the "files" backend
this translates to a much larger relative improvement:

    Benchmark 1: rename remote (refformat = reftable, revision = HEAD~)
      Time (mean ± σ):      18.2 ms ±   0.5 ms    [User: 12.7 ms, System: 3.0 ms]
      Range (min … max):    17.3 ms …  21.4 ms    110 runs

    Benchmark 2: rename remote (refformat = reftable, revision = HEAD)
      Time (mean ± σ):       8.8 ms ±   0.5 ms    [User: 3.8 ms, System: 2.9 ms]
      Range (min … max):     7.5 ms …   9.9 ms    167 runs

    Summary
      rename remote (refformat = reftable, revision = HEAD) ran
        2.07 ± 0.12 times faster than rename remote (refformat = reftable, revision = HEAD~)

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 14:19:30 -07:00
Patrick Steinhardt
68d090a682 builtin/remote: rework how remote refs get renamed
It was recently reported [1] that renaming a remote that has dangling
symrefs is broken. This issue can be trivially reproduced:

    $ git init repo
    Initialized empty Git repository in /tmp/repo/.git/
    $ cd repo/
    $ git remote add origin /dev/null
    $ git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/master
    $ git remote rename origin renamed
    $ git symbolic-ref refs/remotes/origin/HEAD
    refs/remotes/origin/master
    $ git symbolic-ref refs/remotes/renamed/HEAD
    fatal: ref refs/remotes/renamed/HEAD is not a symbolic ref

As one can see, the "HEAD" reference did not get renamed but stays in
the same place. There are two issues here:

  - We use `refs_resolve_ref_unsafe()` to resolve references, but we
    don't pass the `RESOLVE_REF_NO_RECURSE` flag. Consequently, if the
    reference does not resolve, the function will fail and we thus
    ignore this branch.

  - We use `refs_for_each_ref()` to iterate through the old remote's
    references, but that function ignores broken references.

Both of these issues are easy to fix. But having a closer look at the
logic that renames remote references surfaces that it leaves a lot to be
desired overall.

The problem is that we're using O(|refs| + |symrefs| * 2) many reference
transactions to perform the renames. We first delete all symrefs, then
individually rename every direct reference and finally we recreate the
symrefs. On the one hand this isn't even remotely an atomic operation,
so if we hit any error we'll already have deleted all references.

But more importantly it is also extremely inefficient. The number of
transactions for symrefs doesn't really bother us too much, as there
should generally only be a single symref anyway ("HEAD"). But the
renames are very expensive:

  - For the "reftable" backend we perform auto-compaction after every
    single rename, which does add up.

  - For the "files" backend we potentially have to rewrite the
    "packed-refs" file on every single rename in case they are packed.
    The consequence here is quadratic runtime performance. Renaming a
    100k references takes hours to complete.

Refactor the code to use a single transaction to perform all the
reference updates atomically, which speeds up the transaction quite
significantly:

    Benchmark 1: rename remote (refformat = files, revision = HEAD~)
      Time (mean ± σ):     238.770 s ± 13.857 s    [User: 91.473 s, System: 143.793 s]
      Range (min … max):   204.863 s … 247.699 s    10 runs

    Benchmark 2: rename remote (refformat = files, revision = HEAD)
      Time (mean ± σ):      2.103 s ±  0.036 s    [User: 0.360 s, System: 1.313 s]
      Range (min … max):    2.011 s …  2.141 s    10 runs

    Summary
      rename remote (refformat = files, revision = HEAD) ran
      113.53 ± 6.87 times faster than rename remote (refformat = files, revision = HEAD~)

For the "reftable" backend we see a significant speedup, as well, but
given that we don't have quadratic runtime behaviour there it's way less
extreme:

    Benchmark 1: rename remote (refformat = reftable, revision = HEAD~)
      Time (mean ± σ):      8.604 s ±  0.539 s    [User: 4.985 s, System: 2.368 s]
      Range (min … max):    7.880 s …  9.556 s    10 runs

    Benchmark 2: rename remote (refformat = reftable, revision = HEAD)
      Time (mean ± σ):      1.177 s ±  0.103 s    [User: 0.446 s, System: 0.270 s]
      Range (min … max):    1.023 s …  1.410 s    10 runs

    Summary
      rename remote (refformat = reftable, revision = HEAD) ran
        7.31 ± 0.79 times faster than rename remote (refformat = reftable, revision = HEAD~)

There is one issue though with using atomic transactions: when nesting a
remote into itself it can happen that renamed references conflict with
the old referencse. For example, when we have a reference
"refs/remotes/origin/foo" and we rename "origin" to "origin/foo", then
we'll end up with an F/D conflict when we try to create the renamed
reference "refs/remotes/origin/foo/foo".

This situation is overall quite unlikely to happen: people tend to not
use nested remotes, and if they do they must at the same time also have
a conflicting refname. But the end result would be that the old remote
references stay intact whereas all the other parts of the repository
have been adjusted for the new remote name.

Address this by queueing and preparing the reference update before we
touch any other part of the repository. Like this we can make sure that
the reference update will go through before rewriting the configuration.
Otherwise, if the transaction fails to prepare we can gracefully abort
the whole operation without any changes having been performed in the
repository yet. Furthermore, we can detect the conflict and print some
helpful advice for how the user can resolve this situation. So overall,
the tradeoff is that:

  - Reference transactions are now all-or-nothing. This is a significant
    improvement over the previous state where we may have ended up with
    partially-renamed references.

  - Rewriting references is now significantly faster.

  - We only rewrite the configuration in case we know that all
    references can be updated.

  - But we may refuse to rename a remote in case references conflict.

Overall this seems like an acceptable tradeoff.

While at it, fix the handling of symbolic/broken references by using
`refs_for_each_rawref()`. Add tests that cover both this reported issue
and tests that exercise nesting of remotes.

One thing to note: with this change we cannot provide a proper progress
monitor anymore as we queue the references into the transactions as we
iterate through them. Consequently, as we don't know yet how many refs
there are in total, we cannot report how many percent of the operation
is done anymore. But that's a small price to pay considering that you
now shouldn't need the progress monitor in most situations at all
anymore.

[1]: <CANrWfmQWa=RJnm7d3C7ogRX6Tth2eeuGwvwrNmzS2gr+eP0OpA@mail.gmail.com>

Reported-by: Han Jiang <jhcarl0814@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 14:19:30 -07:00
Patrick Steinhardt
08e6a7add4 builtin/remote: determine whether refs need renaming early on
When renaming a remote we may have to also rename remote refs in case
the refspec changes. Pull out this computation into a separate loop.
While that seems nonsensical right now, it'll help us in a subsequent
commit where we will prepare the reference transaction before we rewrite
the configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 14:19:30 -07:00
Patrick Steinhardt
376d7f1a11 builtin/remote: fix sign comparison warnings
Fix -Wsign-comparison warnings. All of the warnings we have are about
mismatches in signedness for loop counters. These are trivially fixable
by using the correct integer type.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 14:19:30 -07:00
Patrick Steinhardt
b9fd73a234 refs: pass refname when invoking reflog entry callback
With `refs_for_each_reflog_ent()` callers can iterate through all the
reflog entries for a given reference. The callback that is being invoked
for each such entry does not receive the name of the reference that we
are currently iterating through. This isn't really a limiting factor, as
callers can simply pass the name via the callback data.

But this layout sometimes does make for a bit of an awkward calling
pattern. One example: when iterating through all reflogs, and for each
reflog we iterate through all refnames, we have to do some extra book
keeping to track which reference name we are currently yielding reflog
entries for.

Change the signature of the callback function so that the reference name
of the reflog gets passed through to it. Adapt callers accordingly and
start using the new parameter in trivial cases. The next commit will
refactor the reference migration logic to make use of this parameter so
that we can simplify its logic a bit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 14:19:30 -07:00
Junio C Hamano
cf03815537 Merge branch 'ps/reflog-migrate-fixes' into ps/remote-rename-fix
* ps/reflog-migrate-fixes:
  refs: fix invalid old object IDs when migrating reflogs
  refs: stop unsetting REF_HAVE_OLD for log-only updates
  refs/files: detect race when generating reflog entry for HEAD
  refs: fix identity for migrated reflogs
  ident: fix type of string length parameter
  builtin/reflog: implement subcommand to write new entries
  refs: export `ref_transaction_update_reflog()`
  builtin/reflog: improve grouping of subcommands
  Documentation/git-reflog: convert to use synopsis type
2025-08-06 14:18:43 -07:00
Patrick Steinhardt
7aa619c36f builtin/reflog: implement subcommand to write new entries
While we provide a couple of subcommands in git-reflog(1) to remove
reflog entries, we don't provide any to write new entries. Obviously
this is not an operation that really would be needed for many use cases
out there, or otherwise people would have complained that such a command
does not exist yet. But the introduction of the "reftable" backend
changes the picture a bit, as it is now basically impossible to manually
append a reflog entry if one wanted to do so due to the binary format.

Plug this gap by introducing a simple "write" subcommand. For now, all
this command does is to append a single new reflog entry with the given
object IDs and message to the reflog. More specifically, it is not yet
possible to:

  - Write multiple reflog entries at once.

  - Insert reflog entries at arbitrary indices.

  - Specify the date of the reflog entry.

  - Insert reflog entries that refer to nonexistent objects.

If required, those features can be added at a future point in time. For
now though, the new command aims to fulfill the most basic use cases
while being as strict as possible when it comes to verifying parameters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 07:36:30 -07:00
Patrick Steinhardt
649c7bb77a builtin/reflog: improve grouping of subcommands
The way subcommands of git-reflog(1) are laid out does not make any
immediate sense. Reorder them such that read-only subcommands precede
writing commands for a bit more structure.

Furthermore, move the "expire" subcommand last. This prepares for a
subsequent change where we are about to introduce a new "write" command
to append reflog entries. Like this, the writing subcommands are ordered
such that those affecting a single reflog come before those spanning
across all reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06 07:36:29 -07:00
Junio C Hamano
8982c5e909 Merge branch 'kj/renamed-submodule'
The case where a new submodule takes a path where used to be a
completely different subproject is now dealt a bit better than
before.

* kj/renamed-submodule:
  fixup! submodule: skip redundant active entries when pattern covers path
  fixup! submodule: prevent overwriting .gitmodules on path reuse
  submodule: skip redundant active entries when pattern covers path
  submodule: prevent overwriting .gitmodules on path reuse
2025-08-05 11:53:56 -07:00