global/git

mirror of https://github.com/git/git.git synced 2026-03-18 12:40:05 +01:00

Author	SHA1	Message	Date
Junio C Hamano	d5518d52b2	Merge branch 'tc/last-modified-recursive-fix' "git last-modified" operating in non-recursive mode used to trigger a BUG(), which has been corrected. * tc/last-modified-recursive-fix: last-modified: fix bug when some paths remain unhandled	2025-09-29 11:40:35 -07:00
Junio C Hamano	96ed0a8906	Merge branch 'kn/refs-files-case-insensitive' Deal more gracefully with directory / file conflicts when the files backend is used for ref storage, by failing only the ones that are involved in the conflict while allowing others. * kn/refs-files-case-insensitive: refs/files: handle D/F conflicts during locking refs/files: handle F/D conflicts in case-insensitive FS refs/files: use correct error type when lock exists refs/files: catch conflicts on case-insensitive file-systems	2025-09-29 11:40:35 -07:00
Junio C Hamano	a89fa2fff2	Merge branch 'jk/color-variable-fixes' Some places in the code confused a variable that is not a boolean to enable color but is an enum that records what the user requested to do about color. A couple of bugs of this sort have been fixed, while the code has been cleaned up to prevent similar bugs in the future. * jk/color-variable-fixes: config: store want_color() result in a separate bool add-interactive: retain colorbool values longer color: return bool from want_color() color: use git_colorbool enum type to store colorbools pretty: use format_commit_context.auto_color as colorbool diff: stop passing ecbdata->use_color as boolean diff: pass o->use_color directly to fill_metainfo() diff: don't use diff_options.use_color as a strict bool diff: simplify color_moved check when flushing grep: don't treat grep_opt.color as a strict bool color: return enum from git_config_colorbool() color: use GIT_COLOR_* instead of numeric constants	2025-09-29 11:40:35 -07:00
Junio C Hamano	a5d4779e6e	Merge branch 'dk/stash-apply-index' The stash.index configuration variable can be set to make "git stash pop/apply" pretend that it was invoked with "--index". * dk/stash-apply-index: stash: honor stash.index in apply, pop modes stash: refactor private config globals t3905: remove unneeded blank line t3903: reduce dependencies on previous tests	2025-09-29 11:40:35 -07:00
Junio C Hamano	4bac57bc67	Merge branch 'jk/setup-revisions-freefix' There are double frees and leaks around setup_revisions() API used in "git stash show", which has been fixed, and setup_revisions() API gained a wrapper to make it more ergonomic when using it with strvec-manged argc/argv pairs. * jk/setup-revisions-freefix: revision: retain argv NULL invariant in setup_revisions() treewide: pass strvecs around for setup_revisions_from_strvec() treewide: use setup_revisions_from_strvec() when we have a strvec revision: add wrapper to setup_revisions() from a strvec revision: manage memory ownership of argv in setup_revisions() stash: tell setup_revisions() to free our allocated strings	2025-09-29 11:40:34 -07:00
Junio C Hamano	7c15d990cc	Merge branch 'rs/get-oid-with-flags-cleanup' Code clean-up. * rs/get-oid-with-flags-cleanup: use repo_get_oid_with_flags()	2025-09-23 11:53:40 -07:00
Junio C Hamano	2e8d7569ea	Merge branch 'jk/add-i-color' Some among "git add -p" and friends ignored color.diff and/or color.ui configuration variables, which is an old regression, which has been corrected. * jk/add-i-color: contrib/diff-highlight: mention interactive.diffFilter add-interactive: manually fall back color config to color.ui add-interactive: respect color.diff for diff coloring stash: pass --no-color to diff plumbing child processes	2025-09-23 11:53:40 -07:00
Jeff King	18068139f2	treewide: pass strvecs around for setup_revisions_from_strvec() The previous commit converted callers of setup_revisions() with a strvec to use the safer and easier _from_strvec() variant. Let's now convert spots that don't directly have a strvec, but receive an argc/argv pair that eventually comes from one. We'll instead pass the strvec down to the point where we call setup_revisions(). That makes these functions slightly less flexible if they were to grow other callers that don't use strvecs, but this rigidity is buying us some safety. It is only safe to pass the free_removed_argv_elements option to setup_revisions() if we know the elements of argv/argc are allocated on the heap. That isn't communicated in the type system when we are passed the bare elements. But if we get a strvec, we know that the elements are allocated strings. And at any rate, each of these modified functions has only a single caller (that has a strvec), so the loss of flexibility is unlikely to ever matter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-22 14:27:03 -07:00
Jeff King	b553332f82	treewide: use setup_revisions_from_strvec() when we have a strvec The previous commit introduced a wrapper to make using setup_revisions() with a strvec easier and safer. It converted spots that were already doing most of what the wrapper did. Let's now convert spots where we were not setting up the free_removed_argv_elements flag. As discussed in the previous commit, this probably isn't fixing any bugs or leaks (since these sites wouldn't trigger the re-shuffling of argv that causes them). This is mostly future-proofing us against setup_revisions() becoming more aggressive about its re-shuffling. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-22 14:27:03 -07:00
Jeff King	f93c1d86cc	revision: add wrapper to setup_revisions() from a strvec The setup_revisions() function was designed to take the argc/argv pair from the operating system. But we sometimes construct our own argv using a strvec and pass that in. There are a few gotchas that callers need to deal with here: 1. You should always pass the free_removed_argv_elements option via setup_revision_opt. Otherwise, entries may be leaked if setup_revisions() re-shuffles options. 2. After setup_revisions() returns, the strvec state is odd. We get a reduced argc from setup_revisions() telling us how many unknown options were left in place. Entries after that in argv may be retained, or may be NULL (depending on how the reshuffling happened). But the strvec's "nr" field still represents the original value, and some of the entries it thinks it is still storing may be NULL. Callers must be careful with how they access it. Some callers deal with (1), but not all. In practice they are OK because they do not pass any options that would cause setup_revisions() to re-shuffle (namely unknown options which may be relayed from the user, and the use of the "--" separator). But it's probably a good idea to consistently pass this option anyway to future-proof ourselves against the details of setup_revisions() changing. No callers address (2), though I don't think there any visible bugs. Most of them simply call strvec_clear() and never otherwise look at the result. And in fact, if they naively set foo.nr to the argc returned by setup_revisions(), that would cause leaks! Because setup_revisions() does not free consumed options[1], we have to leave the "nr" field of the strvec at its original value to find and free them during strvec_clear(). So I don't think there are any bugs to fix here, but we can make things safer and simpler for callers. Let's introduce a helper function that sets the free_removed_argv_elements automatically and shrinks the strvec to represent the retained options afterwards (taking care to free the now-obsolete entries). We'll start by converting all of the call-sites which use the free_removed_argv_elements option. There should be no behavior change for them, except that their "shrunken" entries are cleaned up immediately, rather than waiting for a strvec_clear() call. [1] Arguably setup_revisions() should be doing this step for us if we told it to free removed options, but there are many existing callers which will be broken if it did. Introducing this helper is a possible first step towards that. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-22 14:27:03 -07:00
Jeff King	3ea35c64b0	stash: tell setup_revisions() to free our allocated strings In "git stash show", we do a first pass of parsing our command line options by splitting them into revision args and stash args. These are stored in strvecs, and we pass the revision args to setup_revisions(). But setup_revisions() may modify the argv we pass it, causing us to leak some of the entries. In particular, if it sees a "--" string, that will be dropped from argv. This is the same as other cases addressed by `f92dbdbc6a` (revisions API: don't leak memory on argv elements that need free()-ing, 2022-08-02), and we should fix it the same way: by passing the free_removed_argv_elements option to setup_revisions(). The added test here is run only with SANITIZE=leak, without checking its output, because the behavior of stash with "--" is a little odd: 1. Running "git stash show" will show --stat output. But running "git stash show --" will show --patch. 2. I'd expect a non-option after "--" to be treated as a pathspec, so: git stash show -p 1 -- foo would look treat "1" as a stash (a synonym for stash@{1}) and restrict the resulting diff to "foo". But it doesn't. We split the revision/stash args without any regard to "--". So in the example above both "1" and "foo" are stashes. Which is an error, but also: git stash show -- foo treats "foo" as a stash, not a pathspec. These are both oddities that we may want to address (or may not, if we want to retain historical quirks). But they are well outside the scope of this patch. So for now we'll just let the tests confirm we aren't leaking without otherwise expecting any behavior. If we later address either of those points and end up with another test that covers "stash show --", we can drop this leak-only test. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-22 14:24:52 -07:00
D. Ben Knoble	9842c0c749	stash: honor stash.index in apply, pop modes With stash.index=true, git-stash(1) command now tries to reinstate the index by default in the "apply" and "pop" modes. Not doing so creates a common trap [1], [2]: "git stash apply" is not the reverse of "git stash push" because carefully staged indices are lost and have to be manually recreated. OTOH, this mode is not always desirable and may create more conflicts when applying stashes. As usual, "--no-index" will disable this behavior if you set "stash.index". [1]: https://lore.kernel.org/git/CAPx1GvcxyDDQmCssMjEnt6JoV6qPc5ZUpgPLX3mpUC_4PNYA1w@mail.gmail.com/ [2]: https://lore.kernel.org/git/c5a811ac-8cd3-c389-ac6d-29020a648c87@gmail.com/ Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-21 20:23:23 -07:00
D. Ben Knoble	88b5b8d886	stash: refactor private config globals A subsequent commit will access a new config variable in the stash subcommand implementations, which requires the variables to be declared before the relevant functions. Prep with a pure refactoring change to consolidate config-related globals with the rest of the globals. Best-viewed-with: --color-moved Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-21 20:23:23 -07:00
Junio C Hamano	7b776bc308	Merge branch 'pc/range-diff-memory-limit' "git range-diff" learned a way to limit the memory consumed by O(NN) cost matrix. pc/range-diff-memory-limit: range-diff: add configurable memory limit for cost matrix	2025-09-18 10:07:02 -07:00
Junio C Hamano	1fbfabfa71	Merge branch 'pw/3.0-commentchar-auto-deprecation' "core.commentChar=auto" that attempts to dynamically pick a suitable comment character is non-workable, as it is too much trouble to support for little benefit, and is marked as deprecated. * pw/3.0-commentchar-auto-deprecation: commit: print advice when core.commentString=auto config: warn on core.commentString=auto breaking-changes: deprecate support for core.commentString=auto	2025-09-18 10:07:00 -07:00
Toon Claes	e6c06e87a2	last-modified: fix bug when some paths remain unhandled The recently introduced new subcommand git-last-modified(1) runs into an error in some scenarios. It then would exit with the message: BUG: paths remaining beyond boundary in last-modified This seems to happens for example when criss-cross merges are involved. In that scenario, the function diff_tree_combined() gets called. The function diff_tree_combined() copies the `struct diff_options` from the input `struct rev_info` to override some flags. One flag is `recursive`, which is always set to 1. This has been the case since the inception of this function in `af3feefa1d` (diff-tree -c: show a merge commit a bit more sensibly., 2006-01-24). This behavior is incompatible with git-last-modified(1), when called non-recursive (which is the default). The last-modified machinery uses a hashmap for all the paths it wants to get the last-modified commit for. Through log_tree_commit() the callback mark_path() is called. The diff machinery uses diff_tree_combined() internally, and due to it's recursive behavior the callback receives entries inside subtrees, but not the subtree entries themselves. So a directory is never expelled from the hashmap, and the BUG() statement gets hit. Because there are many callers calling into diff_tree_combined(), both directly and indirectly, we cannot simply change it's behavior. Instead, add a flag `no_recursive_diff_tree_combined` which supresses the behavior of diff_tree_combined() to override `recursive` and set this flag in builtin/last-modified.c. Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-18 08:00:41 -07:00
Karthik Nayak	3c07063231	refs/files: catch conflicts on case-insensitive file-systems During the 'prepare' phase of a reference transaction in the files backend, we create the lock files for references to be created. When using batched updates on case-insensitive filesystems, the entire batched updates would be aborted if there are conflicting names such as: refs/heads/Foo refs/heads/foo This affects all commands which were migrated to use batched updates in Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before that, reference updates would be applied serially with one transaction used per update. When users fetched multiple references on case-insensitive systems, subsequent references would simply overwrite any earlier references. So when fetching: refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6 refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 The user would simply end up with: refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 This is buggy behavior since the user is never informed about the overrides performed and missing references. Nevertheless, the user is left with a working repository with a subset of the references. Since Git 2.51, in such situations fetches would simply fail without updating any references. Which is also buggy behavior and worse off since the user is left without any references. The error is triggered in `lock_raw_ref()` where the files backend attempts to create a lock file. When a lock file already exists the function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens, the entire batched updates, not individual operation, is aborted as if it were in a transaction. Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to aid the batched update mechanism to simply reject such errors. The change only affects batched updates since batched updates will reject individual updates with non-generic errors. So specifically this would only affect: 1. git fetch 2. git receive-pack 3. git update-ref --batch-updates This bubbles the error type up to `files_transaction_prepare()` which tries to lock each reference update. So if the locking fails, we check if the rejection type can be ignored, which is done by calling `ref_transaction_maybe_set_rejected()`. As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT', the specific reference update would simply be rejected, while other updates in the transaction would continue to be applied. This allows partial application of references in case-insensitive filesystems when fetching colliding references. While the earlier implementation allowed the last reference to be applied overriding the initial references, this change would allow the first reference to be applied while rejecting consequent collisions. This should be an okay compromise since with the files backend, there is no scenario possible where we would retain all colliding references. Let's also be more proactive and notify users on case-insensitive filesystems about such problems by providing a brief about the issue while also recommending using the reftable backend, which doesn't have the same issue. Reported-by: Joe Drew <joe.drew@indexexchange.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-17 09:19:07 -07:00
Jeff King	69a7e8d32f	config: store want_color() result in a separate bool The "git config --get-colorbool foo.bar" command not only digs in the config to find the value of foo.bar, it evaluates the result using want_color() to check the tty-ness of stdout. But it stores the bool result of want_color() in the same git_colorbool that we found in the config. This works in practice because the git_colorbool enum is a superset of the bool values. But it is an oddity from a type system perspective. Let's instead store the result in a separate bool and use that. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-16 18:00:26 -07:00
Jeff King	e9330ae4b8	color: use git_colorbool enum type to store colorbools We traditionally used "int" to store and pass around the values defined by "enum git_colorbool" (which were originally just #define macros). Using an int doesn't produce incorrect results, but using the actual enum makes the intent of the code more clear. It would be nice if the compiler could catch cases where we used the enum and an int interchangeably, since it's very easy to accidentally check the boolean true/false of a colorbool like: if (branch_use_color) This is wrong because GIT_COLOR_UNKNOWN and GIT_COLOR_AUTO evaluate to true in C, even though we may ultimately decide not to use color. But C is pretty happy to convert between ints and enums (even with various -Wenum-* warnings). So this sadly doesn't protect us from such mistakes, but it hopefully does make the code easier to read. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-16 17:59:53 -07:00
Jeff King	3c3e9b8303	color: use GIT_COLOR_* instead of numeric constants Long ago Git's decision to show color for a subsytem was stored in a tri-state variable: it could be true (1), false (0), or unknown (-1). But since `daa0c3d971` (color: delay auto-color decision until point of use, 2011-08-17) we want to carry around a new state, "auto", which bases the decision on the tty-ness of stdout (rather than collapsing that "auto" state to a true/false immediately). That commit introduced a set of GIT_COLOR_* defines to represent each state: UNKNOWN, ALWAYS, NEVER, and AUTO. But it only used the AUTO value, and left alone code using bare 0/1/-1 values. And of course since then we've grown many new spots that use those bare values. Let's switch all of these to use the named constants. That should make the code a bit easier to read, as it is more obvious that we're representing a color decision. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-16 13:37:03 -07:00
Junio C Hamano	13d1e86888	Merge branch 'lo/repo-info-step-2' "repo info" learns a short-hand option "-z" that is the same as "--format=nul", and learns to report the objects format used in the repository. * lo/repo-info-step-2: repo: add the field objects.format repo: add the flag -z as an alias for --format=nul	2025-09-15 08:52:05 -07:00
Junio C Hamano	7d00521d7b	Merge branch 'jt/de-global-bulk-checkin' The bulk-checkin code used to depend on a file-scope static singleton variable, which has been updated to pass an instance throughout the callchain. * jt/de-global-bulk-checkin: bulk-checkin: use repository variable from transaction bulk-checkin: require transaction for index_blob_bulk_checkin() bulk-checkin: remove global transaction state bulk-checkin: introduce object database transaction structure	2025-09-15 08:52:05 -07:00
Junio C Hamano	da3799a67b	Merge branch 'rs/describe-with-lazy-queue-and-oidset' Instead of scanning for the remaining items to see if there are still commits to be explored in the queue, use khash to remember which items are still on the queue (an unacceptable alternative is to reserve one object flag bits). * rs/describe-with-lazy-queue-and-oidset: describe: use oidset in finish_depth_computation()	2025-09-12 10:41:21 -07:00
Junio C Hamano	07f29476de	Merge branch 'ms/refs-exists' "git refs exists" that works like "git show-ref --exists" has been added. * ms/refs-exists: t: add test for git refs exists subcommand t1422: refactor tests to be shareable t1403: split 'show-ref --exists' tests into a separate file builtin/refs: add 'exists' subcommand	2025-09-12 10:41:19 -07:00
Junio C Hamano	c31a276f12	Merge branch 'ps/object-store-midx-dedup-info' Further code clean-up for multi-pack-index code paths. * ps/object-store-midx-dedup-info: midx: compute paths via their source midx: stop duplicating info redundant with its owning source midx: write multi-pack indices via their source midx: load multi-pack indices via their source midx: drop redundant `struct repository` parameter odb: simplify calling `link_alt_odb_entry()` odb: return newly created in-memory sources odb: consistently use "dir" to refer to alternate's directory odb: allow `odb_find_source()` to fail odb: store locality in object database sources	2025-09-12 10:41:18 -07:00
René Scharfe	a66fc22bf9	use repo_get_oid_with_flags() get_oid_with_context() allows specifying flags and reports object details via a passed-in struct object_context. Some callers just want to specify flags, but don't need any details back. Convert them to repo_get_oid_with_flags(), which provides just that and frees them from dealing with the context structure. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-10 14:29:49 -07:00
Junio C Hamano	95a8428323	Merge branch 'tc/last-modified' A new command "git last-modified" has been added to show the closest ancestor commit that touched each path. * tc/last-modified: last-modified: use Bloom filters when available t/perf: add last-modified perf script last-modified: new subcommand to show when files were last modified	2025-09-08 14:54:35 -07:00
Junio C Hamano	576e0b6eb3	Merge branch 'ds/ls-files-lazy-unsparse' "git ls-files <pathspec>..." should not necessarily have to expand the index fully if a sparsified directory is excluded by the pathspec; the code is taught to expand the index on demand to avoid this. * ds/ls-files-lazy-unsparse: ls-files: conditionally leave index sparse	2025-09-08 14:54:35 -07:00
Jeff King	89b4183efe	stash: pass --no-color to diff plumbing child processes After a partial stash, we may clear out the working tree by capturing the output of diff-tree and piping it into git-apply (and likewise we may use diff-index to restore the index). So we most definitely do not want color diff output from that diff-tree process. And it normally would not produce any, since its stdout is not going to a tty, and the default value of color.ui is "auto". However, if GIT_PAGER_IN_USE is set in the environment, that overrides the tty check, and we'll produce a colorized diff that chokes git-apply: $ echo y \| GIT_PAGER_IN_USE=1 git stash -p [...] Saved working directory and index state WIP on main: 4f2e2bb foo error: No valid patches in input (allow with "--allow-empty") Cannot remove worktree changes Setting this variable is a relatively silly thing to do, and not something most users would run into. But we sometimes do it in our tests to stimulate color. And it is a user-visible bug, so let's fix it rather than work around it in the tests. The root issue here is that diff-tree (and other diff plumbing) should probably not ever produce color by default. It does so not by parsing color.ui, but because of the baked-in "auto" default from `4c7f1819b3` (make color.ui default to 'auto', 2013-06-10). But changing that is risky; we've had discussions back and forth on the topic over the years. E.g.: https://lore.kernel.org/git/86D0A377-8AFD-460D-A90E-6327C6934DFC@gmail.com/. So let's accept that as the status quo for now and protect ourselves by passing --no-color to the child processes. This is the same thing we did for add-interactive itself in `1c6ffb546b` (add--interactive.perl: specify --no-color explicitly, 2020-09-07). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-08 14:00:32 -07:00
Lucas Seiki Oshiro	c2e3713334	repo: add the field objects.format The flag `--show-object-format` from git-rev-parse is used for retrieving the object storage format. This way, it is used for querying repository metadata, fitting in the purpose of git-repo-info. Add a new field `objects.format` to the git-repo-info subcommand containing that information. Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-04 11:36:40 -07:00
Lucas Seiki Oshiro	a92f5ca0d5	repo: add the flag -z as an alias for --format=nul Other Git commands that have nul-terminated output (e.g. git-config, git-status, git-ls-files) have a flag `-z` for using the null character as the record separator. Add the `-z` flag to git-repo-info as an alias for `--format=nul`, making it consistent with the behavior of the other commands. Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-04 11:36:39 -07:00
René Scharfe	30598ccc4d	describe: use oidset in finish_depth_computation() Depth computation can end early if all remaining commits are flagged. The current code determines if that's the case by checking all queue items each time it dequeues a flagged commit. This can cause quadratic complexity. We could simply count the flagged items in the queue and then update that number as we add and remove items. That would provide a general speedup, but leave one case where we have to scan the whole queue: When we flag a previously seen, but unflagged commit. It could be on the queue and then we'd have to decrease our count. We could dedicate an object flag to track queue membership, but that would leave less for candidate tags, affecting the results. So use a hash table, specifically an oidset of commit hashes, to track that. This avoids quadratic behaviour in all cases and provides a nice performance boost over the previous commit, `08bb69d70f` (describe: use prio_queue_replace(), 2025-08-03): Benchmark 1: ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0) Time (mean ± σ): 855.3 ms ± 1.3 ms [User: 790.8 ms, System: 49.9 ms] Range (min … max): 853.7 ms … 857.8 ms 10 runs Benchmark 2: ./git describe $(git rev-list v2.41.0..v2.47.0) Time (mean ± σ): 610.8 ms ± 1.7 ms [User: 546.9 ms, System: 49.3 ms] Range (min … max): 608.9 ms … 613.3 ms 10 runs Summary ./git describe $(git rev-list v2.41.0..v2.47.0) ran 1.40 ± 0.00 times faster than ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0) Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-02 15:15:13 -07:00
Meet Soni	0f0a8a11c0	builtin/refs: add 'exists' subcommand As part of the ongoing effort to consolidate reference handling, introduce a new `exists` subcommand. This command provides the same functionality and exit-code behavior as `git show-ref --exists`, serving as its modern replacement. The logic for `show-ref --exists` is minimal. Rather than creating a shared helper function which would be overkill for ~20 lines of code, its implementation is intentionally duplicated here. This contrasts with `git refs list`, where sharing the larger implementation of `for-each-ref` was necessary. Documentation for the new subcommand is also added to the `git-refs(1)` man page. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-02 09:58:35 -07:00
Paulo Casaretto	00727249ec	range-diff: add configurable memory limit for cost matrix When comparing large commit ranges (e.g., 250,000+ commits), range-diff attempts to allocate an n×n cost matrix that can exhaust available memory. For example, with 256,784 commits (n = 513,568), the matrix would require approximately 256GB of memory (513,568² × 4 bytes), causing either immediate segmentation faults due to integer overflow or system hangs. Add a memory limit check in get_correspondences() before allocating the cost matrix. This check uses the total size in bytes (n² × sizeof(int)) and compares it against a configurable maximum, preventing both excessive memory usage and integer overflow issues. The limit is configurable via a new --max-memory option that accepts human-readable sizes (e.g., "1G", "500M"). The default is 4GB for 64 bit systems and 2GB for 32 bit systems. This allows comparing ranges of approximately 32,000 (16,000) commits - generous for real-world use cases while preventing impractical operations. When the limit is exceeded, range-diff now displays a clear error message showing both the requested memory size and the maximum allowed, formatted in human-readable units for better user experience. Example usage: git range-diff --max-memory=1G branch1...branch2 git range-diff --max-memory=500M base..topic1 base..topic2 This approach was chosen over alternatives: - Pre-counting commits: Would require spawning additional git processes and reading all commits twice - Limiting by commit count: Less precise than actual memory usage - Streaming approach: Would require significant refactoring of the current algorithm This issue was previously discussed in: https://lore.kernel.org/git/RFC-cover-v2-0.5-00000000000-20211210T122901Z-avarab@gmail.com/ Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Paulo Casaretto <pcasaretto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-29 09:46:07 -07:00
Junio C Hamano	c64ec662d0	Merge branch 'jk/describe-blob' "git describe <blob>" misbehaves and/or crashes in some corner cases, which has been taught to exit with failure gracefully. * jk/describe-blob: describe: pass commit to describe_commit() describe: handle blob traversal with no commits describe: catch unborn branch in describe_blob() describe: error if blob not found describe: pass oid struct by const pointer	2025-08-29 09:44:38 -07:00
Toon Claes	8d9a7cdfda	last-modified: use Bloom filters when available Our 'git last-modified' performs a revision walk, and computes a diff at each point in the walk to figure out whether a given revision changed any of the paths it considers interesting. When changed-path Bloom filters are available, we can avoid computing many such diffs. Before computing a diff, we first check if any of the remaining paths of interest were possibly changed at a given commit by consulting its Bloom filter. If any of them are, we are resigned to compute the diff. If none of those queries returned "maybe", we know that the given commit doesn't contain any changed paths which are interesting to us. So, we can avoid computing it in this case. Comparing the perf test results on git.git: Test HEAD~ HEAD ------------------------------------------------------------------------------------ 8020.1: top-level last-modified 4.49(4.34+0.11) 2.22(2.05+0.09) -50.6% 8020.2: top-level recursive last-modified 5.64(5.45+0.11) 5.62(5.30+0.11) -0.4% 8020.3: subdir last-modified 0.11(0.06+0.04) 0.07(0.03+0.04) -36.4% Based-on-patch-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-28 16:44:58 -07:00
Toon Claes	32f74582bc	last-modified: new subcommand to show when files were last modified Similar to git-blame(1), introduce a new subcommand git-last-modified(1). This command shows the most recent modification to paths in a tree. It does so by expanding the tree at a given commit, taking note of the current state of each path, and then walking backwards through history looking for commits where each path changed into its final commit ID. Based-on-patch-by: Jeff King <peff@peff.net> Improved-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-28 16:44:58 -07:00
Derrick Stolee	681f26bccc	ls-files: conditionally leave index sparse When running 'git ls-files' with a pathspec, the index entries get filtered according to that pathspec before iterating over them in show_files(). In `78087097b8` (ls-files: add --sparse option, 2021-12-22), this iteration was prefixed with a check for the '--sparse' option which allows the command to output directory entries; this created a pre-loop call to ensure_full_index(). However, when a user runs 'git ls-files' where the pathspec matches directories that are recursively matched in the sparse-checkout, there are not any sparse directories that match the pathspec so they would not be written to the output. The expansion in this case is just a performance drop for no behavior difference. Replace this global check to expand the index with a check inside the loop for a matched sparse directory. If we see one, then expand the index and continue from the current location. This is safe since the previous entries in the index did not have any sparse directories and thus would remain stable in this expansion. A test in t1092 confirms that this changes the behavior. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-28 08:02:37 -07:00
Phillip Wood	a0e6aaea7d	config: warn on core.commentString=auto As support for this setting was deprecated in the last commit print a warning (or die when WITH_BREAKING_CHANGES is enabled) if it is set. Avoid bombarding the user with warnings by only printing it (a) when running commands that call "git commit" and (b) only once per command. Some scaffolding is added to repo_read_config() to allow it to detect deprecated config settings and warn about them. As both "core.commentChar" and "core.commentString" set the comment character we record which one of them is used and tailor the warning message appropriately. Note the odd combination of die_message() followed by die(NULL) is to allow the next commit to insert a call to advise() in the middle. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-26 08:52:44 -07:00
Phillip Wood	fdae4114a6	breaking-changes: deprecate support for core.commentString=auto When "core.commentString" is set to "auto" then "git commit" will automatically select the comment character ensuring that it is not the first character on any of the lines in the commit message. This was introduced by commit `84c9dc2c5a` (commit: allow core.commentChar=auto for character auto selection, 2014-05-17). The motivation seems to be to avoid commenting out lines from the existing message when amending a commit that was created with a message from a file. Unfortunately this feature does not work with: * commit message templates that contain comments. * prepare-commit-msg hooks that introduce comments. * "git commit --cleanup=strip --edit -F <file>" which means that it is incompatible with - the "fixup" and "squash" commands of "git rebase -i" as the comments added by those commands are then treated as part of the commit message. - the conflict comments added to the commit message by "git cherry-pick", "git rebase" etc. as these comments are then treated as part of the commit message. It is also ignored by "git notes" when amending a note. The issues with comments coming from a template, hook or file are a consequence of the design of this feature and are therefore hard to fix. As the costs of this feature outweigh the benefits, deprecate it and remove it in Git 3.0. If someone comes up with some patches that fix all the issues in a maintainable way then I'd be happy to see this change reverted. The next commits will add a warning and some advice for users on how they can update their config settings. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-26 08:47:37 -07:00
Junio C Hamano	ebb45da976	Merge branch 'lo/repo-info' A new subcommand "git repo" gives users a way to grab various repository characteristics. * lo/repo-info: repo: add the --format flag repo: add the field layout.shallow repo: add the field layout.bare repo: add the field references.format repo: declare the repo command	2025-08-25 14:22:04 -07:00
Junio C Hamano	eed447dd95	Merge branch 'ps/commit-graph-wo-globals' Remove dependency on the_repository and other globals from the commit-graph code, and other changes unrelated to de-globaling. * ps/commit-graph-wo-globals: commit-graph: stop passing in redundant repository commit-graph: stop using `the_repository` commit-graph: stop using `the_hash_algo` commit-graph: refactor `parse_commit_graph()` to take a repository commit-graph: store the hash algorithm instead of its length commit-graph: stop using `the_hash_algo` via macros	2025-08-25 14:22:03 -07:00
Junio C Hamano	a3c6459ab6	Merge branch 'dk/help-all' "git cmd --help-all" now works outside repositories. * dk/help-all: builtin: also setup gently for --help-all parse-options: refactor flags for usage_with_options_internal	2025-08-25 14:22:00 -07:00
Justin Tobler	b336144725	bulk-checkin: remove global transaction state Object database transactions in the bulk-checkin subsystem rely on global state to track transaction status. Stop relying on global state and instead store the transaction in the `struct object_database`. Functions that operate on transactions are updated to now wire transaction state. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-25 09:48:13 -07:00
Junio C Hamano	9d6e319ec5	Merge branch 'ac/deglobal-fmt-merge-log-config' Code clean-up. * ac/deglobal-fmt-merge-log-config: builtin/fmt-merge-msg: stop depending on 'the_repository' environment: remove the global variable 'merge_log_config'	2025-08-22 13:13:21 -07:00
Junio C Hamano	72e4eb56f0	Merge branch 'jc/diff-no-index-in-subdir' "git diff --no-index" run inside a subdirectory under control of a Git repository operated at the top of the working tree and stripped the prefix from the output, and oddballs like "-" (stdin) did not work correctly because of it. Correct the set-up by undoing what the set-up sequence did to cwd and prefix. * jc/diff-no-index-in-subdir: diff: --no-index should ignore the worktree	2025-08-22 13:13:21 -07:00
Junio C Hamano	c72d5bbf49	Merge branch 'ms/refs-list' The "list" subcommand of "git refs" acts as a front-end for "git for-each-ref". * ms/refs-list: t: add test for git refs list subcommand t6300: refactor tests to be shareable builtin/refs: add list subcommand builtin/for-each-ref: factor out core logic into a helper builtin/for-each-ref: align usage string with the man page doc: factor out common option	2025-08-22 13:13:20 -07:00
Junio C Hamano	54fef16542	Merge branch 'jc/strbuf-split' Arrays of strbuf is often a wrong data structure to use, and strbuf_split() family of functions that create them often have better alternatives. Update several code paths and replace strbuf_split(). * jc/strbuf-split: trace2: do not use strbuf_split() trace2: trim_trailing_newline followed by trim is a no-op sub-process: do not use strbuf_split() environment: do not use strbuf_split() config: do not use strbuf_split() notes: do not use strbuf_split() merge-tree: do not use strbuf_split() clean: do not use strbuf_split() [part 2] clean: do not pass the whole structure when it is not necessary clean: do not use strbuf_split() [part 1] clean: do not pass strbuf by value wt-status: avoid strbuf_split()	2025-08-21 13:47:00 -07:00
Junio C Hamano	971ba42dd4	Merge branch 'jc/string-list-split' string_list_split() family of functions have been extended to simplify common use cases. jc/string-list-split: string-list: split-then-remove-empty can be done while splitting string-list: optionally omit empty string pieces in string_list_split() diff: simplify parsing of diff.colormovedws string-list: optionally trim string pieces split by string_list_split() string-list: unify string_list_split* functions string-list: align string_list_split() with its _in_place() counterpart string-list: report programming error with BUG	2025-08-21 13:46:59 -07:00
Junio C Hamano	5a404a70c7	Merge branch 'rs/describe-with-prio-queue' "git describe" has been optimized by using better data structure. * rs/describe-with-prio-queue: describe: use prio_queue_replace() describe: use prio_queue	2025-08-21 13:46:59 -07:00

1 2 3 4 5 ...

13031 Commits