Commit Graph

12720 Commits

Author SHA1 Message Date
Karthik Nayak
0e358de64a fetch: use batched reference updates
The reference updates performed as a part of 'git-fetch(1)' take place
one at a time. For each reference update, a new transaction is created
and committed. This is necessary to ensure we can allow individual
updates to fail without failing the entire command. The command also
supports an '--atomic' mode, which uses a single transaction to update
all of the references. But this mode has an all-or-nothing approach,
where if a single update fails, all updates would fail.

In 23fc8e4f61 (refs: implement batch reference update support,
2025-04-08), we introduced a new mechanism to batch reference updates.
Under the hood, this uses a single transaction to perform a batch of
reference updates, while allowing only individual updates to fail.
Utilize this newly introduced batch update mechanism in 'git-fetch(1)'.
This provides a significant bump in performance, especially when dealing
with repositories with a large number of references.

Adding support for batched updates is simply modifying the flow to also
create a batch update transaction in the non-atomic flow.

With the reftable backend there is a 22x performance improvement, when
performing 'git-fetch(1)' with 10000 refs:

  Benchmark 1: fetch: many refs (refformat = reftable, refcount = 10000, revision = master)
    Time (mean ± σ):      3.403 s ±  0.775 s    [User: 1.875 s, System: 1.417 s]
    Range (min … max):    2.454 s …  4.529 s    10 runs

  Benchmark 2: fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD)
    Time (mean ± σ):     154.3 ms ±  17.6 ms    [User: 102.5 ms, System: 56.1 ms]
    Range (min … max):   145.2 ms … 220.5 ms    18 runs

  Summary
    fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran
     22.06 ± 5.62 times faster than fetch: many refs (refformat = reftable, refcount = 10000, revision = master)

In similar conditions, the files backend sees a 1.25x performance
improvement:

  Benchmark 1: fetch: many refs (refformat = files, refcount = 10000, revision = master)
    Time (mean ± σ):     605.5 ms ±   9.4 ms    [User: 117.8 ms, System: 483.3 ms]
    Range (min … max):   595.6 ms … 621.5 ms    10 runs

  Benchmark 2: fetch: many refs (refformat = files, refcount = 10000, revision = HEAD)
    Time (mean ± σ):     485.8 ms ±   4.3 ms    [User: 91.1 ms, System: 396.7 ms]
    Range (min … max):   477.6 ms … 494.3 ms    10 runs

  Summary
    fetch: many refs (refformat = files, refcount = 10000, revision = HEAD) ran
      1.25 ± 0.02 times faster than fetch: many refs (refformat = files, refcount = 10000, revision = master)

With this we'll either be using a regular transaction or a batch update
transaction. This helps clean up some code which is no longer needed as
we'll now always have some type of 'ref_transaction' object being
propagated.

One big change is that earlier, each individual update would propagate a
failure. Whereas now, the `ref_transaction_for_each_rejected_update`
function is called at the end of the flow to capture the exit status for
'git-fetch(1)' and also to print F/D conflict errors. This does change
the order of the errors being printed, but the behavior stays the same.
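
The collect-then-report flow described above can be sketched in
isolation. This is a minimal illustration and not the code in
'builtin/fetch.c': every `toy_*` name, the error enum and the rule used
to reject an update are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes; not the enum introduced in 76e760b999. */
enum toy_error {
	TOY_OK = 0,
	TOY_ERR_NAME_CONFLICT,
};

struct toy_update {
	const char *refname;
	enum toy_error rejection; /* TOY_OK unless the update was rejected */
};

struct toy_batch {
	struct toy_update *updates;
	size_t nr;
};

typedef void (*toy_rejected_fn)(const char *refname, enum toy_error err,
				void *cb_data);

/*
 * Stand-in for the backend: pretend that "refs/heads/a" already exists,
 * so anything below "refs/heads/a/" is a file/directory conflict.
 */
static enum toy_error toy_apply(const char *refname)
{
	static const char prefix[] = "refs/heads/a/";

	if (!strncmp(refname, prefix, sizeof(prefix) - 1))
		return TOY_ERR_NAME_CONFLICT;
	return TOY_OK;
}

/* Apply every update; a rejection is recorded instead of aborting the batch. */
static void toy_batch_commit(struct toy_batch *batch)
{
	for (size_t i = 0; i < batch->nr; i++)
		batch->updates[i].rejection =
			toy_apply(batch->updates[i].refname);
}

/* Visit the rejected updates once the whole batch has been processed. */
static void toy_for_each_rejected(const struct toy_batch *batch,
				  toy_rejected_fn fn, void *cb_data)
{
	for (size_t i = 0; i < batch->nr; i++)
		if (batch->updates[i].rejection != TOY_OK)
			fn(batch->updates[i].refname,
			   batch->updates[i].rejection, cb_data);
}

/* Example callback: count rejections so a caller can derive an exit status. */
static void toy_count_rejected(const char *refname, enum toy_error err,
			       void *cb_data)
{
	(void)refname;
	(void)err;
	(*(int *)cb_data)++;
}
```

A caller would run `toy_batch_commit()` once and then pass
`toy_count_rejected` to `toy_for_each_rejected()`; a non-zero count
would translate into a failing exit status while the accepted updates
stay applied.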

Since transaction errors are now explicitly defined as part of
76e760b999 (refs: introduce enum-based transaction error types,
2025-04-08), utilize them and get rid of custom errors defined within
'builtin/fetch.c'.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19 11:06:31 -07:00
Karthik Nayak
b3de3832ce refs: add function to translate errors to strings
The commit 76e760b999 (refs: introduce enum-based transaction error
types, 2025-04-08) introduced enum-based transaction error types. The
refs transaction logic was also modified to propagate these errors. For
clients of the ref transaction system, it would be beneficial to provide
human readable messages for these errors.

There is already an existing mapping in 'builtin/update-ref.c', move it
to 'refs.c' as `ref_transaction_error_msg()` and use the same within the
'builtin/update-ref.c'.
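
As a rough illustration of such a mapping, the sketch below translates
an error enum into a human readable string. The enum members and the
message texts are hypothetical and do not reproduce what
`ref_transaction_error_msg()` returns.

```c
#include <stddef.h>

/* Hypothetical error kinds, named for this sketch only. */
enum toy_txn_error {
	TOY_TXN_OK = 0,
	TOY_TXN_ERR_GENERIC,
	TOY_TXN_ERR_NAME_CONFLICT,
	TOY_TXN_ERR_NONEXISTENT_REF,
};

/* Central place that turns an error code into a message for all callers. */
static const char *toy_txn_error_msg(enum toy_txn_error err)
{
	switch (err) {
	case TOY_TXN_OK:
		return "no error";
	case TOY_TXN_ERR_NAME_CONFLICT:
		return "refname conflict";
	case TOY_TXN_ERR_NONEXISTENT_REF:
		return "reference does not exist";
	case TOY_TXN_ERR_GENERIC:
		break;
	}
	/* Generic errors and unknown values share one fallback message. */
	return "generic error";
}
```

Keeping the switch in one shared function means that every client of
the transaction code prints the same wording for the same error.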

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19 11:06:31 -07:00
Junio C Hamano
6dbc41631d Merge branch 'ds/fix-thin-fix'
"git index-pack --fix-thin" used to abort to prevent a cycle in
delta chains from forming in a corner case even when there is no
such cycle.

* ds/fix-thin-fix:
  index-pack: allow revisiting REF_DELTA chains
  t5309: create failing test for 'git index-pack'
  test-tool: add pack-deltas helper
2025-05-12 14:22:49 -07:00
Junio C Hamano
bd99d6e8db Merge branch 'ps/object-store-cleanup'
Further code clean-up in the object-store layer.

* ps/object-store-cleanup:
  object-store: drop `repo_has_object_file()`
  treewide: convert users of `repo_has_object_file()` to `has_object()`
  object-store: allow fetching objects via `has_object()`
  object-store: move function declarations to their respective subsystems
  object-store: move and rename `odb_pack_keep()`
  object-store: drop `loose_object_path()`
  object-store: move `struct packed_git` into "packfile.h"
2025-05-12 14:22:49 -07:00
Junio C Hamano
0730906043 Merge branch 'ps/mv-contradiction-fix'
"git mv a a/b dst" would ask to move the directory 'a' itself, as
well as its contents, in a single destination directory, which is
a contradicting request that is impossible to satisfy. This case is
now detected and the command errors out.

* ps/mv-contradiction-fix:
  builtin/mv: convert assert(3p) into `BUG()`
  builtin/mv: bail out when trying to move child and its parent
2025-05-08 12:36:32 -07:00
Patrick Steinhardt
974f0d4664 builtin/mv: convert assert(3p) into BUG()
The use of asserts is discouraged in our codebase because they lead to
different behaviour depending on how Git is built. If one is unsure
enough about whether a condition always holds to add an assert, then
the assert should probably trigger regardless of how Git is being
built.

Drop the call to assert(3p) in git-mv(1) and instead use `BUG()`.
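
The difference can be sketched with a check that is always compiled in.
`TOY_BUG()` and `toy_checked_pos()` are hypothetical names used for
illustration; they are not Git's `BUG()` implementation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Report the location of the broken invariant and abort unconditionally. */
static void toy_bug(const char *file, int line, const char *msg)
{
	fprintf(stderr, "BUG: %s:%d: %s\n", file, line, msg);
	abort();
}
#define TOY_BUG(msg) toy_bug(__FILE__, __LINE__, (msg))

/* Returns the position unchanged, or dies if the lookup result is negative. */
static int toy_checked_pos(int pos)
{
	/*
	 * assert(pos >= 0) would be compiled out with -DNDEBUG;
	 * this check is present in every build.
	 */
	if (pos < 0)
		TOY_BUG("entry not found");
	return pos;
}
```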

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-30 15:22:04 -07:00
Patrick Steinhardt
8583c9dcbc builtin/mv: bail out when trying to move child and its parent
We have a known issue in git-mv(1) where moving both a child and any of
its parents causes an assert to trigger because the child cannot be
found anymore in the index. We have added a test for this in commit
0fcd473fdd (t7001: add failure test which triggers assertion,
2024-10-22) without addressing the issue, which is why the test itself
is marked as `test_expect_failure`.

The behaviour of that test relies on a call to assert(3p) though, which
may or may not be compiled into the resulting binary depending on
whether or not we pass `-DNDEBUG`. When these asserts are compiled into
Git this may cause our CI to hang on Windows though, because asserts may
cause a modal window to be shown.

While we could work around the issue by converting this into a call to
`BUG()`, let's rather address the root cause of the issue by bailing out
in case we see that both a child and any of its parents are being moved
in the same command.
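
A check of this kind could look roughly like the sketch below, which
compares the source paths pairwise. The helper names are made up for
the example, and the real logic in 'builtin/mv.c' may be structured
differently.

```c
#include <stddef.h>
#include <string.h>

/* True if `child` lives somewhere below the directory `parent`. */
static int toy_is_parent_of(const char *parent, const char *child)
{
	size_t len = strlen(parent);

	return len > 0 && !strncmp(parent, child, len) && child[len] == '/';
}

/* True if the sources contain both a directory and something inside it. */
static int toy_moves_parent_and_child(const char **sources, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		for (size_t j = 0; j < nr; j++)
			if (i != j && toy_is_parent_of(sources[i], sources[j]))
				return 1;
	return 0;
}
```

The explicit `child[len] == '/'` test keeps siblings such as "a" and
"ab" from being mistaken for a parent and its child.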

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-30 15:05:15 -07:00
Junio C Hamano
0c9d6b7ced Merge branch 'jh/gc-launchctl-schedule-fix'
Fix for scheduled maintenance tasks on platforms using launchctl.

* jh/gc-launchctl-schedule-fix:
  maintenance: fix launchctl calendar intervals
2025-04-29 14:21:29 -07:00
Junio C Hamano
5a6de390d8 Merge branch 'az/tighten-string-array-constness'
Code clean-up.

* az/tighten-string-array-constness:
  global: mark usage strings and string tables const
2025-04-29 14:21:28 -07:00
Junio C Hamano
a501213402 Merge branch 'ua/call-repo-config-with-possibly-null-repository'
Since repo_config() can be called with repo set to NULL these days,
a command that is marked as RUN_SETUP in the builtin command table
does not have to check repo against NULL before making the call.

* ua/call-repo-config-with-possibly-null-repository:
  builtin/difftool: remove unnecessary if statement
  builtin/add: remove unnecessary if statement
2025-04-29 14:21:27 -07:00
Patrick Steinhardt
062b914c84 treewide: convert users of repo_has_object_file() to has_object()
As the comment of `repo_has_object_file()` and its `_with_flags()`
variant tells us, these functions are considered to be deprecated in
favor of `has_object()`. There are a couple of slight benefits in favor
of the replacement:

  - The new function has a short-and-sweet name.

  - More explicit defaults: `has_object()` doesn't fetch missing objects
    via promisor remotes, and neither does it reload packfiles if an
    object wasn't found by default. This ensures that it becomes
    immediately obvious when a simple object existence check may result
    in expensive actions.

Most importantly though, it is confusing that we have two sets of
functions that ultimately do the same thing, but with different
defaults.

Start sunsetting `repo_has_object_file()` and its `_with_flags()`
sibling by replacing all callsites with `has_object()`:

  - `repo_has_object_file(...)` is equivalent to
    `has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`.

  - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)`
    is equivalent to `has_object(..., 0)`.

  - `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)`
    is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`.

  - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`
    is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`.

The replacements should be functionally equivalent.
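
One way to read the four cases above is that the old flags opt out of
behaviour while the new flags opt in. The sketch below encodes that
reading. The `TOY_` constants mirror the flag names from the list, but
their bit values are placeholders and `toy_translate_flags()` is a
hypothetical helper, not a function that exists in Git.

```c
/* Old-style flags: each one switches a default behaviour off. */
enum {
	TOY_OBJECT_INFO_QUICK = 1 << 0,
	TOY_OBJECT_INFO_SKIP_FETCH_OBJECT = 1 << 1,
};

/* New-style flags: each one switches an expensive behaviour on. */
enum {
	TOY_HAS_OBJECT_RECHECK_PACKED = 1 << 0,
	TOY_HAS_OBJECT_FETCH_PROMISOR = 1 << 1,
};

/* Translate old opt-out flags into the equivalent new opt-in flags. */
static unsigned toy_translate_flags(unsigned old_flags)
{
	unsigned new_flags = 0;

	if (!(old_flags & TOY_OBJECT_INFO_QUICK))
		new_flags |= TOY_HAS_OBJECT_RECHECK_PACKED;
	if (!(old_flags & TOY_OBJECT_INFO_SKIP_FETCH_OBJECT))
		new_flags |= TOY_HAS_OBJECT_FETCH_PROMISOR;
	return new_flags;
}
```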

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-29 10:08:13 -07:00
Patrick Steinhardt
1a793261c5 object-store: move function declarations to their respective subsystems
We carry declarations for a couple of functions in "object-store.h" that
are not defined in "object-store.c", but in a different subsystem. Move
these declarations to the respective headers whose matching code files
carry the corresponding definition.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-29 10:08:12 -07:00
Patrick Steinhardt
0b8ed25b66 object-store: move and rename odb_pack_keep()
The function `odb_pack_keep()` creates a file at the passed-in path. If
this fails, then the function re-tries by first creating any potentially
missing leading directories and then trying to create the file once
again. As such, this function doesn't host any kind of logic that is
specific to the object store, but is rather a generic helper function.

Rename the function to `safe_create_file_with_leading_directories()` and
move it into "path.c". While at it, refactor it so that it loses its
dependency on `the_repository`.
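
The create, fall back, retry pattern described above can be sketched
with plain POSIX calls. The function names, the open flags and the
fixed buffer size are assumptions made for this example; this is not
the code that ended up in "path.c".

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Create every missing directory component leading up to `path`. */
static int toy_create_leading_directories(const char *path)
{
	char buf[4096];
	size_t len = strlen(path);

	if (!len || len >= sizeof(buf))
		return -1;
	memcpy(buf, path, len + 1);

	for (char *p = buf + 1; *p; p++) {
		if (*p != '/')
			continue;
		*p = '\0'; /* temporarily cut the path at this component */
		if (mkdir(buf, 0777) < 0 && errno != EEXIST)
			return -1;
		*p = '/';
	}
	return 0;
}

/* Try to create the file; on ENOENT create the directories and retry once. */
static int toy_create_file_with_leading_directories(const char *path)
{
	int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

	if (fd >= 0 || errno != ENOENT)
		return fd;
	if (toy_create_leading_directories(path) < 0)
		return -1;
	return open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
}
```

Nothing in the sketch refers to a repository or an object database,
which is the point made above about the helper being generic.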

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-29 10:08:12 -07:00
Derrick Stolee
98f8854c94 index-pack: allow revisiting REF_DELTA chains
As detailed in the previous changes to t5309-pack-delta-cycles.sh, the
logic within 'git index-pack' to analyze an incoming thin packfile with
REF_DELTAs is suspect. The algorithm is overly cautious around delta
cycles, and that leads in fact to failing even when there is no cycle.

This change adjusts the algorithm to no longer fail in these cases. In
fact, these cycle cases will no longer fail but more importantly the
valid cases will no longer fail, either. The resulting packfile from the
--fix-thin operation will not have cycles either since REF_DELTAs are
forbidden from the on-disk format and OFS_DELTAs are impossible to write
as a cycle.

The crux of the matter is how the algorithm works when the REF_DELTAs
point to base objects that exist in the local repository. When reading
the thin packfile, the object IDs for the delta objects are unknown so
we do not have the delta chain structure automatically. Instead, we need
to start somewhere by selecting a delta whose base is inside our current
object database.

Consider the case where the packfile has two REF_DELTA objects, A and B,
and the delta chain looks like "A depends on B" and "B depends on C" for
some third object C, where C is already in the current repository. The
algorithm _should_ start with all objects that depend on C, finding B,
and then moving on to all objects depending on B, finding A.

However, if the repository also already has object B, then the delta
chain can be analyzed in a different order. The deltas with base B can
be analyzed first, finding A, and then the deltas with base C are
analyzed, finding B. The algorithm currently continues to look for
objects that depend on B, finding A again. This fails due to A's
'real_type' member already being overwritten from OBJ_REF_DELTA to the
correct object type.

This scenario is possible in a typical 'git fetch' where the client does
not advertise B as a 'have' but requests A as a 'want' (and C is noticed
as a common object based on other 'have's). The reason this isn't
typically seen is that most Git servers use OFS_DELTAs to represent
deltas within a packfile. However, if a server uses only REF_DELTAs,
then this kind of issue can occur. There is nothing in the explicit
packfile format that states this use of inter-pack REF_DELTA is
incorrect, only that REF_DELTAs should not be used in the on-disk
representation to avoid cycles.

This die() was introduced in ab791dd138 (index-pack: fix race condition
with duplicate bases, 2014-08-29). Several refactors have adjusted the
error message and the surrounding logic, but this issue has existed for
a longer time as that was only a conversion from an assert().

The tests in t5309 originated in 3b910d0c5e (add tests for indexing
packs with delta cycles, 2013-08-23) and b2ef3d9ebb (test index-pack on
packs with recoverable delta cycles, 2013-08-23). These changes make
note that the current behavior of handling "resolvable" cycles is mostly
a documentation-only test, not that this behavior is the best way for
Git to handle the situation.

The fix here is somewhat complicated due to the amount of state being
adjusted by the loop within threaded_second_pass(). Instead of trying to
resume the start of the loop while adjusting the necessary context, I
chose to scan the REF_DELTAs depending on the current 'parent' and skip
any that have already been processed. This necessarily leaves us in a
state where 'child' and 'child_obj' could be left as NULL and that must
be handled later. There is also some careful handling around skipping
REF_DELTAs when there are also OFS_DELTAs depending on that parent.
There may be value in extending 'test-tool pack-deltas' to allow writing
OFS_DELTAs in order to exercise this logic across the delta types.
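
The idea of skipping already processed deltas can be illustrated with a
toy model of the A, B, C example from above. The structure and function
below are hypothetical and far simpler than threaded_second_pass();
they only show that a delta reached a second time is skipped instead of
being treated as an error.

```c
#include <stddef.h>

struct toy_delta {
	char id;      /* name of this object */
	char base;    /* name of the object it is a delta against */
	int resolved; /* set once the delta has been applied */
};

/*
 * Resolve all deltas that depend on `base`, then recurse into the
 * deltas that depend on the newly resolved objects. Returns the number
 * of deltas resolved by this call.
 */
static int toy_resolve_children(struct toy_delta *deltas, size_t nr,
				char base)
{
	int newly_resolved = 0;

	for (size_t i = 0; i < nr; i++) {
		if (deltas[i].base != base)
			continue;
		if (deltas[i].resolved)
			continue; /* reached earlier via another base: skip */
		deltas[i].resolved = 1;
		newly_resolved++;
		newly_resolved += toy_resolve_children(deltas, nr,
						       deltas[i].id);
	}
	return newly_resolved;
}
```

With A depending on B and B depending on C, starting from B resolves A,
and a later start from C resolves B and then silently skips A.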

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-28 15:37:26 -07:00
Junio C Hamano
028c43269e Merge branch 'rj/build-tweaks'
Various build tweaks, including CSPRNG selection on some platforms.

* rj/build-tweaks:
  config.mak.uname: set CSPRNG_METHOD to getrandom on Linux
  config.mak.uname: add arc4random to the cygwin build
  config.mak.uname: add sysinfo() configuration for cygwin
  builtin/gc.c: correct RAM calculation when using sysinfo
  config.mak.uname: add clock_gettime() to the cygwin build
  config.mak.uname: add HAVE_GETDELIM to the cygwin section
  config.mak.uname: only set NO_REGEX on cygwin for v1.7
  config.mak.uname: add a note about NO_STRLCPY for Linux
  Makefile: remove NEEDS_LIBRT build variable
  meson.build: set default help format to html on windows
  meson.build: only set build variables for non-default values
  Makefile: only set some BASIC_CFLAGS when RUNTIME_PREFIX is set
  meson.build: remove -DCURL_DISABLE_TYPECHECK
2025-04-24 17:25:34 -07:00
Junio C Hamano
2bc5414c41 Merge branch 'ps/parse-options-integers'
Update parse-options API to catch mistakes to pass address of an
integral variable of a wrong type/size.

* ps/parse-options-integers:
  parse-options: detect mismatches in integer signedness
  parse-options: introduce precision handling for `OPTION_UNSIGNED`
  parse-options: introduce precision handling for `OPTION_INTEGER`
  parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()`
  parse-options: support unit factors in `OPT_INTEGER()`
  global: use designated initializers for options
  parse: fix off-by-one for minimum signed values
2025-04-24 17:25:34 -07:00
Junio C Hamano
36d8035d27 Merge branch 'ps/object-file-cleanup'
Code clean-up.

* ps/object-file-cleanup:
  object-store: merge "object-store-ll.h" and "object-store.h"
  object-store: remove global array of cached objects
  object: split out functions relating to object store subsystem
  object-file: drop `index_blob_stream()`
  object-file: split up concerns of `HASH_*` flags
  object-file: split out functions relating to object store subsystem
  object-file: move `xmmap()` into "wrapper.c"
  object-file: move `git_open_cloexec()` to "compat/open.c"
  object-file: move `safe_create_leading_directories()` into "path.c"
  object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-24 17:25:33 -07:00
Junio C Hamano
d61ff9c237 Merge branch 'ps/object-file-cleanup' into ps/object-store-cleanup
* ps/object-file-cleanup:
  object-store: merge "object-store-ll.h" and "object-store.h"
  object-store: remove global array of cached objects
  object: split out functions relating to object store subsystem
  object-file: drop `index_blob_stream()`
  object-file: split up concerns of `HASH_*` flags
  object-file: split out functions relating to object store subsystem
  object-file: move `xmmap()` into "wrapper.c"
  object-file: move `git_open_cloexec()` to "compat/open.c"
  object-file: move `safe_create_leading_directories()` into "path.c"
  object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-24 11:37:21 -07:00
Junio C Hamano
29860f3282 Merge branch 'ja/doc-reset-mv-rm-markup-updates'
Doc mark-up updates.

* ja/doc-reset-mv-rm-markup-updates:
  doc: add markup for characters in Guidelines
  doc: fix asciidoctor synopsis processing of triple-dots
  doc: convert git-mv to new documentation format
  doc: move synopsis git-mv commands in the synopsis section
  doc: convert git-rm to new documentation format
  doc: fix synopsis analysis logic
  doc: convert git-reset to new documentation format
2025-04-23 13:58:51 -07:00
Josh Heinrichs
eb2d7beb0e maintenance: fix launchctl calendar intervals
When using the launchctl scheduler, the weekly job runs daily, and the
daily job runs on the first six days of each month. This appears to be
due to specifying "Day" in the calendar intervals, which according to
launchd.plist(5) is for specifying days of the month rather than days of
the week. The behaviour of running a job on the 0th day is undocumented,
but in my testing appears to be the same as not specifying "Day" in the
calendar interval, in which case the job will run daily.

Use "Weekday" in the calendar intervals, which is the correct way to
schedule jobs to run on specific days of the week.

Signed-off-by: Josh Heinrichs <joshiheinrichs@gmail.com>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-23 12:58:52 -07:00
Ahelenia Ziemiańska
86eef3541e global: mark usage strings and string tables const
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-21 21:01:19 -07:00
Usman Akinyemi
b502a648ef builtin/difftool: remove unnecessary if statement
Since f29f1990b5 (config: teach repo_config to allow `repo` to be NULL,
2025-03-08) already taught `repo_config()` to allow `repo` to be NULL,
there is no need to check if `repo` is NULL before calling
`repo_config()`.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-20 14:17:22 -07:00
Usman Akinyemi
2e4e439ec2 builtin/add: remove unnecessary if statement
Since f29f1990b5 (config: teach repo_config to allow `repo` to be NULL,
2025-03-08) already taught `repo_config()` to allow `repo` to be NULL,
there is no need to check if `repo` is NULL before calling
`repo_config()`.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-20 14:17:20 -07:00
Junio C Hamano
72801dfde1 Merge branch 'ua/update-update-server-info'
Code simplification.

* ua/update-update-server-info:
  builtin/update-server-info: remove unnecessary if statement
2025-04-17 10:28:19 -07:00
Junio C Hamano
c3ebf18eb2 Merge branch 'en/merge-recursive-debug'
Remove remnants of the recursive merge strategy backend, which was
superseded by the ort merge strategy.

* en/merge-recursive-debug:
  builtin/{merge,rebase,revert}: remove GIT_TEST_MERGE_ALGORITHM
  tests: remove GIT_TEST_MERGE_ALGORITHM and test_expect_merge_algorithm
  merge-recursive.[ch]: thoroughly debug these
  merge, sequencer: switch recursive merges over to ort
  sequencer: switch non-recursive merges over to ort
  merge-ort: enable diff-algorithms other than histogram
  builtin/merge-recursive: switch to using merge_ort_generic()
  checkout: replace merge_trees() with merge_ort_nonrecursive()
2025-04-17 10:28:18 -07:00
Junio C Hamano
fe7ae3b87e Merge branch 'kn/blame-porcelain-unblamable'
"git blame --porcelain" mode now talks about unblamable lines and
lines that are blamed to an ignored commit.

* kn/blame-porcelain-unblamable:
  blame: print unblamable and ignored commits in porcelain mode
2025-04-17 10:28:18 -07:00
Junio C Hamano
b45113f581 Merge branch 'jk/fetch-follow-remote-head-fix'
"git fetch [<remote>]" with only the configured fetch refspec
should be the only thing to update refs/remotes/<remote>/HEAD,
but the code was overly eager to do so in other cases.

* jk/fetch-follow-remote-head-fix:
  fetch: make set_head() call easier to read
  fetch: don't ask for remote HEAD if followRemoteHEAD is "never"
  fetch: only respect followRemoteHEAD with configured refspecs
2025-04-17 10:28:17 -07:00
Patrick Steinhardt
791aeddfa2 parse-options: detect mismatches in integer signedness
It was reported that "t5620-backfill.sh" fails on s390x and sparc64 in a
test that exercises the "--min-batch-size" command line option. The
symptom was that the option didn't seem to have an effect: we didn't
fetch objects with a batch size of 20, but instead fetched all objects
at once.

As it turns out, the root cause is that `--min-batch-size` uses
`OPT_INTEGER()` to parse the command line option. While this macro
expects the caller to pass a pointer to an integer, we instead pass a
pointer to a `size_t`. This coincidentally works on most platforms, but
it breaks apart on the mentioned platforms because they are big endian.

This issue isn't specific to git-backfill(1): there are a couple of
other places where we have the same type confusion going on. This
indicates that the issue really is the interface that the parse-options
subsystem provides -- it is simply too easy to get this wrong as there
isn't any kind of compiler warning, and things just work on the most
common systems.

Address the systemic issue by introducing two new build asserts
`BARF_UNLESS_SIGNED()` and `BARF_UNLESS_UNSIGNED()`. As the names
already hint at, those macros will cause a compiler error when passed a
value that is not signed or unsigned, respectively.

Adapt `OPT_INTEGER()`, `OPT_UNSIGNED()` as well as `OPT_MAGNITUDE()` to
use those asserts. This uncovers a small set of sites where we indeed
have the same bug as in git-backfill(1). Adapt all of them to use the
correct option.
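
A compile-time signedness check of this kind can be sketched as
follows. The `TOY_` macros are illustrative stand-ins, not the
definitions of `BARF_UNLESS_SIGNED()` and `BARF_UNLESS_UNSIGNED()`, and
they assume a compiler that supports the `__typeof__` extension (GCC
and Clang do).

```c
#include <stddef.h>

/* Nonzero if the type of `var` is signed: -1 converted to it stays negative. */
#define TOY_IS_SIGNED(var) (((__typeof__(var))-1) < 0)

/*
 * Both macros evaluate to 0, so they can be added to another expression.
 * On a mismatch the array gets a negative size and compilation fails.
 */
#define TOY_BARF_UNLESS_SIGNED(var) \
	(sizeof(char[TOY_IS_SIGNED(var) ? 1 : -1]) - 1)
#define TOY_BARF_UNLESS_UNSIGNED(var) \
	(sizeof(char[TOY_IS_SIGNED(var) ? -1 : 1]) - 1)
```

Under these definitions, `TOY_BARF_UNLESS_SIGNED(x)` compiles for an
`int x` but is rejected for a `size_t x`, which is the kind of mistake
that went unnoticed in the "--min-batch-size" case.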

Reported-by: Todd Zullinger <tmz@pobox.com>
Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-17 08:15:16 -07:00
Patrick Steinhardt
09705696f7 parse-options: introduce precision handling for OPTION_INTEGER
The `OPTION_INTEGER` option type accepts a signed integer. The type of
the underlying integer is a simple `int`, which restricts the range of
values accepted by such options. But there is a catch: the caller
provides a pointer to the value via the `.value` field, which is a
simple void pointer. This has two consequences:

  - There is no check whether the passed value is sufficiently long to
    store the entire range of `int`. This can lead to integer wraparound
    in the best case and out-of-bounds writes in the worst case.

  - Even when a caller knows that they want to store a value larger than
    `INT_MAX` they don't have a way to do so.

In practice this doesn't tend to be a huge issue because users typically
don't end up passing huge values to most commands. But the parsing logic
is demonstrably broken, and it is too easy to get the calling convention
wrong.

Improve the situation by introducing a new `precision` field into the
structure. This field gets assigned automatically by `OPT_INTEGER_F()`
and tracks the size of the passed value. Like this it becomes possible
for the caller to pass arbitrarily-sized integers and the underlying
logic knows to handle it correctly by doing range checks. Furthermore,
convert the code to use `strtoimax()` instead of `strtol()` so that we
can also parse values larger than `LONG_MAX`.
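
The approach can be sketched as a parser that receives the size of the
destination variable and range-checks against it before storing. The
names `toy_parse_signed()` and `TOY_PARSE_INT()` are hypothetical, and
the sketch ignores unit factors and the rest of the parse-options
machinery.

```c
#include <errno.h>
#include <inttypes.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Parse `arg` and store it into `value`, which is `precision` bytes wide. */
static int toy_parse_signed(const char *arg, void *value, size_t precision)
{
	char *end;
	intmax_t v, upper, lower;

	if (!precision || precision > sizeof(intmax_t))
		return -1;

	errno = 0;
	v = strtoimax(arg, &end, 10);
	if (errno || end == arg || *end)
		return -1;

	/* Largest and smallest value representable in `precision` bytes. */
	upper = INTMAX_MAX >> (CHAR_BIT * (sizeof(intmax_t) - precision));
	lower = -upper - 1;
	if (v < lower || v > upper)
		return -1;

	switch (precision) {
	case sizeof(int8_t):
		*(int8_t *)value = (int8_t)v;
		return 0;
	case sizeof(int16_t):
		*(int16_t *)value = (int16_t)v;
		return 0;
	case sizeof(int32_t):
		*(int32_t *)value = (int32_t)v;
		return 0;
	case sizeof(int64_t):
		*(int64_t *)value = (int64_t)v;
		return 0;
	default:
		return -1; /* unsupported width */
	}
}

/* The macro records the size of the variable so callers cannot get it wrong. */
#define TOY_PARSE_INT(arg, var) toy_parse_signed((arg), &(var), sizeof(var))
```

Because the macro captures `sizeof(var)`, a 16-bit destination rejects
"32768" while a 64-bit destination accepts values beyond `INT_MAX`.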

Note that we do not yet assert signedness of the passed variable, which
is another source of bugs. This will be handled in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-17 08:15:15 -07:00
Patrick Steinhardt
785c17df78 parse-options: rename OPT_MAGNITUDE() to OPT_UNSIGNED()
With the preceding commit, `OPT_INTEGER()` has learned to support unit
factors. Consequently, the major difference between `OPT_INTEGER()` and
`OPT_MAGNITUDE()` isn't the support of unit factors anymore, as both of
them do support them now. Instead, the difference is that one handles
signed and the other handles unsigned integers.

Adapt the name of `OPT_MAGNITUDE()` accordingly by renaming it to
`OPT_UNSIGNED()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-17 08:15:15 -07:00
Patrick Steinhardt
d012ceb5f3 global: use designated initializers for options
While we expose macros for most of our different option types understood
by the "parse-options" subsystem, not every combination of fields has
one, as that would otherwise quickly lead to an explosion of macros.
Instead, we just initialize structures manually for those variants of
fields that don't have a macro.

Callsites that open-code these structure initializations don't use
designated initializers though and instead just provide values for each
of the fields that they want to initialize. This has three significant
downsides:

  - Callsites need to specify all values up to the last field that they
    care about. This often includes fields that should simply be left at
    their default zero-initialized state, which adds distraction.

  - Any reader not deeply familiar with the layout of the structure
    has a hard time figuring out what the respective initializers mean.

  - Reordering or introducing new fields in the middle of the structure
    is impossible without adapting all callsites.

Convert all sites to instead use designated initializers, which we have
started using in our codebase quite a while ago. This allows us to skip
any default-initialized fields, gives the reader context by specifying
the field names and allows us to reorder or introduce new fields where
we want to.
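
The contrast between the two styles can be seen in a small sketch.
`struct toy_option` is a simplified stand-in with made-up fields and
does not match the layout of the real `struct option`.

```c
#include <stddef.h>

/* Simplified, hypothetical option structure. */
struct toy_option {
	int type;
	int short_name;
	const char *long_name;
	void *value;
	const char *argh;
	const char *help;
	int flags;
};

static int toy_verbose;

/* Positional: every field up to `help` must be spelled out, in order. */
static const struct toy_option toy_positional = {
	1, 'v', "verbose", &toy_verbose, NULL, "be verbose",
};

/* Designated: only the interesting fields appear, and each one is named. */
static const struct toy_option toy_designated = {
	.type = 1,
	.short_name = 'v',
	.long_name = "verbose",
	.value = &toy_verbose,
	.help = "be verbose",
};
```

Both objects end up with identical contents, but only the designated
form survives a reordering of the structure's fields unchanged.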

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-17 08:15:15 -07:00
Ramsay Jones
c9a51775a3 builtin/gc.c: correct RAM calculation when using sysinfo
The man page for sysinfo(2) on Linux states that (from v2.3.48) the
sizes of the memory and swap fields, of the returned structure, are
given as multiples of 'mem_unit' bytes. In earlier versions (prior to
v2.3.23 on i386 in particular), the 'mem_unit' field was not part of
the structure, and all sizes were measured in bytes. The man page does
not discuss the motivation for this change, but it is possible that the
change was intended for the, relatively rare, 32-bit platform with more
than 4GB of memory.

The total_ram() function makes the assumption that the 'totalram' field
of the 'struct sysinfo' is measured in bytes, or alternatively that the
'mem_unit' field is always equal to one. Having written a program to call
the sysinfo() function and print the structure fields, it seems that, on
Linux x86_64 and i686 anyway, the 'mem_unit' field is indeed set to one
(note that the 32-bit system had only 2GB ram). However, cygwin also has
a sysinfo() implementation, which gives the following values:

  $ ./sysinfo
  uptime:      21381
  loads:       0, 0, 0
  total ram:   2074637
  free ram:    843237
  shared ram:  0
  buffer ram:  0
  total swap:  327680
  free swap:   306932
  procs:       15
  total high:  0
  free high:   0
  mem_unit:    4096

  total ram: 8497713152
  $

[This laptop has 8GB ram, so a little bit seems to be missing. ;) ]

Modify the total_ram() function to allow for the possibility that the
memory size is not specified in bytes (i.e. 'mem_unit' is greater than
one).
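
The arithmetic involved is small enough to sketch as a pure function.
`toy_total_ram_bytes()` is a hypothetical helper that takes the two
structure fields as plain arguments, so that it does not depend on
sysinfo() being available; it is not the code in 'builtin/gc.c'.

```c
#include <stdint.h>

/* Convert the 'totalram' field into bytes, honouring 'mem_unit'. */
static uint64_t toy_total_ram_bytes(uint64_t totalram, uint32_t mem_unit)
{
	uint64_t total = totalram;

	/* A unit of zero or one means the value is already in bytes. */
	if (mem_unit > 1)
		total *= mem_unit;
	return total;
}
```

With the cygwin values quoted above, a 'totalram' of 2074637 and a
'mem_unit' of 4096 give 8497713152 bytes, matching the "total ram" line
printed by the test program.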

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-16 20:43:45 -07:00
Junio C Hamano
a271b05066 Merge branch 'ps/cat-file-filter-batch'
"git cat-file --batch" and friends learned to allow "--filter=" to
omit certain objects, just like the transport layer does.

* ps/cat-file-filter-batch:
  builtin/cat-file: use bitmaps to efficiently filter by object type
  builtin/cat-file: deduplicate logic to iterate over all objects
  pack-bitmap: introduce function to check whether a pack is bitmapped
  pack-bitmap: add function to iterate over filtered bitmapped objects
  pack-bitmap: allow passing payloads to `show_reachable_fn()`
  builtin/cat-file: support "object:type=" objects filter
  builtin/cat-file: support "blob:limit=" objects filter
  builtin/cat-file: support "blob:none" objects filter
  builtin/cat-file: wire up an option to filter objects
  builtin/cat-file: introduce function to report object status
  builtin/cat-file: rename variable that tracks usage
2025-04-16 13:54:21 -07:00
Junio C Hamano
47478802da Merge branch 'kn/non-transactional-batch-updates'
Updating multiple references has only been possible in all-or-none
fashion with transactions, but it can be more efficient to batch
multiple updates even when some of them are allowed to fail in a
best-effort manner.  A new "best effort batches of updates" mode
has been introduced.

* kn/non-transactional-batch-updates:
  update-ref: add --batch-updates flag for stdin mode
  refs: support rejection in batch updates during F/D checks
  refs: implement batch reference update support
  refs: introduce enum-based transaction error types
  refs/reftable: extract code from the transaction preparation
  refs/files: remove duplicate duplicates check
  refs: move duplicate refname update check to generic layer
  refs/files: remove redundant check in split_symref_update()
2025-04-16 13:54:19 -07:00
Junio C Hamano
01a6e244f9 Merge branch 'ps/maintenance-reflog-expire'
"git maintenance" learns a new task to expire reflog entries.

* ps/maintenance-reflog-expire:
  builtin/maintenance: introduce "reflog-expire" task
  builtin/gc: split out function to expire reflog entries
  builtin/reflog: make functions regarding `reflog_expire_options` public
  builtin/reflog: stop storing per-reflog expiry dates globally
  builtin/reflog: stop storing default reflog expiry dates globally
  reflog: rename `cmd_reflog_expire_cb` to `reflog_expire_options`
2025-04-16 13:54:19 -07:00
Junio C Hamano
1a1661bd41 Merge branch 'jt/rev-list-z'
"git rev-list" learns a machine-parsable output format that delimits
each field with NUL.

* jt/rev-list-z:
  rev-list: support NUL-delimited --missing option
  rev-list: support NUL-delimited --boundary option
  rev-list: support delimiting objects with NUL bytes
  rev-list: refactor early option parsing
  rev-list: inline `show_object_with_name()` in `show_object()`
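
NUL delimiting helps consumers because a NUL byte cannot appear inside a path, whereas spaces and newlines can. A minimal sketch of a consumer, assuming only that the input is a buffer of NUL-terminated records: `for_each_nul_record()` and `track_longest()` are hypothetical helpers, and the exact layout of the fields emitted by "git rev-list" is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one record and its length. */
typedef void (*record_fn)(const char *rec, size_t len, void *payload);

/*
 * Walk `buf` (of `len` bytes) and call `fn` for each NUL-terminated record.
 * Returns the number of complete records; trailing bytes without a
 * terminating NUL are treated as a truncated record and ignored.
 */
size_t for_each_nul_record(const char *buf, size_t len, record_fn fn, void *payload)
{
	size_t count = 0;
	const char *p = buf, *end = buf + len;

	assert(buf != NULL);
	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));
		if (!nul)
			break; /* incomplete trailing record */
		if (fn)
			fn(p, (size_t)(nul - p), payload);
		count++;
		p = nul + 1;
	}
	return count;
}

/* Example callback: remember the length of the longest record seen. */
void track_longest(const char *rec, size_t len, void *payload)
{
	size_t *longest = payload;

	(void)rec;
	if (len > *longest)
		*longest = len;
}
```

A record that contains an embedded newline is passed to the callback as a single unit, which is the property a line-based parser cannot offer.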
2025-04-16 13:54:18 -07:00
Junio C Hamano
743d3a54f2 Merge branch 'ab/rm-sign-compare'
Some warnings from "-Wsign-compare" for builtin/rm.c have been
squelched.

* ab/rm-sign-compare:
  rm: fix sign comparison warnings
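
For readers unfamiliar with this class of warning, here is a small sketch of what "-Wsign-compare" complains about and one common way to address it. It is a generic illustration with hypothetical function names, not the change made to builtin/rm.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * With "-Wsign-compare", a loop such as
 *
 *     for (int i = 0; i < nr; i++)
 *
 * warns when `nr` is a `size_t`: `i` is converted to unsigned for the
 * comparison, so a negative `i` would compare as a huge value.
 * Using an index of the same unsigned type avoids the mixed comparison.
 */
size_t count_nonzero(const int *values, size_t nr)
{
	size_t count = 0;

	assert(values != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++)
		if (values[i])
			count++;
	return count;
}

/*
 * When a signed and an unsigned value really must be compared, handle the
 * negative case explicitly before converting.
 */
int signed_less_than_unsigned(int a, unsigned int b)
{
	return a < 0 || (unsigned int)a < b;
}
```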
2025-04-16 13:54:17 -07:00
Junio C Hamano
518ed014f6 Merge branch 'jt/ref-transaction-abort-fix'
A ref transaction corner case fix.

* jt/ref-transaction-abort-fix:
  builtin/fetch: avoid aborting closed reference transaction
2025-04-16 13:54:17 -07:00
Junio C Hamano
7b03646f85 Merge branch 'js/comma-semicolon-confusion'
Code clean-up.

* js/comma-semicolon-confusion:
  detect-compiler: detect clang even if it found CUDA
  clang: warn when the comma operator is used
  compat/regex: explicitly mark intentional use of the comma operator
  wildmatch: avoid using of the comma operator
  diff-delta: avoid using the comma operator
  xdiff: avoid using the comma operator unnecessarily
  clar: avoid using the comma operator unnecessarily
  kwset: avoid using the comma operator unnecessarily
  rebase: avoid using the comma operator unnecessarily
  remote-curl: avoid using the comma operator unnecessarily
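
The branch name hints at the underlying hazard: a comma typed where a semicolon was intended still compiles, but it can change which statements a condition governs. The sketch below is a made-up example (the function names are hypothetical and it is not taken from any of the files listed above) of the kind of construct a comma-operator warning such as Clang's `-Wcomma` is meant to flag.

```c
#include <assert.h>

/*
 * A comma typed where a semicolon was intended: both assignments form one
 * expression statement, so the second one is also governed by the `if`.
 */
int with_comma(int cond)
{
	int a = 0, b = 0;

	if (cond)
		a = 1,
	b = 2;
	return a + b;
}

/* The intended code: only `a = 1` is conditional, `b = 2` always runs. */
int with_semicolon(int cond)
{
	int a = 0, b = 0;

	if (cond)
		a = 1;
	b = 2;
	return a + b;
}
```

The two functions agree when `cond` is true and differ when it is false, which is why such a slip can go unnoticed for a long time.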
2025-04-15 13:50:17 -07:00
Junio C Hamano
a8c207797f Merge branch 'jt/clone-guess-remote-head-fix'
"git clone" still gave the message about the default branch name;
this message has been turned into an advice message that can be
turned off.

* jt/clone-guess-remote-head-fix:
  advice: allow disabling default branch name advice
  builtin/clone: suppress unexpected default branch advice
  remote: allow `guess_remote_head()` to suppress advice
2025-04-15 13:50:16 -07:00
Junio C Hamano
d690c44846 Merge branch 'ds/maintenance-loose-objects-batchsize'
The job to coalesce loose objects into packfiles in "git
maintenance" now has a configurable batch size.

* ds/maintenance-loose-objects-batchsize:
  maintenance: add loose-objects.batchSize config
  maintenance: force progress/no-quiet to children
2025-04-15 13:50:16 -07:00
Junio C Hamano
03633a288c Merge branch 'kn/reflog-drop'
"git reflog" learns a "drop" subcommand that discards the entire
reflog data for a ref.

* kn/reflog-drop:
  reflog: implement subcommand to drop reflogs
  reflog: improve error for when reflog is not found
2025-04-15 13:50:15 -07:00
Junio C Hamano
ee847e0034 Merge branch 'ps/object-wo-the-repository'
The object layer has been updated to take an explicit repository
instance as a parameter in more code paths.

* ps/object-wo-the-repository:
  hash: stop depending on `the_repository` in `null_oid()`
  hash: fix "-Wsign-compare" warnings
  object-file: split out logic regarding hash algorithms
  delta-islands: stop depending on `the_repository`
  object-file-convert: stop depending on `the_repository`
  pack-bitmap-write: stop depending on `the_repository`
  pack-revindex: stop depending on `the_repository`
  pack-check: stop depending on `the_repository`
  environment: move access to "core.bigFileThreshold" into repo settings
  pack-write: stop depending on `the_repository` and `the_hash_algo`
  object: stop depending on `the_repository`
  csum-file: stop depending on `the_repository`
2025-04-15 13:50:15 -07:00
Patrick Steinhardt
68cd492a3e object-store: merge "object-store-ll.h" and "object-store.h"
The "object-store-ll.h" header has been introduced to keep transitive
header dependencies and compile times at bay. Now that we have created
a new "object-store.c" file though we can easily move the last remaining
additional bit of "object-store.h", the `odb_path_map`, out of the
header.

Do so. As the "object-store.h" header is now equivalent to its low-level
alternative we drop the latter and inline it into the former.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15 08:24:37 -07:00
Patrick Steinhardt
70c0f9db4e object-file: split up concerns of HASH_* flags
The functions `hash_object_file()`, `write_object_file()` and
`index_fd()` reuse the same set of flags to alter their behaviour. This
not only adds confusion, but given that every function only supports a
subset of the flags it becomes very hard to see which flags can be
passed to what function. Last but not least, this entangles the
implementation of all three function families.

Split up concerns by creating separate flags for each of the function
families.
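
As a rough sketch of the idea, assuming two function families that each get their own flag set: the `DEMO_*` names and the `demo_write()` / `demo_index()` functions below are hypothetical and are not the identifiers introduced by this commit.

```c
#include <assert.h>

/* Hypothetical flags for a "write" function family. */
enum demo_write_flags {
	DEMO_WRITE_PERSIST = 1 << 0,
	DEMO_WRITE_SILENT  = 1 << 1,
};

/* Hypothetical flags for an "index" function family. */
enum demo_index_flags {
	DEMO_INDEX_WRITE_OBJECT = 1 << 0,
	DEMO_INDEX_FORMAT_CHECK = 1 << 1,
};

#define DEMO_WRITE_ALL ((unsigned)(DEMO_WRITE_PERSIST | DEMO_WRITE_SILENT))
#define DEMO_INDEX_ALL ((unsigned)(DEMO_INDEX_WRITE_OBJECT | DEMO_INDEX_FORMAT_CHECK))

/* Each entry point documents and validates only its own bits. */
int demo_write(unsigned flags)
{
	if (flags & ~DEMO_WRITE_ALL)
		return -1; /* bit that this family does not define */
	return (flags & DEMO_WRITE_SILENT) ? 0 : 1; /* 1: would report errors */
}

int demo_index(unsigned flags)
{
	if (flags & ~DEMO_INDEX_ALL)
		return -1; /* bit that this family does not define */
	return (flags & DEMO_INDEX_WRITE_OBJECT) ? 1 : 0; /* 1: would write */
}
```

Because every family names its flags separately, a reader can tell from the prefix alone which function a flag is meant for, and each implementation can evolve its bits independently.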

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15 08:24:36 -07:00
Patrick Steinhardt
d9f517d051 object-file: split out functions relating to object store subsystem
While we have the "object-store.h" header, most of the functionality for
object stores is actually hosted in "object-file.c". This makes it hard
to find relevant functions and causes us to mix up concerns.

Split out functions relating to the object store subsystem into a new
"object-store.c" file.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15 08:24:36 -07:00
Patrick Steinhardt
1a99fe8010 object-file: move safe_create_leading_directories() into "path.c"
The `safe_create_leading_directories()` function and its relatives are
located in "object-file.c", which is not a good fit as they provide
generic functionality not related to objects at all. Move them into
"path.c", which already hosts `safe_create_dir()` and its relative
`safe_create_dir_in_gitdir()`.

"path.c" is free of `the_repository`, but the moved functions depend on
`the_repository` to read the "core.sharedRepository" config. Adapt the
function signature to accept a repository as argument to fix the issue
and adjust callers accordingly.
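
The signature change described above follows a common pattern, sketched here under stated assumptions: `struct demo_repo`, `dir_mode_global()` and `dir_mode_for()` are hypothetical, the single field merely stands in for a per-repository setting, and the mode values are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a repository: only carries one setting. */
struct demo_repo {
	int shared_repository; /* stand-in for a per-repository config value */
};

/* Before: the helper reaches for a process-wide global. */
static struct demo_repo demo_global_repo = { 0 };

int dir_mode_global(void)
{
	return demo_global_repo.shared_repository ? 0770 : 0755;
}

/* After: the caller passes the repository the helper should consult. */
int dir_mode_for(const struct demo_repo *repo)
{
	assert(repo != NULL);
	return repo->shared_repository ? 0770 : 0755;
}
```

The second form can serve several repositories in one process, since nothing in it depends on which repository happens to be the global one.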

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15 08:24:35 -07:00
Patrick Steinhardt
d1fa670de0 object-file: move mkdir_in_gitdir() into "path.c"
The `mkdir_in_gitdir()` function is similar to `safe_create_dir()`, but
the former is hosted in "object-file.c" whereas the latter is hosted in
"path.c". The latter code unit makes way more sense though as the logic
has nothing to do with object files in particular.

Move the function into "path.c". While at it, we:

  - Rename the function to `safe_create_dir_in_gitdir()` so that the
    function names are similar to one another.

  - Remove the dependency on `the_repository` by making the callers pass
    the repository instead.

Adjust callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15 08:24:34 -07:00
Jean-Noël Avila
1d5378a8c4 doc: convert git-mv to new documentation format
- Switch the synopsis to a synopsis block which will automatically
  format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
  descriptions. The new rendering engine will apply synopsis rules to
  these spans.

Unfortunately, there's an inconsistency in the synopsis style, where
the ellipsis is used to indicate that the option can be repeated, but
it can also be used in Git's three-dot notation to indicate a range of
commits. The rendering engine will not be able to distinguish
between these two cases.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-14 14:43:52 -07:00
Jean-Noël Avila
8d34d3379f doc: move synopsis git-mv commands in the synopsis section
This also entails changing the help output for the command to match the new
synopsis.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-14 14:43:52 -07:00