Commit Graph

13285 Commits

Author SHA1 Message Date
Patrick Steinhardt
bd1855b897 odb: rename FOR_EACH_OBJECT_* flags
Rename the `FOR_EACH_OBJECT_*` flags to have an `ODB_` prefix. This
prepares us for a new upcoming `odb_for_each_object()` function and
ensures that both the function and its flags have the same prefix.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-26 08:26:06 -08:00
Junio C Hamano
ec16dde5c8 Merge branch 'ps/packfile-store-in-odb-source' into ps/odb-for-each-object
* ps/packfile-store-in-odb-source:
  packfile: move MIDX into packfile store
  packfile: refactor `find_pack_entry()` to work on the packfile store
  packfile: inline `find_kept_pack_entry()`
  packfile: only prepare owning store in `packfile_store_prepare()`
  packfile: only prepare owning store in `packfile_store_get_packs()`
  packfile: move packfile store into object source
  packfile: refactor misleading code when unusing pack windows
  packfile: refactor kept-pack cache to work with packfile stores
  packfile: pass source to `prepare_pack()`
  packfile: create store via its owning source
  odb: properly close sources before freeing them
  builtin/gc: fix condition for whether to write commit graphs
2026-01-15 05:50:16 -08:00
Junio C Hamano
c8e1706e8d Merge branch 'ps/read-object-info-improvements' into ps/odb-for-each-object
* ps/read-object-info-improvements:
  packfile: drop repository parameter from `packed_object_info()`
  packfile: skip unpacking object header for disk size requests
  packfile: disentangle return value of `packed_object_info()`
  packfile: always populate pack-specific info when reading object info
  packfile: extend `is_delta` field to allow for "unknown" state
  packfile: always declare object info to be OI_PACKED
  object-file: always set OI_LOOSE when reading object info
2026-01-15 05:47:47 -08:00
Patrick Steinhardt
12d3b58b55 packfile: drop repository parameter from packed_object_info()
The function `packed_object_info()` takes a packfile and offset and
returns the object info for the corresponding object. Despite these two
parameters though it also takes a repository pointer. This is redundant
information though, as `struct packed_git` already has a repository
pointer that is always populated.

Drop the redundant parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 06:51:15 -08:00
Junio C Hamano
3235ef374e Merge branch 'rs/commit-stack'
Code clean-up, unifying various hand-rolled "list of commit
objects" and use the commit_stack API.

* rs/commit-stack:
  commit-reach: use commit_stack
  commit-graph: use commit_stack
  commit: add commit_stack_grow()
  shallow: use commit_stack
  pack-bitmap-write: use commit_stack
  commit: add commit_stack_init()
  test-reach: use commit_stack
  remote: use commit_stack for src_commits
  remote: use commit_stack for sent_tips
  remote: use commit_stack for local_commits
  name-rev: use commit_stack
  midx: use commit_stack
  log: use commit_stack
  revision: export commit_stack
2026-01-12 05:19:52 -08:00
Patrick Steinhardt
8384cbcb4c packfile: only prepare owning store in packfile_store_prepare()
When calling `packfile_store_prepare()` we prepare not only the provided
packfile store, but also all those of all other sources part of the same
object database. This was required when the store was still sitting on
the object database level. But now that it sits on the source level it's
not anymore.

Refactor the code so that we only prepare the single packfile store
passed by the caller. Adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:40:07 -08:00
Patrick Steinhardt
84f0e60b28 packfile: move packfile store into object source
The packfile store is a member of `struct object_database`, which means
that we have a single store per database. This doesn't really make much
sense though: each source connected to the database has its own set of
packfiles, so there is a conceptual mismatch here. This hasn't really
caused much of a problem in the past, but with the advent of pluggable
object databases this is becoming more of a problem because some of the
sources may not even use packfiles in the first place.

Move the packfile store down by one level from the object database into
the object database source. This ensures that each source now has its
own packfile store, and we can eventually start to abstract it away
entirely so that the caller doesn't even know what kind of store it
uses.

Note that we only need to adjust a relatively small number of callers,
way less than one might expect. This is because most callers are using
`repo_for_each_pack()`, which handles enumeration of all packfiles that
exist in the repository. So for now, none of these callers need to be
adapted. The remaining callers that iterate through the packfiles
directly and that need adjustment are those that are a bit more tangled
with packfiles. These will be adjusted over time.

Note that this patch only moves the packfile store, and there is still a
bunch of functions that seemingly operate on a packfile store but that
end up iterating over all sources. These will be adjusted in subsequent
commits.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:40:07 -08:00
Patrick Steinhardt
085de91b95 packfile: refactor kept-pack cache to work with packfile stores
The kept pack cache is a cache of packfiles that are marked as kept
either via an accompanying ".kept" file or via an in-memory flag. The
cache can be retrieved via `kept_pack_cache()`, where one needs to pass
in a repository.

Ultimately though the kept-pack cache is a property of the packfile
store, and this causes problems in a subsequent commit where we want to
move down the packfile store to be a per-object-source entity.

Prepare for this and refactor the kept-pack cache to work on top of a
packfile store instead. While at it, rename both the function and flags
specific to the kept-pack cache so that they can be properly attributed
to the respective subsystems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:40:06 -08:00
Junio C Hamano
d28d2be5f2 Merge branch 'rs/tag-wo-the-repository'
Code clean-up.

* rs/tag-wo-the-repository:
  tag: stop using the_repository
  tag: support arbitrary repositories in parse_tag()
  tag: support arbitrary repositories in gpg_verify_tag()
  tag: use algo of repo parameter in parse_tag_buffer()
2026-01-08 16:40:11 +09:00
Junio C Hamano
f1ec43d4d2 Merge branch 'ps/odb-misc-fixes' into ps/packfile-store-in-odb-source
* ps/odb-misc-fixes:
  odb: properly close sources before freeing them
  builtin/gc: fix condition for whether to write commit graphs
2026-01-07 09:37:29 +09:00
Patrick Steinhardt
b3449b1517 builtin/gc: fix condition for whether to write commit graphs
When performing auto-maintenance we check whether commit graphs need to
be generated by counting the number of commits that are reachable by any
reference, but not covered by a commit graph. This search is performed
by iterating through all references and then doing a depth-first search
until we have found enough commits that are not present in the commit
graph.

This logic has a memory leak though:

  Direct leak of 16 byte(s) in 1 object(s) allocated from:
      #0 0x55555562e433 in malloc (git+0xda433)
      #1 0x555555964322 in do_xmalloc ../wrapper.c:55:8
      #2 0x5555559642e6 in xmalloc ../wrapper.c:76:9
      #3 0x55555579bf29 in commit_list_append ../commit.c:1872:35
      #4 0x55555569f160 in dfs_on_ref ../builtin/gc.c:1165:4
      #5 0x5555558c33fd in do_for_each_ref_iterator ../refs/iterator.c:431:12
      #6 0x5555558af520 in do_for_each_ref ../refs.c:1828:9
      #7 0x5555558ac317 in refs_for_each_ref ../refs.c:1833:9
      #8 0x55555569e207 in should_write_commit_graph ../builtin/gc.c:1188:11
      #9 0x55555569c915 in maintenance_is_needed ../builtin/gc.c:3492:8
      #10 0x55555569b76a in cmd_maintenance ../builtin/gc.c:3542:9
      #11 0x55555575166a in run_builtin ../git.c:506:11
      #12 0x5555557502f0 in handle_builtin ../git.c:779:9
      #13 0x555555751127 in run_argv ../git.c:862:4
      #14 0x55555575007b in cmd_main ../git.c:984:19
      #15 0x5555557523aa in main ../common-main.c:9:11
      #16 0x7ffff7a2a4d7 in __libc_start_call_main (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a4d7) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #17 0x7ffff7a2a59a in __libc_start_main@GLIBC_2.2.5 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a59a) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #18 0x5555555f0934 in _start (git+0x9c934)

The root cause of this memory leak is our use of `commit_list_append()`.
This function expects as parameters the item to append and the _tail_ of
the list to append. This tail will then be overwritten with the new tail
of the list so that it can be used in subsequent calls. But we call it
with `commit_list_append(parent->item, &stack)`, so we end up losing
everything but the new item.

This issue only surfaces when counting merge commits. Next to being a
memory leak, it also shows that we're in fact miscounting as we only
respect children of the last parent. All previous parents are discarded,
so their children will be disregarded unless they are hit via another
reference.

While crafting a test case for the issue I was puzzled that I couldn't
establish the proper border at which the auto-condition would be
fulfilled. As it turns out, there's another bug: if an object is at the
tip of any reference we don't mark it as seen. Consequently, if it is
the tip of or reachable via another ref, we'd count that object multiple
times.

Fix both of these bugs so that we properly count objects without leaking
any memory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:16:50 +09:00
Junio C Hamano
f406b89552 Merge branch 'ar/run-command-hook'
Use hook API to replace ad-hoc invocation of hook scripts with the
run_command() API.

* ar/run-command-hook:
  receive-pack: convert receive hooks to hook API
  receive-pack: convert update hooks to new API
  hooks: allow callers to capture output
  run-command: allow capturing of collated output
  hook: allow overriding the ungroup option
  reference-transaction: use hook API instead of run-command
  transport: convert pre-push to hook API
  hook: convert 'post-rewrite' hook in sequencer.c to hook API
  hook: provide stdin via callback
  run-command: add stdin callback for parallelization
  run-command: add first helper for pp child states
2026-01-06 16:33:53 +09:00
Junio C Hamano
1627809eef Merge branch 'rs/show-branch-prio-queue'
Code clean-up.

* rs/show-branch-prio-queue:
  show-branch: use prio_queue
2026-01-06 16:33:52 +09:00
Junio C Hamano
8fb86e1a42 Merge branch 'bc/checkout-error-message-fix'
Message fix.

* bc/checkout-error-message-fix:
  checkout: quote invalid treeish in error message
2026-01-06 16:33:52 +09:00
Junio C Hamano
02e9bc3392 Merge branch 'jt/repo-struct-more-objinfo'
More object database related information are shown in "git repo
structure" output.

* jt/repo-struct-more-objinfo:
  builtin/repo: add object disk size info to structure table
  builtin/repo: add disk size info to keyvalue stucture output
  builtin/repo: add inflated object info to structure table
  builtin/repo: add inflated object info to keyvalue structure output
  builtin/repo: humanise count values in structure output
  strbuf: split out logic to humanise byte values
  builtin/repo: group per-type object values into struct
2025-12-30 12:58:19 +09:00
René Scharfe
b6e4cc8c32 tag: support arbitrary repositories in parse_tag()
Allow callers of parse_tag() pass in the repository to use.  Let most of
them pass in the_repository to get the same result as before.  One of
them has stopped using the_repository in ef9b0370da (sha1-name.c: store
and use repo in struct disambiguate_state, 2019-04-16); let it pass in
its stored repository.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-29 22:02:54 +09:00
René Scharfe
154717b3b0 tag: support arbitrary repositories in gpg_verify_tag()
Allow callers of gpg_verify_tag() specify the repository to use by
providing a parameter for that.  One of the two has not been using
the_repository since 43a8391977 (builtin/verify-tag: stop using
`the_repository`, 2025-03-08); let it pass in the correct repository.
The other simply passes the_repository to get the same result as before.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-29 22:02:53 +09:00
Emily Shaffer
c65f26fca4 receive-pack: convert receive hooks to hook API
This converts the last remaining hooks to the new hook API, for
the same benefits as the previous conversions (no need to toggle
signals, manage custom struct child_process, call find_hook(),
prepares for specifyinig hooks via configs, etc.).

I noticed a performance degradation when processing large amounts
of hook input with just 1 line per callback, due to run-command's
poll loop, therefore I batched 500 lines per callback, to ensure
similar pipe throughput as before and to avoid hook child waiting
on stdin.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-28 14:02:07 +09:00
Emily Shaffer
0bbaf3653f receive-pack: convert update hooks to new API
Use the new hook sideband API introduced in the previous commit.

The hook API avoids creating a custom struct child_process and other
internal hook plumbing (e.g. calling find_hook()) and prepares for
the specification of hooks via configs or running parallel hooks.

Execution is still sequential through the current hook.[ch] via the
run_process_parallel_opts.processes=1 arg.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-28 14:02:07 +09:00
Adrian Ratiu
857f047e40 hook: allow overriding the ungroup option
When calling run_process_parallel() in run_hooks_opt(), the
ungroup option is currently hardcoded to .ungroup = 1.

This causes problems when ungrouping should be disabled, for
example when sideband-reading collated output from child hooks,
because sideband-reading and ungrouping are mutually exclusive.

Thus a new hook.h option is added to allow overriding.

The existing ungroup=1 behavior is preserved in the run_hooks()
API and the "hook run" command. We could modify these to take
an option if necessary, so I added two code comments there.

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-28 14:02:07 +09:00
René Scharfe
abf05d856f show-branch: use prio_queue
Building a list using commit_list_insert_by_date() has quadratic worst
case complexity.  Avoid it by using prio_queue.

Use prio_queue_peek()+prio_queue_replace() instead of prio_queue_get()+
prio_queue_put() if possible, as the former only rebalance the
prio_queue heap once instead of twice.

In sane repositories this won't make much of a difference because the
number of items in the list or queue won't be very high:

Benchmark 1: ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo
  Time (mean ± σ):     538.2 ms ±   0.8 ms    [User: 527.6 ms, System: 9.6 ms]
  Range (min … max):   537.0 ms … 539.2 ms    10 runs

Benchmark 2: ./git show-branch origin/main origin/next origin/seen origin/todo
  Time (mean ± σ):     530.6 ms ±   0.4 ms    [User: 519.8 ms, System: 9.8 ms]
  Range (min … max):   530.1 ms … 531.3 ms    10 runs

Summary
  ./git show-branch origin/main origin/next origin/seen origin/todo ran
    1.01 ± 0.00 times faster than ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo

That number is not limited, though, and in pathological cases like the
one in p6010 we see a sizable improvement:

Test                      v2.52.0           HEAD
------------------------------------------------------------------
6010.4: git show-branch   2.19(2.19+0.00)   0.03(0.02+0.00) -98.6%

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-28 14:01:23 +09:00
René Scharfe
d78039cd50 name-rev: use commit_stack
Simplify the code by using commit_stack instead of open-coding it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-25 08:29:27 +09:00
René Scharfe
052efdd60f log: use commit_stack
Calling commit_stack_push() to add commits is simpler and more efficient
than using REALLOC_ARRAY.  Calling commit_stack_pop() to consume them in
LIFO order is also a tad simpler than calculating the array index from
the end.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-25 08:29:27 +09:00
brian m. carlson
93f894c001 checkout: quote invalid treeish in error message
We received a report that invoking "git restore -source my_base_branch"
resulted in the confusing error message "fatal: could not resolve
ource".  This looked like a typo in our error message, but it is
actually because "-source" is missing its second dash and is being
resolved as "-s ource".  However, due to the lack of the quoting
recommended in CodingGuidelines, this is confusing to the reader and
we can do better.

Add the necessary quoting to this message.  With this change, we now get
this less confusing message:

    fatal: could not resolve 'ource'

Reported-by: Zhelyo Zhelev <zhelyo@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-25 08:27:22 +09:00
Junio C Hamano
d8000781eb Merge branch 'kn/fix-fetch-backfill-tag-with-batched-ref-updates'
"git fetch" that involves fetching tags, when a tag being fetched
needs to overwrite existing one, failed to fetch other tags, which
has been corrected.

* kn/fix-fetch-backfill-tag-with-batched-ref-updates:
  fetch: fix failed batched updates skipping operations
  fetch: fix non-conflicting tags not being committed
  fetch: extract out reference committing logic
2025-12-23 11:33:17 +09:00
Junio C Hamano
396df67739 Merge branch 'tc/memzero-array'
MEMZERO_ARRAY() helper is introduced to avoid clearing only the
first N bytes of an N-element array whose elements are larger than
a byte.

* tc/memzero-array:
  contrib/coccinelle: pass include paths to spatch(1)
  git-compat-util: introduce MEMZERO_ARRAY() macro
2025-12-23 11:33:16 +09:00
Junio C Hamano
00bf98b16e Merge branch 'jc/submodule-add'
"git submodule add" to add a submodule under <name> segfaulted,
when a submodule.<name>.something is already in .gitmodules file
without defining where its submodule.<name>.path is, which has been
corrected.

* jc/submodule-add:
  submodule add: sanity check existing .gitmodules
2025-12-23 11:33:15 +09:00
Junio C Hamano
bcc20b8304 Merge branch 'kj/pull-options-decl-cleanup'
Code clean-up.

* kj/pull-options-decl-cleanup:
  pull: move options[] array into function scope
2025-12-22 14:57:49 +09:00
Junio C Hamano
24a51fef5b Merge branch 'rs/replay-wrong-onto-fix'
"git replay --onto=<commit> ...", when <commit> is mistyped,
started to segfault with recent change, which has been corrected.

* rs/replay-wrong-onto-fix:
  replay: move onto NULL check before first use
2025-12-22 14:57:48 +09:00
Junio C Hamano
6a3051d3c2 Merge branch 'kh/doc-replay-updates'
"git replay" documentation updates.

* kh/doc-replay-updates:
  doc: replay: link section using markup
  replay: improve --contained and add to doc
  doc: replay: mention no output on conflicts
2025-12-22 14:57:48 +09:00
Justin Tobler
df1b071fed builtin/repo: add object disk size info to structure table
Similar to a prior commit, update the table output format for the
git-repo(1) structure command to display the total object disk usage by
object type.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-18 09:02:32 +09:00
Justin Tobler
67cecc693f builtin/repo: add disk size info to keyvalue stucture output
Similar to a prior commit, extend the keyvalue and nul output formats of
the git-repo(1) structure command to additionally provide info regarding
total object disk sizes by object type.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-18 09:02:32 +09:00
Justin Tobler
4d279ae36b builtin/repo: add inflated object info to structure table
Update the table output format for the git-repo(1) structure command to
begin printing the total inflated object size info by object type. To be
more human-friendly, larger values are scaled down and displayed with
the appropriate unit prefix. Output for the keyvalue and nul formats
remains unchanged.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-18 09:02:31 +09:00
Justin Tobler
3e114496e4 builtin/repo: add inflated object info to keyvalue structure output
The structure subcommand for git-repo(1) outputs basic count information
for objects and references. Extend this output to also provide
information regarding total size of inflated objects by object type.

For now, object size by object type info is only added to the keyvalue
and nul output formats. In a subsequent commit, this info is also added
to the table format.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-18 09:02:31 +09:00
Justin Tobler
54731320cc builtin/repo: humanise count values in structure output
The table output format for the git-repo(1) structure subcommand is used
by default and intended to provide output to users in a human-friendly
manner. When the reference/object count values in a repository are
large, it becomes more cumbersome for users to read the values.

For larger values, update the table output format to instead produce
more human-friendly count values that are scaled down with the
appropriate unit prefix. Output for the keyvalue and nul formats remains
unchanged.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-18 09:02:31 +09:00
Justin Tobler
9faaf254ba builtin/repo: group per-type object values into struct
The `object_stats` structure stores object counts by type. In a
subsequent commit, additional per-type object measurements will also be
stored. Group per-type object values into a new struct to allow better
reuse.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-18 09:02:31 +09:00
Junio C Hamano
dbe54273a7 Merge branch 'ps/object-read-stream'
The "git_istream" abstraction has been revamped to make it easier
to interface with pluggable object database design.

* ps/object-read-stream:
  streaming: drop redundant type and size pointers
  streaming: move into object database subsystem
  streaming: refactor interface to be object-database-centric
  streaming: move logic to read packed objects streams into backend
  streaming: move logic to read loose objects streams into backend
  streaming: make the `odb_read_stream` definition public
  streaming: get rid of `the_repository`
  streaming: rely on object sources to create object stream
  packfile: introduce function to read object info from a store
  streaming: move zlib stream into backends
  streaming: create structure for filtered object streams
  streaming: create structure for packed object streams
  streaming: create structure for loose object streams
  streaming: create structure for in-core object streams
  streaming: allocate stream inside the backend-specific logic
  streaming: explicitly pass packfile info when streaming a packed object
  streaming: propagate final object type via the stream
  streaming: drop the `open()` callback function
  streaming: rename `git_istream` into `odb_read_stream`
2025-12-16 11:08:34 +09:00
Junio C Hamano
f1799202ea Merge branch 'ps/object-read-stream' into ps/packfile-store-in-odb-source
* ps/object-read-stream:
  streaming: drop redundant type and size pointers
  streaming: move into object database subsystem
  streaming: refactor interface to be object-database-centric
  streaming: move logic to read packed objects streams into backend
  streaming: move logic to read loose objects streams into backend
  streaming: make the `odb_read_stream` definition public
  streaming: get rid of `the_repository`
  streaming: rely on object sources to create object stream
  packfile: introduce function to read object info from a store
  streaming: move zlib stream into backends
  streaming: create structure for filtered object streams
  streaming: create structure for packed object streams
  streaming: create structure for loose object streams
  streaming: create structure for in-core object streams
  streaming: allocate stream inside the backend-specific logic
  streaming: explicitly pass packfile info when streaming a packed object
  streaming: propagate final object type via the stream
  streaming: drop the `open()` callback function
  streaming: rename `git_istream` into `odb_read_stream`
2025-12-15 17:40:31 +09:00
Junio C Hamano
affdbe41bd Merge branch 'lo/repo-struct-z'
"git repo struct" learned to take "-z" as a synonym to "--format=nul".

* lo/repo-struct-z:
  repo: add -z as an alias for --format=nul to git-repo-structure
  repo: use [--format=... | -z] instead of [-z] in git-repo-info synopsis
  repo: remove blank line from Documentation/git-repo.adoc
2025-12-14 17:04:37 +09:00
Junio C Hamano
2378ebcb58 Merge branch 'kh/advise-w-git-help-in-branch'
A help message from "git branch" now mentions "git help" instead of
"man" when suggesting to read some documentation.

* kh/advise-w-git-help-in-branch:
  branch: advice using git-help(1) instead of man(1)
2025-12-14 17:04:37 +09:00
Junio C Hamano
21787077bf Merge branch 'js/last-modified-with-sparse-checkouts'
"git last-modified" used to mishandle "--" to mark the beginning of
pathspec, which has been corrected.

* js/last-modified-with-sparse-checkouts:
  last-modified: support sparse checkouts
2025-12-14 17:04:37 +09:00
Junio C Hamano
794c979889 Merge branch 'tc/last-modified-active-paths-optimization'
Recent optimization to "last-modified" command introduced use of
uninitialized block of memory, which has been corrected.

* tc/last-modified-active-paths-optimization:
  last-modified: fix use of uninitialized memory
2025-12-14 17:04:36 +09:00
Kristoffer Haugsbakk
03d7c9c457 replay: improve --contained and add to doc
There is no documentation for `--contained`.

Start by copying the text from `replay_options` in `builtin/
replay.c`. But some people think that the existing text is a
bit unclear; what does it mean for a branch to be contained
in a revision range? Let’s include the implied commits here:
the branches that point at commits in the range.

Also use “update” instead of “advance”. “Update” is the verb
commonly used in this context.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-14 15:56:02 +09:00
K Jayatheerth
bab391761d pull: move options[] array into function scope
Unless there are good reasons, it is customary to have the options[]
array used with the parse-options API declared in function scope rather
than at file scope.

Move builtin/pull.c:cmd_pull()’s options[] array into the function to
match that convention.

Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-12 22:08:02 +09:00
René Scharfe
a4a77e41fa replay: move onto NULL check before first use
cmd_replay() aborts if the pointer "onto" is NULL after argument
parsing, e.g. when specifying a non-existing commit with --onto.
15cd4ef1f4 (replay: make atomic ref updates the default behavior,
2025-11-06) added code that dereferences this pointer before the check.
Switch their places to avoid a segmentation fault.

Reported-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-12 12:41:26 +09:00
Junio C Hamano
8cb4a11438 Merge branch 'sa/replay-atomic-ref-updates' into rs/replay-wrong-onto-fix
* sa/replay-atomic-ref-updates:
  replay: add replay.refAction config option
  replay: make atomic ref updates the default behavior
  replay: use die_for_incompatible_opt2() for option validation
2025-12-12 12:41:17 +09:00
Toon Claes
a67b902c94 git-compat-util: introduce MEMZERO_ARRAY() macro
Introduce a new macro MEMZERO_ARRAY() that zeroes the memory allocated
by ALLOC_ARRAY() and friends. And add coccinelle rule to enforce the use
of this macro.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11 14:44:43 +09:00
Karthik Nayak
b7b17ec8a6 fetch: fix failed batched updates skipping operations
Fix a regression introduced with batched updates in 0e358de64a (fetch:
use batched reference updates, 2025-05-19) when fetching references. In
the `do_fetch()` function, we jump to cleanup if committing the
transaction fails, regardless of whether using batched or atomic
updates. This skips three subsequent operations:

  - Update 'FETCH_HEAD' as part of `commit_fetch_head()`.

  - Add upstream tracking information via `set_upstream()`.

  - Setting remote 'HEAD' values when `do_set_head` is true.

For atomic updates, this is expected behavior. For batched updates,
we want to continue with these operations even if some refs fail to
update.

Skipping `commit_fetch_head()` isn't actually a regression because
'FETCH_HEAD' is already updated via `append_fetch_head()` when not
using '--atomic'. However, we add a test to validate this behavior.

Skipping the other two operations (upstream tracking and remote HEAD)
is a regression. Fix this by only jumping to cleanup when using
'--atomic', allowing batched updates to continue with post-fetch
operations. Add tests to prevent future regressions.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-10 20:59:58 +09:00
Karthik Nayak
8ff2eef8ad fetch: fix non-conflicting tags not being committed
The commit 0e358de64a (fetch: use batched reference updates, 2025-05-19)
updated the 'git-fetch(1)' command to use batched updates. This batches
updates to gain performance improvements. When fetching references, each
update is added to the transaction. Finally, when committing, individual
updates are allowed to fail with reason, while the transaction itself
succeeds.

One scenario which was missed here, was fetching tags. When fetching
conflicting tags, the `fetch_and_consume_refs()` function returns '1',
which skipped committing the transaction and directly jumped to the
cleanup section. This mean that no updates were applied. This also
extends to backfilling tags which is done when fetching specific
refspecs which contains tags in their history.

Fix this by committing the transaction when we have an error code and
not using an atomic transaction. This ensures other references are
applied even when some updates fail.

The cleanup section is reached with `retcode` set in several scenarios:

   - `truncate_fetch_head()`, `open_fetch_head()` and `prune_refs()` set
     `retcode` before the transaction is created, so no commit is
     attempted.

   - `fetch_and_consume_refs()` and `backfill_tags()` are the primary
     cases this fix targets, both setting a positive `retcode` to
     trigger the committing of the transaction.

This simplifies error handling and ensures future modifications to
`do_fetch()` don't need special handling for batched updates.

Add tests to check for this regression. While here, add a missing
cleanup from previous test.

Reported-by: David Bohman <debohman@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-10 20:59:58 +09:00
Junio C Hamano
bbefa15ff5 Merge branch 'en/replay-doc-revision-range'
The use of "revision" (a connected set of commits) has been
clarified in the "git replay" documentation.

* en/replay-doc-revision-range:
  Documentation/git-replay.adoc: fix errors around revision range
2025-12-09 07:54:56 +09:00