global/git

mirror of https://github.com/git/git.git synced 2026-03-31 20:10:10 +02:00

Author	SHA1	Message	Date
shejialuo	3473d18fad	fsck: add a unified interface for reporting fsck messages The static function "report" provided by "fsck.c" aims at checking error type and calling the callback "error_func" to report the message. Both refs and objects need to check the error type of the current fsck message. In order to extract this common behavior, create a new function "fsck_vreport". Instead of using "...", provide "va_list" to allow more flexibility. Instead of changing "report" prototype to be align with the "fsck_vreport" function, we leave the "report" prototype unchanged due to the reason that there are nearly 62 references about "report" function. Simply change "report" function to use "fsck_vreport" to report objects related messages. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:36:52 -07:00
shejialuo	0ec5dfe8c4	fsck: make "fsck_error" callback generic The "fsck_error" callback is designed to report the objects-related error messages. It accepts two parameter "oid" and "object_type" which is not generic. In order to provide a unified callback which can report either objects or refs, remove the objects-related parameters and add the generic parameter "void *fsck_report". Create a new "fsck_object_report" structure which incorporates the removed parameters "oid" and "object_type". Then change the corresponding references to adapt to new "fsck_error" callback. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:36:52 -07:00
shejialuo	8cd4a447b8	fsck: rename objects-related fsck error functions The names of objects-related fsck error functions are generic. It's OK when there is only object database check. However, we are going to introduce refs database check report function. To avoid ambiguity, rename object-related fsck error functions to explicitly indicate these functions are used to report objects-related messages. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:36:52 -07:00
shejialuo	2d79aa9095	fsck: rename "skiplist" to "skip_oids" The "skiplist" field in "fsck_options" is related to objects. Because we are going to introduce ref consistency check, the "skiplist" name is too general which will make the caller think "skiplist" is related to both the refs and objects. It may seem that for both refs and objects, we should provide a general "skiplist" here. However, the type for "skiplist" is `struct oidset` which is totally unsuitable for refs. To avoid above ambiguity, rename "skiplist" to "skip_oids". Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:36:52 -07:00
Patrick Steinhardt	6f1e9394e2	object: fix leaking packfiles when closing object store When calling `raw_object_store_clear()`, we close and free several resources associated with the object store. Part of that is to close and free all the packfiles, which is handled by `close_object_store()`. That function really only ends up closing the packfiles though, but it doesn't free them. And in fact it can't, as that function is being called via `run_command()` when `close_object_store = 1`, which is done e.g. when we execute git-maintenance(1). At that point, other structures may still have references on those packfiles, and thus we cannot free them here. So while it is in fact intentional that we really only close them, the result is a memory leak because `raw_object_store_clear()` does not free them, either. Fix the leak by freeing the packfiles in `raw_object_store_clear()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:22:21 -07:00
Patrick Steinhardt	fa0f27a19d	submodule: fix leaking seen submodule names We keep track of submodules we have already seen via a string map such that we don't process the same submodule twice. We never free that map though, causing a memory leak. Fix this leak by clearing the map. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:22:21 -07:00
Patrick Steinhardt	1a7e5efdb0	submodule: fix leaking fetch tasks When done with a fetch task used for parallel fetches of submodules, we need to both call `fetch_task_release()` to release the task's contents and `free()` to release the task itself. Most sites do this already, but some only call `fetch_task_release()` and thus leak memory. While we could trivially fix this by adding the two missing calls to free(3P), the result would be that we always call both functions. Let's thus refactor the code such that `fetch_task_release()` also frees the structure itself. Rename it to `fetch_task_free()` accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:22:21 -07:00
Patrick Steinhardt	c369fc46d0	builtin/submodule: allow "add" to use different ref storage format Same as with "clone", users may want to add a submodule to a repository with a non-default ref storage format. Wire up a new `--ref-format=` option that works the same as for `git submodule clone`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:22:21 -07:00
Patrick Steinhardt	fb99dded31	refs: fix ref storage format for submodule ref stores When opening a submodule ref storage we accidentally use the ref storage format of the owning repository, not of the submodule repository. As submodules may have a different storage format than their parent repo this can lead to bugs when trying to access the submodule ref storage from the parent repository. One such bug was reported when performing a recursive pull with mixed ref stores, which fails with: $ git pull --recursive fatal: Unable to find current revision in submodule path 'path/to/sub' The same issue occurs when adding a repository contained in the working tree with a different ref storage format via `git submodule add`. Fix the bug by using the submodule repository's ref storage format instead and add some tests. Note that the test for `git submodule status` was included as a precaution, only. The command worked alright even without the bugfix. Reported-by: Jeppe Øland <joland@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:22:21 -07:00
Patrick Steinhardt	69814846ab	builtin/clone: propagate ref storage format to submodules When recursively cloning a repository with a non-default ref storage format, e.g. by passing the `--ref-format=` option, then only the top-level repository will end up using that ref storage format, and all recursively cloned submodules will instead use the default format. While mixed-format constellations are expected to work alright, the outcome still is somewhat surprising as we have essentially ignored the user's request. Fix this by propagating the requested ref format to cloned submodules. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:21:39 -07:00
Patrick Steinhardt	5ac781ad62	builtin/submodule: allow cloning with different ref storage format As submodules are proper self-contained repositories, it is perfectly valid for them to have a different ref storage format than their parent repository. There is no obvious way for users to ask for the ref storage format when initializing submodules though. Whether the setup of such mixed-ref-storage-format constellations is all that useful remains to be seen. But there is no good reason to not expose such an option, and we will require it in a subsequent patch. Introduce a new `--ref-format=` option for git-submodule(1) that allows the user to pick the ref storage format. This option will also be used in a subsequent commit, where we start to propagate the same flag from git-clone(1) to cloning submodules with the `--recursive` switch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:20:49 -07:00
Patrick Steinhardt	d9ab8788e1	git-submodule.sh: break overly long command lines For most of the subcommands of git-submodule(1), we end up passing a bunch of arguments to the submodule helper. This quickly leads to overly long lines, where it becomes hard to spot what has changed when one needs to modify them. Break up these lines into one argument per line, similarly to how it is done for the "clone" subcommand already. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:20:48 -07:00
Patrick Steinhardt	6ce8ffe30e	transport: mark more tests leak-free After fixing a transport leak, a few more tests have become leak-free. Mark them as such. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:16:21 -07:00
Junio C Hamano	448d51d549	transport: fix leak with transport helper URLs Transport URLs can be prefixed with "foo::", which would tell us that the transport uses a remote helper called "foo". We extract the helper name by `xstrndup()`ing the prefix before the double-colons, but never free that string. Fix this leak by assigning the result to a separate local variable that we can then free upon returning. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-07 17:38:31 -07:00
Junio C Hamano	92a29c2c39	Merge branch 'ps/refs-wo-the-repository' into ps/config-wo-the-repository * ps/refs-wo-the-repository: refs/reftable: stop using `the_repository` refs/packed: stop using `the_repository` refs/files: stop using `the_repository` refs/files: stop using `the_repository` in `parse_loose_ref_contents()` refs: stop using `the_repository`	2024-08-07 14:13:20 -07:00
Junio C Hamano	90b801d8ff	Merge branch 'ps/leakfixes-part-3' into ps/leakfixes-part-4 * ps/leakfixes-part-3: (24 commits) commit-reach: fix trivial memory leak when computing reachability convert: fix leaking config strings entry: fix leaking pathnames during delayed checkout object-name: fix leaking commit list items t/test-repository: fix leaking repository builtin/credential-cache: fix trivial leaks builtin/worktree: fix leaking derived branch names builtin/shortlog: fix various trivial memory leaks builtin/rerere: fix various trivial memory leaks builtin/credential-store: fix leaking credential builtin/show-branch: fix several memory leaks builtin/rev-parse: fix memory leak with `--parseopt` builtin/stash: fix various trivial memory leaks builtin/remote: fix various trivial memory leaks builtin/remote: fix leaking strings in `branch_list` builtin/ls-remote: fix leaking `pattern` strings builtin/submodule--helper: fix leaking buffer in `is_tip_reachable` builtin/submodule--helper: fix leaking clone depth parameter builtin/name-rev: fix various trivial memory leaks builtin/describe: fix trivial memory leak when describing blob ...	2024-08-06 12:40:41 -07:00
Taylor Blau	fcb2205b77	midx: implement support for writing incremental MIDX chains Now that the rest of the MIDX subsystem and relevant callers have been updated to learn about how to read and process incremental MIDX chains, let's finally update the implementation in `write_midx_internal()` to be able to write incremental MIDX chains. This new feature is available behind the `--incremental` option for the `multi-pack-index` builtin, like so: $ git multi-pack-index write --incremental The implementation for doing so is relatively straightforward, and boils down to a handful of different kinds of changes implemented in this patch: - The `compute_sorted_entries()` function is taught to reject objects which appear in any existing MIDX layer. - Functions like `write_midx_revindex()` are adjusted to write pack_order values which are offset by the number of objects in the base MIDX layer. - The end of `write_midx_internal()` is adjusted to move non-incremental MIDX files when necessary (i.e. when creating an incremental chain with an existing non-incremental MIDX in the repository). There are a handful of other changes that are introduced, like new functions to clear incremental MIDX files that are unrelated to the current chain (using the same "keep_hash" mechanism as in the non-incremental case). The tests explicitly exercising the new incremental MIDX feature are relatively limited for two reasons: 1. Most of the "interesting" behavior is already thoroughly covered in t5319-multi-pack-index.sh, which handles the core logic of reading objects through a MIDX. The new tests in t5334-incremental-multi-pack-index.sh are mostly focused on creating and destroying incremental MIDXs, as well as stitching their results together across layers. 2. A new GIT_TEST environment variable is added called "GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL", which modifies the entire test suite to write incremental MIDXs after repacking when combined with the "GIT_TEST_MULTI_PACK_INDEX" variable. This exercises the long tail of other interesting behavior that is defined implicitly throughout the rest of the CI suite. It is likewise added to the linux-TEST-vars job. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:39 -07:00
Taylor Blau	147c3f6740	t/t5313-pack-bounds-checks.sh: prepare for sub-directories Prepare for sub-directories to appear in $GIT_DIR/objects/pack by adjusting the copy, remove, and chmod invocations to perform their behavior recursively. This prepares us for the new $GIT_DIR/objects/pack/multi-pack-index.d directory which will be added in a following commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:39 -07:00
Taylor Blau	9552c3595a	t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' Two years ago, commit `ff1e653c8e` (midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP', 2021-08-31) introduced a new environment variable which caused the test suite to write MIDX bitmaps after any 'git repack' invocation. At the time, this was done to help flush out any bugs with MIDX bitmaps that weren't explicitly covered in the t5326-multi-pack-bitmap.sh script. Two years later, that flag has served us well and is no longer providing meaningful coverage, as the script in t5326 has matured substantially and covers many more interesting cases than it did back when `ff1e653c8e` was originally written. Remove the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable as it is no longer serving a useful purpose. More importantly, removing this variable clears the way for us to introduce a new one to help similarly flush out bugs related to incremental MIDX chains. Because these incremental MIDX chains are (for now) incompatible with MIDX bitmaps, we cannot have both. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:38 -07:00
Taylor Blau	3592796d0a	midx: implement verification support for incremental MIDXs Teach the verification implementation used by `git multi-pack-index verify` to perform verification for incremental MIDX chains by independently validating each layer within the chain. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:38 -07:00
Taylor Blau	b80236d0e3	midx: support reading incremental MIDX chains Now that the MIDX machinery's internals have been taught to understand incremental MIDXs over the previous handful of commits, the MIDX machinery itself can begin reading incremental MIDXs. (Note that while the on-disk format for incremental MIDXs has been defined, the writing end has not been implemented. This will take place in the commit after next.) The core of this change involves following the order specified in the MIDX chain in reverse and opening up MIDXs in the chain one-by-one, adding them to the previous layer's `->base_midx` pointer at each step. In order to implement this, the `load_multi_pack_index()` function is taught to call a new `load_multi_pack_index_chain()` function if loading a non-incremental MIDX failed via `load_multi_pack_index_one()`. When loading a MIDX chain, `load_midx_chain_fd_st()` reads each line in the file one-by-one and dispatches calls to `load_multi_pack_index_one()` to read each layer of the MIDX chain. When a layer was successfully read, it is added to the MIDX chain by calling `add_midx_to_chain()` which validates the contents of the `BASE` chunk, performs some bounds checks on the number of combined packs and objects, and attaches the new MIDX by assigning its `base_midx` pointer to the existing part of the chain. As a supplement to this, introduce a new mode in the test-read-midx test-tool which allows us to read the information for a specific MIDX in the chain by specifying its trailing checksum via the command-line arguments like so: $ test-tool read-midx .git/objects [checksum] Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:38 -07:00
Taylor Blau	97fd770ea1	midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs The function `midx_fanout_add_midx_fanout()` is used to help construct the fanout table when generating a MIDX by reusing data from an existing MIDX. Prepare this function to work with incremental MIDXs by making a few changes: - The bounds checks need to be adjusted to start object lookups taking into account the number of objects in the previous MIDX layer (i.e., by starting the lookups at position `m->num_objects_in_base` instead of position 0). - Likewise, the bounds checks need to end at `m->num_objects_in_base` objects after `m->num_objects`. - Finally, `midx_fanout_add_midx_fanout()` needs to recur on earlier MIDX layers when dealing with an incremental MIDX chain by calling itself when given a MIDX with a non-NULL `base_midx`. Note that after `0c5a62f14b` (midx-write.c: do not read existing MIDX with `packs_to_include`, 2024-06-11), we do not use this function with an existing MIDX (incremental or not) when generating a MIDX with --stdin-packs, and likewise for incremental MIDXs. But it is still used when adding the fanout table from an incremental MIDX when generating a non-incremental MIDX (without --stdin-packs, of course). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:38 -07:00
Taylor Blau	b31f2aac56	midx: teach `midx_preferred_pack()` about incremental MIDXs The function `midx_preferred_pack()` is used to determine the identity of the preferred pack, which is the identity of a unique pack within the MIDX which is used as a tie-breaker when selecting from which pack to represent an object that appears in multiple packs within the MIDX. Historically we have said that the MIDX's preferred pack has the unique property that all objects from that pack are represented in the MIDX. But that isn't quite true: a more precise statement would be that all objects from that pack which appear in the MIDX are selected from that pack. This helps us extend the concept of preferred packs across a MIDX chain, where some object(s) in the preferred pack may appear in other packs in an earlier MIDX layer, in which case those object(s) will not appear in a subsequent MIDX layer from either the preferred pack or any other pack. Extend the concept of preferred packs by using the pack which represents the object at the first position in MIDX pseudo-pack order belonging to the current MIDX layer (i.e., at position 'm->num_objects_in_base'). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:38 -07:00
Taylor Blau	853165c50a	midx: teach `midx_contains_pack()` about incremental MIDXs Now that the `midx_contains_pack()` versus `midx_locate_pack()` debacle has been cleaned up, teach the former about how to operate in an incremental MIDX-aware world in a similar fashion as in previous commits. Instead of using either of the two `midx_for_object()` or `midx_for_pack()` helpers, this function is split into two: one that determines whether a pack is contained in a single MIDX, and another which calls the former in a loop over all MIDXs. This approach does not require that we change any of the implementation in what is now `midx_contains_pack_1()` as it still operates over a single MIDX. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:37 -07:00
Taylor Blau	5d0ee3f675	midx: remove unused `midx_locate_pack()` Commit `307d75bbe6` (midx: implement `midx_locate_pack()`, 2023-12-14) introduced `midx_locate_pack()`, which was described at the time as a complement to the function `midx_contains_pack()` which allowed callers to determine where in the MIDX lexical order a pack appeared, as opposed to whether or not it was simply contained. `307d75bbe6` suggests that future patches would be added which would introduce callers for this new function, but none ever were, meaning the function has gone unused since its introduction. Clean this up by in effect reverting `307d75bbe6`, which removes the unused functions and inlines its definition back into `midx_contains_pack()`. (Looking back through the list archives when `307d75bbe6` was written, this was in preparation for this[1] patch from back when we had the concept of "disjoint" packs while developing multi-pack verbatim reuse. That concept was abandoned before the series was merged, but I never dropped what would become `307d75bbe6` from the series, leading to the state prior to this commit). [1]: https://lore.kernel.org/git/3019738b52ba8cd78ea696a3b800fa91e722eb66.1701198172.git.me@ttaylorr.com/ Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:37 -07:00
Taylor Blau	3b00e35108	midx: teach `fill_midx_entry()` about incremental MIDXs In a similar fashion as previous commits, teach the `fill_midx_entry()` function to work in a incremental MIDX-aware fashion. This function, unlike others which accept an index into either the lexical order of objects or packs, takes in an object_id, and attempts to fill a caller-provided 'struct pack_entry' with the remaining pieces of information about that object from the MIDX. The function uses `bsearch_midx()` which fills out the frame-local 'pos' variable, recording the given object_id's lexical position within the MIDX chain, if found (if no matching object ID was found, we'll return immediately without filling out the `pack_entry` structure). Once given that position, we jump back through the `->base_midx` pointer to ensure that our `m` points at the MIDX layer which contains the given object_id (and not an ancestor or descendant of it in the chain). Note that we can drop the bounds check "if (pos >= m->num_objects)" because `midx_for_object()` performs this check for us. After that point, we only need to make two special considerations within this function: - First, the pack_int_id returned to us by `nth_midxed_pack_int_id()` is a position in the concatenated lexical order of packs, so we must ensure that we subtract `m->num_packs_in_base` before accessing the MIDX-local `packs` array. - Second, we must avoid translating the `pos` back to a MIDX-local index, since we use it as an argument to `nth_midxed_offset()` which expects a position relative to the concatenated lexical order of objects. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:37 -07:00
Taylor Blau	df7ede83be	midx: teach `nth_midxed_offset()` about incremental MIDXs In a similar fashion as in previous commits, teach the function `nth_midxed_offset()` about incremental MIDXs. The given object `pos` is used to find the containing MIDX, and translated back into a MIDX-local position by assigning the return value of `midx_for_object()` to it. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:37 -07:00
Taylor Blau	88f309e095	midx: teach `bsearch_midx()` about incremental MIDXs Now that the special cases callers of `bsearch_midx()` have been dealt with, teach `bsearch_midx()` to handle incremental MIDX chains. The incremental MIDX-aware version of `bsearch_midx()` works by repeatedly searching for a given OID in each layer along the `->base_midx` pointer, stopping either when an exact match is found, or the end of the chain is reached. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:37 -07:00
Taylor Blau	3f5f1cff92	midx: introduce `bsearch_one_midx()` The `bsearch_midx()` function will be extended in a following commit to search for the location of a given object ID across all MIDXs in a chain (or the single non-chain MIDX if no chain is available). While most callers will naturally want to use the updated `bsearch_midx()` function, there are a handful of special cases that will want finer control and will only want to search through a single MIDX. For instance, the object abbreviation code, which cares about object IDs near to where we'd expect to find a match in a MIDX. In that case, we want to look at the nearby matches in each layer of the MIDX chain, not just a single one). Split the more fine-grained control out into a separate function called `bsearch_one_midx()` which searches only a single MIDX. At present both `bsearch_midx()` and `bsearch_one_midx()` have identical behavior, but the following commit will rewrite the former to be aware of incremental MIDXs for the remaining non-special case callers. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:36 -07:00
Taylor Blau	60750e1eb9	midx: teach `nth_bitmapped_pack()` about incremental MIDXs In a similar fashion as in previous commits, teach the function `nth_bitmapped_pack()` about incremental MIDXs by translating the given `pack_int_id` from the concatenated lexical order to a MIDX-local lexical position. When accessing the containing MIDX's array of packs, use the local pack ID. Likewise, when reading the 'BTMP' chunk, use the MIDX-local offset when accessing the data within that chunk. (Note that the both the call to prepare_midx_pack() and the assignment of bp->pack_int_id both care about the global pack_int_id, so avoid shadowing the given 'pack_int_id' parameter). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:36 -07:00
Taylor Blau	26afb5afa1	midx: teach `nth_midxed_object_oid()` about incremental MIDXs The function `nth_midxed_object_oid()` returns the object ID for a given object position in the MIDX lexicographic order. Teach this function to instead operate over the concatenated lexicographic order defined in an earlier step so that it is able to be used with incremental MIDXs. To do this, we need to both (a) adjust the bounds check for the given 'n', as well as record the MIDX-local position after chasing the `->base_midx` pointer to find the MIDX which contains that object. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:36 -07:00
Taylor Blau	1820bd878c	midx: teach `prepare_midx_pack()` about incremental MIDXs The function `prepare_midx_pack()` is part of the midx.h API and loads the pack identified by the MIDX-local 'pack_int_id'. This patch prepares that function to be aware of an incremental MIDX world. To do this, introduce the second of the two general purpose helpers mentioned in the previous commit. This commit introduces `midx_for_pack()`, which is the pack-specific analog of `midx_for_object()`, and works in the same fashion. Like `midx_for_object()`, this function chases down the '->base_midx' field until it finds the MIDX layer within the chain that contains the given pack. Use this function within `prepare_midx_pack()` so that the `pack_int_id` it expects is now relative to the entire MIDX chain, and that it prepares the given pack in the appropriate MIDX. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:36 -07:00
Taylor Blau	19419821ba	midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs The function `nth_midxed_pack_int_id()` takes in a object position in MIDX lexicographic order and returns an identifier of the pack from which that object was selected in the MIDX. Currently, the given object position is an index into the lexicographic order of objects in a single MIDX. Change this position to instead refer into the concatenated lexicographic order of all MIDXs in a MIDX chain. This has two visible effects within the implementation of `prepare_midx_pack()`: - First, the given position is now an index into the concatenated lexicographic order of all MIDXs in the order in which they appear in the MIDX chain. - Second the pack ID returned from this function is now also in the concatenated order of packs among all layers of the MIDX chain in the same order that they appear in the MIDX chain. To do this, introduce the first of two general purpose helpers, this one being `midx_for_object()`. `midx_for_object()` takes a double pointer to a `struct multi_pack_index` as well as an object `pos` in terms of the entire MIDX chain[^1]. The function chases down the '->base_midx' field until it finds the MIDX layer within the chain that contains the given object. It then: - modifies the double pointer to point to the containing MIDX, instead of the tip of the chain, and - returns the MIDX-local position[^2] at which the given object can be found. Use this function within `nth_midxed_pack_int_id()` so that the `pos` it expects is now relative to the entire MIDX chain, and that it returns the appropriate pack position for that object. [^1]: As a reminder, this means that the object is identified among the objects contained in all layers of the incremental MIDX chain, not any particular layer. For example, consider MIDX chain with two individual MIDXs, one with 4 objects and another with 3 objects. If the MIDX with 4 objects appears earlier in the chain, then asking for object 6 would return the second object in the MIDX with 3 objects. [^2]: Building on the previous example, asking for object 6 in a MIDX chain with (4, 3) objects, respectively, this would set the double pointer to point at the MIDX containing three objects, and would return an index to the second object within that MIDX. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:36 -07:00
Taylor Blau	2678a73009	midx: add new fields for incremental MIDX chains The incremental MIDX chain feature is designed around the idea of indexing into a concatenated lexicographic ordering of object IDs present in the MIDX. When given an object position, the MIDX machinery needs to be able to locate both (a) which MIDX layer contains the given object, and (b) at what position within that MIDX layer that object appears. To do this, three new fields are added to the `struct multi_pack_index`: - struct multi_pack_index *base_midx; - uint32_t num_objects_in_base; - uint32_t num_packs_in_base; These three fields store the pieces of information suggested by their respective field names. In turn, the `num_objects_in_base` and `num_packs_in_base` fields are used to crawl backwards along the `base_midx` pointer to locate the appropriate position for a given object within the MIDX that contains it. The following commits will update various parts of the MIDX machinery (as well as their callers from outside of midx.c and midx-write.c) to be aware and make use of these fields when performing object lookups. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:36 -07:00
Taylor Blau	6eb1a7d7b0	Documentation: describe incremental MIDX format Prepare to implement incremental multi-pack indexes (MIDXs) over the next several commits by first describing the relevant prerequisites (like a new chunk in the MIDX format, the directory structure for incremental MIDXs, etc.) The format is described in detail in the patch contents below, but the high-level description is as follows. Incremental MIDXs live in $GIT_DIR/objects/pack/multi-pack-index.d, and each `*.midx` within that directory has a single "parent" MIDX, which is the MIDX layer immediately before it in the MIDX chain. The chain order resides in a file 'multi-pack-index-chain' in the same directory. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 12:01:35 -07:00
Junio C Hamano	6caa96c204	t3206: test_when_finished before dirtying operations, not after Many existing tests in this script perform operation(s) and then use test_when_finished to define how to undo the effect of the operation(s). This is backwards. When your operation(s) fail before you manage to successfully call test_when_finished (remember, that these commands must be all &&-chained, so a failure of an earlier operation mean your test_when_finished may not be executed at all). You must establish how to clean up your mess with test_when_finished before you create the mess to be cleaned up. Also make sure that the body of test_when_finished deals with case where the cruft it wants to remove failed to be created, by using "rm -f" (instead of "rm") to remove potential cruft files, and having "\|\| :" after "git notes remove" to remove potential cruft notes---both of these by default fail when asked to remove something that does not exist, instead of being silently idempotent no-ops. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 10:05:05 -07:00
Ghanshyam Thakkar	3469a23659	t: port helper/test-hashmap.c to unit-tests/t-hashmap.c helper/test-hashmap.c along with t0011-hashmap.sh test the hashmap.h library. Migrate them to the unit testing framework for better debugging, runtime performance and concise code. Along with the migration, make 'add' tests from the shell script order agnostic in unit tests, since they iterate over entries with the same keys and we do not guarantee the order. This was already done for the 'iterate' tests[1]. The helper/test-hashmap.c is still not removed because it contains a performance test meant to be run by the user directly (not used in t/perf). And it makes sense for such a utility to be a helper. [1]: `e1e7a77141` (t: sort output of hashmap iteration, 2019-07-30) Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Helped-by: Josh Steadmon <steadmon@google.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-06 09:25:54 -07:00
Taylor Blau	ac91586ae5	t/t7704-repack-cruft.sh: avoid failures during long-running tests On systems where running t7704.09 takes longer than 10 seconds, the test can fail. The test works by doing the following: - First write three unreachable objects, backdating the mtime for a single object ($foo) which we expect to prune. - Repack the repository into a pack containing reachable objects, and another three cruft packs, each containing one of the objects written in the previous step. - Backdate the mtimes of the cruft pack *.mtimes files themselves. (Note that this does not affect what is pruned further down in the test, but is done to ensure that the cruft packs are rewritten during that step). - Then repack with --cruft-expiration=10.seconds.ago, expecting to prune one of the three unreachable objects written in the first step. - Assert that the surviving cruft packs were rewritten, object $foo is pruned, and unreachable objects $bar, and $baz remain in the repository. If longer than 10 seconds pass between writing the three unreachable objects (the first step) and the "git repack --cruft" (the fourth step), we will mistakenly prune more objects than expected, causing the test to fail. The $foo object which we expect to prune has its mtime set back to 10,000 seconds relative to the current time, but we prune it with a cutoff of 10.seconds.ago. Instead, set the cutoff to be 1,000 seconds to give the test much longer time to run without failing. This helps platforms where running individual tests can perform slowly, on my machine this test runs much more quickly: $ hyperfine './t7704-repack-cruft.sh --run=9' Benchmark 1: ./t7704-repack-cruft.sh --run=9 Time (mean ± σ): 647.4 ms ± 30.7 ms [User: 528.5 ms, System: 124.1 ms] Range (min … max): 594.1 ms … 696.5 ms 10 runs Reported-by: Randall Becker <randall.becker@nexbridge.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-05 12:44:54 -07:00
Kyle Lippincott	ec60bb9fc4	t6421: fix test to work when repo dir contains d0 The `grep` statement in this test looks for `d0.*<string>`, attempting to filter to only show lines that had tabular output where the 2nd column had `d0` and the final column had a substring of [`git -c `]`fetch.negotiationAlgorithm`. These lines also have `child_start` in the 4th column, but this isn't part of the condition. A subsequent line will have `d1` in the 2nd column, `start` in the 4th column, and `/path/to/git/git -c fetch.negotiationAlgorihm` in the final column. If `/path/to/git/git` contains the substring `d0`, then this line is included by `grep` as well as the desired line, leading to an effective doubling of the number of lines, and test failures. Tighten the grep expression to require `d0` to be surrounded by spaces, and to have the `child_start` label. Signed-off-by: Kyle Lippincott <spectral@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-05 10:59:21 -07:00
Kyle Lippincott	b928d57ca9	set errno=0 before strtoX calls To detect conversion failure after calls to functions like `strtod`, one can check `errno == ERANGE`. These functions are not guaranteed to set `errno` to `0` on successful conversion, however. Manual manipulation of `errno` can likely be avoided by checking that the output pointer differs from the input pointer, but that's not how other locations, such as parse.c:139, handle this issue; they set errno to 0 prior to executing the function. For every place I could find a strtoX function with an ERANGE check following it, set `errno = 0;` prior to executing the conversion function. Signed-off-by: Kyle Lippincott <spectral@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-05 10:59:20 -07:00
René Scharfe	0c4d5aa22d	log-tree: use decimal_width() Reduce code duplication by calling decimal_width() to count the digits in the number of commits instead of calculating it locally. It also has the advantage of returning int, which is the exact type expected by the printf()-like function strbuf_addf() for field width arguments. Additionally, decimal_width() supports numbers bigger than 1410065407, which is (hopefully) just a theoretical advantage. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-05 08:59:40 -07:00
Sven Strickroth	e2e373ba82	refs/files: prevent memory leak by freeing packed_ref_store This complements `64a6dd8ffc` (refs: implement removal of ref storages, 2024-06-06). Signed-off-by: Sven Strickroth <email@cs-ware.de> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-05 08:58:41 -07:00
Jeff King	e95d515141	apply: canonicalize modes read from patches Git stores only canonical modes for blobs. So for a regular file, we care about only "100644" or "100755" (depending only on the executable bit), but never modes where the group or other permissions are more exotic. So never "100664", "100700", etc. When a file in the working tree has such a mode, we quietly turn it into one of the two canonical modes, and that's what is stored both in the index and in tree objects. However, we don't canonicalize modes we read from incoming patches in git-apply. These may appear in a few lines: - "old mode" / "new mode" lines for mode changes - "new file mode" lines for newly created files - "deleted file mode" for removing files For "new mode" and for "new file mode", this is harmless. The patch is asking the result to have a certain mode, but: - when we add an index entry (for --index or --cached), it is canonicalized as we create the entry, via create_ce_mode(). - for a working tree file, try_create_file() passes either 0777 or 0666 to open(), so what you get depends only on your umask, not any other bits (aside from the executable bit) in the original mode. However, for "old mode" and "deleted file mode", there is a minor annoyance. We compare the patch's expected preimage mode with the current state. But that current state is always going to be a canonical mode itself: - updating an index entry via --cached will have the canonical mode in the index - for updating a working tree file, check_preimage() runs the mode through ce_mode_from_stat(), which does the usual canonicalization So if the patch feeds a non-canonical mode, it's impossible for it to match, and we will always complain with something like: file has type 100644, expected 100664 Since this is just a warning, the operation proceeds, but it's confusing and annoying. These cases should be pretty rare in practice. Git would never produce a patch with non-canonical modes itself (since it doesn't store them). And while we do accept patches from other programs, all of those lines were invented by Git. So you'd need a program trying to be Git compatible, but not handling canonicalization the same way. Reportedly "quilt" is such a program. We should canonicalize the modes as we read them so that the user never sees the useless warning. A few notes on the tests: - I've covered instances of all lines for completeness, even though the "new mode" / "new file mode" ones behave OK currently. - the tests apply patches to both the index and working tree, and check the result of both. Again, we know that all of these paths canonicalize anyway, but it's giving us extra coverage (although we are even less likely to have such a bug now since we canonicalize up front). - the test patches are missing "index" lines, which is also something Git would never produce. But they don't matter for the test, they do match the case from quilt we saw in the wild, and they avoid some sha1/sha256 complexity. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-05 08:50:43 -07:00
Chandra Pratap	3a498b49d1	t-reftable-tree: improve the test for infix_walk() In the current testing setup for infix_walk(), the following properties of an infix traversal of a tree remain untested: - every node of the tree must be visited - every node must be visited exactly once In fact, only the property 'traversal in increasing order' is tested. Modify test_infix_walk() to check for all the properties above. This can be achieved by storing the nodes' keys linearly, in a nullified buffer, as we visit them and then checking the input keys against this buffer in increasing order. By checking that the element just after the last input key is 'NULL' in the output buffer, we ensure that every node is traversed exactly once. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-04 09:50:27 -07:00
Chandra Pratap	c70022c1b9	t-reftable-tree: add test for non-existent key In the current testing setup for tree_search(), the case for non-existent key is not exercised. Improve this by adding a test-case for the same. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-04 09:50:27 -07:00
Chandra Pratap	abf1a96773	t-reftable-tree: split test_tree() into two sub-test functions In the current testing setup, tests for both tree_search() and infix_walk() defined by reftable/tree.{c, h} are performed by a single test function, test_tree(). Split tree_test() into test_tree_search() and test_infix_walk() responsible for independently testing tree_search() and infix_walk() respectively. This improves the overall readability of the test file as well as simplifies debugging. Note that the last parameter in the tree_search() functiom is 'int insert' which when set, inserts the key if it is not found in the tree. Otherwise, the function returns NULL for such cases. While at it, use 'func' to pass function pointers and not '&func'. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-04 09:50:26 -07:00
Chandra Pratap	ec9c0704fc	t: move reftable/tree_test.c to the unit testing framework reftable/tree_test.c exercises the functions defined in reftable/tree.{c, h}. Migrate reftable/tree_test.c to the unit testing framework. Migration involves refactoring the tests to use the unit testing framework instead of reftable's test framework and renaming the tests to align with unit-tests' standards. Also add a comment to help understand the test routine. Note that this commit mostly moves the test from reftable/ to t/unit-tests/ and most of the refactoring is performed by the trailing commits. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-04 09:50:26 -07:00
Chandra Pratap	e5a0f7076f	reftable: remove unnecessary curly braces in reftable/tree.c According to Documentation/CodingGuidelines, single-line control-flow statements must omit curly braces (except for some special cases). Make reftable/tree.c adhere to this guideline. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-04 09:50:18 -07:00
Emily Shaffer	d53db106e0	Documentation: add platform support policy Supporting many platforms is only possible when we have the right tools to ensure that support. Teach platform maintainers how they can help us to help them, by explaining what kind of tooling support we would like to have, and what level of support becomes available as a result. Provide examples so that platform maintainers can see what we're asking for in practice. With this policy in place, we can make changes with stronger assurance that we are not breaking anybody we promised not to. Instead, we can feel confident that our existing testing and integration practices protect those who care from breakage. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-02 16:27:15 -07:00
Patrick Steinhardt	91d351ec88	refs: drop `ref_store`-less functions In `c8f815c208` (refs: remove functions without ref store, 2024-05-07), we have removed functions of the refs subsystem that do not take a ref store as input parameter. In order to make it easier for folks to figure out how to replace calls to such functions in in-flight patch series, we kept their definitions around in an ifdeffed block. Now that Git v2.46 is out, it is rather unlikely that anybody still has references to these old functions in their unreleased patches. Let's thus drop them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-02 08:54:32 -07:00

... 21 22 23 24 25 ...

75368 Commits