global/git

mirror of https://github.com/git/git.git synced 2026-03-12 18:09:46 +01:00

Author	SHA1	Message	Date
Junio C Hamano	10aa7c74a2	Merge branch 'gt/unit-test-oidtree' into ps/use-the-repository * gt/unit-test-oidtree: t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c	2024-06-13 09:39:46 -07:00
Junio C Hamano	092b33da2b	Merge branch 'ps/ref-storage-migration' into ps/use-the-repository * ps/ref-storage-migration: builtin/refs: new command to migrate ref storage formats refs: implement logic to migrate between ref storage formats refs: implement removal of ref storages worktree: don't store main worktree twice reftable: inline `merged_table_release()` refs/files: fix NULL pointer deref when releasing ref store refs/files: extract function to iterate through root refs refs/files: refactor `add_pseudoref_and_head_entries()` refs: allow to skip creation of reflog entries refs: pass storage format to `ref_store_init()` explicitly refs: convert ref storage format to an enum setup: unset ref storage when reinitializing repository version	2024-06-13 09:39:08 -07:00
Junio C Hamano	d63586cb31	The thirteenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-12 13:37:18 -07:00
Junio C Hamano	2a061a62e2	Merge branch 'gt/decorate-unit-test' A test helper that essentially is unit tests on the "decorate" logic has been rewritten using the unit-tests framework. * gt/decorate-unit-test: t/: migrate helper/test-example-decorate to the unit testing framework	2024-06-12 13:37:18 -07:00
Junio C Hamano	51ea70c18a	Merge branch 'jk/sparse-leakfix' Many memory leaks in the sparse-checkout code paths have been plugged. * jk/sparse-leakfix: sparse-checkout: free duplicate hashmap entries sparse-checkout: free string list after displaying sparse-checkout: free pattern list in sparse_checkout_list() sparse-checkout: free sparse_filename after use sparse-checkout: refactor temporary sparse_checkout_patterns sparse-checkout: always free "line" strbuf after reading input sparse-checkout: reuse --stdin buffer when reading patterns dir.c: always copy input to add_pattern() dir.c: free removed sparse-pattern hashmap entries sparse-checkout: clear patterns when init() sees existing sparse file dir.c: free strings in sparse cone pattern hashmaps sparse-checkout: pass string literals directly to add_pattern() sparse-checkout: free string list in write_cone_to_file()	2024-06-12 13:37:17 -07:00
Junio C Hamano	c2f79440ac	Merge branch 'jk/cap-exclude-file-size' An overly large ".gitignore" files are now rejected silently. * jk/cap-exclude-file-size: dir.c: reduce max pattern file size to 100MB dir.c: skip .gitignore, etc larger than INT_MAX	2024-06-12 13:37:17 -07:00
Junio C Hamano	b8bdb2f283	Merge branch 'jc/safe-directory-leading-path' The safe.directory configuration knob has been updated to optionally allow leading path matches. * jc/safe-directory-leading-path: safe.directory: allow "lead/ing/path/*" match	2024-06-12 13:37:16 -07:00
Junio C Hamano	22cf18fd9e	Merge branch 'gt/t-hash-unit-test' A pair of test helpers that essentially are unit tests on hash algorithms have been rewritten using the unit-tests framework. * gt/t-hash-unit-test: t/: migrate helper/test-{sha1, sha256} to unit-tests/t-hash strbuf: introduce strbuf_addstrings() to repeatedly add a string	2024-06-12 13:37:15 -07:00
Junio C Hamano	56346ba24e	Merge branch 'cp/reftable-unit-test' Basic unit tests for reftable have been reimplemented under the unit test framework. * cp/reftable-unit-test: t: improve the test-case for parse_names() t: add test for put_be16() t: move tests from reftable/record_test.c to the new unit test t: move tests from reftable/stack_test.c to the new unit test t: move reftable/basics_test.c to the unit testing framework	2024-06-12 13:37:14 -07:00
Junio C Hamano	a39e28ace7	Merge branch 'jc/t1517-more' A new test was added to ensure git commands that are designed to run outside repositories do work. * jc/t1517-more: imap-send: minimum leakfix t1517: more coverage for commands that work without repository	2024-06-12 13:37:14 -07:00
Ghanshyam Thakkar	ed54840872	t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c helper/test-oidtree.c along with t0069-oidtree.sh test the oidtree.h library, which is a wrapper around crit-bit tree. Migrate them to the unit testing framework for better debugging and runtime performance. Along with the migration, add an extra check for oidtree_each() test, which showcases how multiple expected matches can be given to check_each() helper. To achieve this, introduce a new library called 'lib-oid.h' exclusively for the unit tests to use. It currently mainly includes utility to generate object_id from an arbitrary hex string (i.e. '12a' -> '12a0000000000000000000000000000000000000'). This also handles the hash algo selection based on GIT_TEST_DEFAULT_HASH. This library will also be helpful when we port other unit tests such as oid-array, oidset etc. Helped-by: Junio C Hamano <gitster@pobox.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> [jc: small fixlets squashed in] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-12 13:33:20 -07:00
Junio C Hamano	8d94cfb545	The twelfth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-10 10:30:39 -07:00
Junio C Hamano	5235e56ea5	Merge branch 'jk/leakfixes' Memory leaks in "git mv" has been plugged. * jk/leakfixes: mv: replace src_dir with a strvec mv: factor out empty src_dir removal mv: move src_dir cleanup to end of cmd_mv() t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL t-strvec: use va_end() to match va_start()	2024-06-10 10:30:39 -07:00
Junio C Hamano	718b50e3bf	Merge branch 'iw/trace-argv-on-alias' The alias-expanded command lines are logged to the trace output. * iw/trace-argv-on-alias: run-command: show prepared command Documentation: alias: add notes on shell expansion Documentation: alias: rework notes into points	2024-06-10 10:30:38 -07:00
Junio C Hamano	1b76f06508	Merge branch 'tb/midx-write-cleanup' Code clean-up around writing the .midx files. * tb/midx-write-cleanup: pack-bitmap.c: reimplement `midx_bitmap_filename()` with helper midx: replace `get_midx_rev_filename()` with a generic helper midx-write.c: support reading an existing MIDX with `packs_to_include` midx-write.c: extract `fill_packs_from_midx()` midx-write.c: extract `should_include_pack()` midx-write.c: pass `start_pack` to `compute_sorted_entries()` midx-write.c: reduce argument count for `get_sorted_entries()` midx-write.c: tolerate `--preferred-pack` without bitmaps	2024-06-07 10:57:23 -07:00
Junio C Hamano	cd77e87115	The eleventh batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 12:49:25 -07:00
Junio C Hamano	9d8e7d2ef7	Merge branch 'mt/openindiana-scalar' Avoid removing the $(cwd) for portability. * mt/openindiana-scalar: scalar: make enlistment delete to work on all POSIX platforms	2024-06-06 12:49:25 -07:00
Junio C Hamano	df5c2c4962	Merge branch 'rs/difftool-env-simplify' Code simplification. * rs/difftool-env-simplify: difftool: add env vars directly in run_file_diff()	2024-06-06 12:49:24 -07:00
Junio C Hamano	d11b0c75ec	Merge branch 'th/quiet-lazy-fetch-from-promisor' The promisor.quiet configuration knob can be set to true to make lazy fetching from promisor remotes silent. * th/quiet-lazy-fetch-from-promisor: promisor-remote: add promisor.quiet configuration option	2024-06-06 12:49:24 -07:00
Junio C Hamano	cf792653ad	Merge branch 'ps/leakfixes' Leakfixes. * ps/leakfixes: builtin/mv: fix leaks for submodule gitfile paths builtin/mv: refactor to use `struct strvec` builtin/mv duplicate string list memory builtin/mv: refactor `add_slash()` to always return allocated strings strvec: add functions to replace and remove strings submodule: fix leaking memory for submodule entries commit-reach: fix memory leak in `ahead_behind()` builtin/credential: clear credential before exit config: plug various memory leaks config: clarify memory ownership in `git_config_string()` builtin/log: stop using globals for format config builtin/log: stop using globals for log config convert: refactor code to clarify ownership of check_roundtrip_encoding diff: refactor code to clarify memory ownership of prefixes config: clarify memory ownership in `git_config_pathname()` http: refactor code to clarify memory ownership checkout: clarify memory ownership in `unique_tracking_name()` strbuf: fix leak when `appendwholeline()` fails with EOF transport-helper: fix leaking helper name	2024-06-06 12:49:23 -07:00
Patrick Steinhardt	25a0023f28	builtin/refs: new command to migrate ref storage formats Introduce a new command that allows the user to migrate a repository between ref storage formats. This new command is implemented as part of a new git-refs(1) executable. This is due to two reasons: - There is no good place to put the migration logic in existing commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is not the correct place to put it, either. - I had it in my mind to create a new low-level command for accessing refs for quite a while already. git-refs(1) is that command and can over time grow more functionality relating to refs. This should help discoverability by consolidating low-level access to refs into a single executable. As mentioned in the preceding commit that introduces the ref storage format migration logic, the new `git refs migrate` command still has a bunch of restrictions. These restrictions are documented accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:34 -07:00
Patrick Steinhardt	6d6a3a99c7	refs: implement logic to migrate between ref storage formats With the introduction of the new "reftable" backend, users may want to migrate repositories between the backends without having to recreate the whole repository. Add the logic to do so. The implementation is generic and works with arbitrary ref storage formats so that a backend does not need to implement any migration logic. It does have a few limitations though: - We do not migrate repositories with worktrees, because worktrees have separate ref storages. It makes the overall affair more complex if we have to migrate multiple storages at once. - We do not migrate reflogs, because we have no interfaces to write many reflog entries. - We do not lock the repository for concurrent access, and thus concurrent writes may end up with weird in-between states. There is no way to fully lock the "files" backend for writes due to its format, and thus we punt on this topic altogether and defer to the user to avoid those from happening. In other words, this version is a minimum viable product for migrating a repository's ref storage format. It works alright for bare repos, which often have neither worktrees nor reflogs. But it will not work for many other repositories without some preparations. These limitations are not set into stone though, and ideally we will eventually address them over time. The logic is not yet used by anything, and thus there are no tests for it. Those will be added in the next commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:33 -07:00
Patrick Steinhardt	64a6dd8ffc	refs: implement removal of ref storages We're about to introduce logic to migrate ref storages. One part of the migration will be to delete the files that are part of the old ref storage format. We don't yet have a way to delete such data generically across ref backends though. Implement a new `delete` callback and expose it via a new `ref_storage_delete()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:33 -07:00
Patrick Steinhardt	1339cb3c47	worktree: don't store main worktree twice In `get_worktree_ref_store()` we either return the repository's main ref store, or we look up the ref store via the map of worktree ref stores. Which of these worktrees gets picked depends on the `is_current` bit of the worktree, which indicates whether the worktree is the one that corresponds to `the_repository`. The bit is getting set in `get_worktrees()`, but only after we have computed the list of all worktrees. This is too late though, because at that time we have already called `get_worktree_ref_store()` on each of the worktrees via `add_head_info()`. The consequence is that the current worktree will not have been marked accordingly, which means that we did not use the main ref store, but instead created a new ref store. We thus have two separate ref stores now that map to the same ref database. Fix this by setting `is_current` before we call `add_head_info()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:33 -07:00
Patrick Steinhardt	b5d7db9e83	reftable: inline `merged_table_release()` The function `merged_table_release()` releases a merged table, whereas `reftable_merged_table_free()` releases a merged table and then also free's its pointer. But all callsites of `merged_table_release()` are in fact followed by `reftable_merged_table_free()`, which is redundant. Inline `merged_table_release()` into `reftable_merged_table_free()` to get rid of this redundance. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:32 -07:00
Patrick Steinhardt	b3e098d6e7	refs/files: fix NULL pointer deref when releasing ref store The `free_ref_cache()` function is not `NULL` safe and will thus segfault when being passed such a pointer. This can easily happen when trying to release a partially initialized "files" ref store. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:32 -07:00
Patrick Steinhardt	120b67172f	refs/files: extract function to iterate through root refs Extract a new function that can be used to iterate through all root refs known to the "files" backend. This will be used in the next commit, where we start to teach ref backends to remove themselves. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:32 -07:00
Patrick Steinhardt	66275a6311	refs/files: refactor `add_pseudoref_and_head_entries()` The `add_pseudoref_and_head_entries()` function accepts both the ref store as well as a directory name as input. This is unnecessary though as the ref store already uniquely identifies the root directory of the ref store anyway. Furthermore, the function is misnamed now that we have clarified the meaning of pseudorefs as it doesn't add pseudorefs, but root refs. Rename it accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:32 -07:00
Patrick Steinhardt	fbd1a693c7	refs: allow to skip creation of reflog entries The ref backends do not have any way to disable the creation of reflog entries. This will be required for upcoming ref format migration logic so that we do not create any entries that didn't exist in the original ref database. Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to disable reflog entry creation. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:31 -07:00
Patrick Steinhardt	6e1683ace9	refs: pass storage format to `ref_store_init()` explicitly We're about to introduce logic to migrate refs from one storage format to another one. This will require us to initialize a ref store with a different format than the one used by the passed-in repository. Prepare for this by accepting the desired ref storage format as parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:31 -07:00
Patrick Steinhardt	318efb966b	refs: convert ref storage format to an enum The ref storage format is tracked as a simple unsigned integer, which makes it harder than necessary to discover what that integer actually is or where its values are defined. Convert the ref storage format to instead be an enum. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:31 -07:00
Patrick Steinhardt	a83f7f51e1	setup: unset ref storage when reinitializing repository version When reinitializing a repository's version we may end up unsetting the hash algorithm when it matches the default hash algorithm. If we didn't do that then the previously configured value might remain intact. While the same issue exists for the ref storage extension, we don't do this here. This has been fine for most of the part because it is not supported to re-initialize a repository with a different ref storage format anyway. We're about to introduce a new command to migrate ref storages though, so this is about to become an issue there. Prepare for this and unset the ref storage format when reinitializing a repository with the "files" format. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:31 -07:00
Jeff King	6d107751b2	sparse-checkout: free duplicate hashmap entries In insert_recursive_pattern(), we create a new pattern_entry to insert into the parent_hashmap. If we find that the same entry already exists in the hashmap, we skip adding the new one. But we forget to free the new one, creating a leak. We can fix it by cleaning up the discarded entry. It would probably be possible to avoid creating it in the first place, but it's non-trivial. We'd have to define a "keydata" struct that lets us compare the existing entries to the broken-out fields. It's probably not worth the complexity, so we'll punt on that for now. There is one subtlety here: our insertion is happening in a loop, with each iteration looking at the pattern we just inserted (hence the "recursive" in the name). So if we skip insertion, what do we look at? The obvious answer is that we should remember the existing duplicate we found and use that. But I _think_ in that case, we probably already have all of the recursive bits already (from when the original entry was added). And so just breaking out of the loop would be correct. But I'm not 100% sure on that; after all, the original leaky code could have done the same break, but it didn't. So I went with the "obvious answer" above, which has no chance of changing the behavior aside from fixing the leak. With this patch, t1091 can now be marked leak-free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-05 09:51:43 -07:00
Jeff King	a544b7da2c	sparse-checkout: free string list after displaying In sparse_checkout_list(), we put the hashmap entries into a string_list so we can sort them. But after printing, we forget to free the list. This patch drops 5 leaks from t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-05 09:51:43 -07:00
Jeff King	521e04e6e8	sparse-checkout: free pattern list in sparse_checkout_list() In sparse_checkout_list(), we create a pattern_list that needs to eventually be cleared. We remember to do so in the regular code path, but the cone-mode path does an early return, and forgets to clean up. We could fix the leak by adding a new call to clear_pattern_list(). But we can simplify even further by just skipping the early return, pushing the other code path (which consists now of only one line!) into an else block. That also matches the same cone/non-cone if/else used in some other functions. This fixes 15 leaks found in t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-05 09:51:43 -07:00
Jeff King	008f59d2d6	sparse-checkout: free sparse_filename after use We allocate a heap buffer via get_sparse_checkout_filename(). Most calls remember to free it, but sparse_checkout_init() forgets to, causing a leak. Ironically, it remembers to do so in the error return paths, but not in the path that makes it all the way to the function end! Fixing this clears up 6 leaks from t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-05 09:51:43 -07:00
Jeff King	a14d49ca84	sparse-checkout: refactor temporary sparse_checkout_patterns In update_working_directory(), we take in a pattern_list, attach it to the repository index by assigning it to index->sparse_checkout_patterns, and then call unpack_trees. Afterwards, we remove it by setting index->sparse_checkout_patterns back to NULL. But there are two possible leaks here: 1. If the index already had a populated sparse_checkout_patterns, we've obliterated it. We can fix this by saving and restoring it, rather than always setting it back to NULL. 2. We may call the function with a NULL pattern_list, expecting it to use the on-disk sparse file. In that case, the index routines will lazy-load the sparse patterns automatically. But now at the end of the function when we restore the patterns, we'll leak those lazy-loaded ones! We can fix this by freeing the pattern list before overwriting its pointer whenever it does not match what was passed in (in practice this should only happen when the passed-in list is NULL, but this is erring on the defensive side). Together these remove 48 indirect leaks found in t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-05 09:51:43 -07:00
Jeff King	d765fa0331	sparse-checkout: always free "line" strbuf after reading input In add_patterns_from_input(), we may read lines from a file with a loop like this: while (!strbuf_getline(&line, file)) { ... strbuf_to_cone_pattern(&line, pl); } /* we don't strbuf_release(&line) here! */ This generally is OK because strbuf_to_cone_pattern() consumes the buffer via strbuf_detach(). But we can leak in a few cases: 1. We don't always consume the buffer! If the line ends up empty after trimming, we leave strbuf_to_cone_pattern() without detaching. In most cases this is OK, because a subsequent getline() call will use the same buffer. But if you had an empty line at the end of file, for example, it would leak. 2. Even if strbuf_to_cone_pattern() always consumed the buffer, there's a subtle issue with strbuf_getline(). As we saw in `94e2aa555e` (strbuf: fix leak when `appendwholeline()` fails with EOF, 2024-05-27), it's possible for it to return EOF with an allocated buffer (e.g., if the underlying getdelim() call saw an error). So we should always strbuf_release() after finishing a read loop like this. Note that even the code to read patterns from argv has the same problem. Because that also uses strbuf_to_cone_pattern(), we stuff each argv entry into a strbuf. It uses the same "line" strbuf as the getline code, but we should position the strbuf_release() to cover both code paths. This fixes at least 9 leaks found in t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-05 09:51:43 -07:00
Jeff King	c3324649ed	sparse-checkout: reuse --stdin buffer when reading patterns When we read patterns from --stdin, we loop on strbuf_getline(), and detach each line we read to pass into add_pattern(). This used to be necessary because add_pattern() required that the pattern strings remain valid while the pattern_list was in use. But it also created a leak, since we didn't record the detached buffers anywhere else. Now that add_pattern() has been modified to make its own copy of the strings, we can stop detaching and fix the leak. This fixes 4 leaks detected in t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-05 09:51:42 -07:00
Jeff King	eed1fbe73b	dir.c: always copy input to add_pattern() The add_pattern() function has a subtle and undocumented gotcha: the pattern string you pass in must remain valid as long as the pattern_list is in use (and nor do we take ownership of it). This is easy to get wrong, causing either subtle bugs (because you free or reuse the string buffer) or leaks (because you copy the string, but don't track ownership separately). All of this "pattern" code was originally the "exclude" mechanism. So this _usually_ works OK because you add entries in one of two ways: 1. From the command-line (e.g., "--exclude"), in which case we're pointing to an argv entry which remains valid for the lifetime of the program. 2. From a file (e.g., ".gitignore"), in which case we read the whole file into a buffer, attach it to the pattern_list's "filebuf" entry, then parse the buffer in-place (adding NULs). The strings point into the filebuf, which is cleaned up when the whole pattern_list goes away. But other code, like sparse-checkout, reads individual lines from stdin and passes them one by one to add_pattern(), leaking each. We could fix this by refactoring it to take in the whole buffer at once, like (2) above, and stuff it in "filebuf". But given how subtle the interface is, let's just fix it to always copy the string. That seems at first like we'd be wasting extra memory, but we can mitigate that: a. The path_pattern struct already uses a FLEXPTR, since we sometimes make a copy (when we see "foo/", we strip off the trailing slash, requiring a modifiable copy of the string). Since we'll now always embed the string inside the struct, we can switch to the regular FLEX_ARRAY pattern, saving us 8 bytes of pointer. So patterns with a trailing slash and ones under 8 bytes actually get smaller. b. Now that we don't need the original string to hang around, we can get rid of the "filebuf" mechanism entirely, and just free the file contents after parsing. Since files are the sources we'd expect to have the largest pattern sets, we should mostly break even on stuffing the same data into the individual structs. This patch just adjusts the add_pattern() interface; it doesn't fix any leaky callers yet. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-05 09:51:42 -07:00
Jeff King	e7c3d1ddba	dir.c: reduce max pattern file size to 100MB In `a2bc523e1e` (dir.c: skip .gitignore, etc larger than INT_MAX, 2024-05-31) we put capped the size of some files whose parsing code and data structures used ints. Setting the limit to INT_MAX was a natural spot, since we know the parsing code would misbehave above that. But it also leaves the possibility of overflow errors when we multiply that limit to allocate memory. For instance, a file consisting only of "a\na\n..." could have INT_MAX/2 entries. Allocating an array of pointers for each would need INT_MAX*4 bytes on a 64-bit system, enough to overflow a 32-bit int. So let's give ourselves a bit more safety margin by giving a much smaller limit. The size 100MB is somewhat arbitrary, but is based on the similar value for attribute files added by `3c50032ff5` (attr: ignore overly large gitattributes files, 2022-12-01). There's no particular reason these have to be the same, but the idea is that they are in the ballpark of "so huge that nobody would care, but small enough to avoid malicious overflow". So lacking a better guess, it makes sense to use the same value. The implementation here doesn't share the same constant, but we could change that later (or even give it a runtime config knob, though nobody has complained yet about the attribute limit). And likewise, let's add a few tests that exercise the limits, based on the attr ones. In this case, though, we never read .gitignore from the index; the blob code is exercised only for sparse filters. So we'll trigger it that way. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-05 09:23:42 -07:00
Junio C Hamano	56f4f4a29d	imap-send: minimum leakfix EVen with the minimum "no-op" invocation t1517 makes, "git imap-send" leaks an empty strbuf it used to read a 0-byte string into. There are a few other topics cooking in 'next' that plugs many other leaks in this program, so let's minimally fix this one, barely enough to make CI pass, leaving the rest for the other topic. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-04 11:48:20 -07:00
Jeff King	4c844c2f49	dir.c: free removed sparse-pattern hashmap entries In add_pattern_to_hashsets(), we remove entries from the recursive_hashmap when adding similar ones to the parent_hashmap. I won't pretend to understand all of what's going on here, but there's an obvious leak: whatever we removed from recursive_hashmap is not referenced anywhere else, and is never free()d. We can easily fix this by asking the hashmap to return a pointer to the old entry. This makes t7002 now completely leak-free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-04 10:38:23 -07:00
Jeff King	db83b64cda	sparse-checkout: clear patterns when init() sees existing sparse file In sparse_checkout_init(), we first try to load patterns from an existing file. If we found any, we return immediately, but end up leaking the patterns we parsed. Fixing this reduces the number of leaks in t7002 from 9 down to 5. Note that there are two other exits from the function, but they don't need the same treatment: - if we can't resolve HEAD, we write out a hard-coded sparse file and return. But we know the pattern list is empty there, since we didn't find any in the on-disk file and we haven't yet added any of our own. - otherwise, we do populate the list and then tail-call into write_patterns_and_update(). But that function frees the pattern_list itself, so we don't need to. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-04 10:38:23 -07:00
Jeff King	4318d3ab65	dir.c: free strings in sparse cone pattern hashmaps The pattern_list structs used for cone-mode sparse lookups use a few extra hashmaps. These store pattern_entry structs, each of which has its own heap-allocated pattern string. When we clean up the hashmaps, we free the individual pattern_entry structs, but forget to clean up the embedded strings, causing memory leaks. We can fix this by iterating over the hashmaps to free the extra strings. This reduces the numbers of leaks in t7002 from 22 to 9. One alternative here would be to make the string a FLEX_ARRAY member of the pattern_entry. Then there's no extra free() required, and as a bonus it would be a little more efficient. However, some of the refactoring gets awkward, as we are often assigning strings allocated by helper functions. So let's just fix the leak for now, and we can explore bigger refactoring separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-04 10:38:23 -07:00
Jeff King	4d7f95ed1f	sparse-checkout: pass string literals directly to add_pattern() The add_pattern() function takes a pattern string, but neither makes a copy of it nor takes ownership of the memory. So it is the caller's responsibility to make sure the string hangs around as long as the pattern_list which references it. There are a few cases in sparse-checkout where we use string literal patterns by stuffing them into a strbuf, detaching the buffer, and then passing the result into add_pattern(). This creates a leak when the pattern_list is eventually cleared, since we don't retain a copy of the detached buffer to free. But we can observe that the whole strbuf dance is unnecessary. The point was presumably[1] to satisfy the lifetime requirement of the string. But string literals have static duration; we can count on them lasting for the whole program. So we can fix the leak by just passing them directly. And as a bonus, that simplifies the code. The leaks can be seen in t7002, which drops from 25 leaks to 22 with this patch. It also makes t3602 and t1090 leak-free. In the long run, we will also want to clean up this (undocumented!) memory lifetime requirement of add_pattern(). But that can come in a later patch; passing the string literals directly will be the right thing either way. [1] The code in question comes from `416adc8711` (sparse-checkout: update working directory in-process for 'init', 2019-11-21) and `99dfa6f970` (sparse-checkout: use in-process update for disable subcommand, 2019-11-21), but I didn't see anything in their commit messages or on the list explaining the strbufs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-04 10:38:23 -07:00
Jeff King	2181fe6e46	sparse-checkout: free string list in write_cone_to_file() We use a string list to hold sorted and de-duped patterns, but don't free it before leaving the function, causing a leak. This drops the number of leaks found in t7002 from 27 to 25. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-04 10:38:22 -07:00
Junio C Hamano	7b0defb391	The tenth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-03 13:14:52 -07:00
Junio C Hamano	eb6392fb4f	Merge branch 'th/push-local-ff-check-without-lazy-fetch' When "git push" notices that the commit at the tip of the ref on the other side it is about to overwrite does not exist locally, it used to first try fetching it if the local repository is a partial clone. The command has been taught not to do so and immediately fail instead. * th/push-local-ff-check-without-lazy-fetch: push: don't fetch commit object when checking existence	2024-06-03 13:11:12 -07:00
Junio C Hamano	5c7c063c1f	Merge branch 'ps/fix-reinit-includeif-onbranch' "git init" in an already created directory, when the user configuration has includeif.onbranch, started to fail recently, which has been corrected. * ps/fix-reinit-includeif-onbranch: setup: fix bug with "includeIf.onbranch" when initializing dir	2024-06-03 13:11:11 -07:00

1 2 3 4 5 ...

73632 Commits