global/git

mirror of https://github.com/git/git.git synced 2026-01-09 17:46:37 +00:00

Author	SHA1	Message	Date
Patrick Steinhardt	ba620d296a	reftable/block: simplify how we track restart points Restart points record the location of reftable records that do not use prefix compression and are used to perform a binary search inside of a block. These restart points are encoded at the end of a block, between the record data and the footer of a table. The block structure contains three different variables related to these restart points: - The block length contains the length of the reftable block up to the restart points. - The restart count contains the number of restart points contained in the block. - The restart bytes variable tracks where the restart point data begins. Tracking all three of these variables is unnecessary though as the data can be derived from one another: the block length without restart points is the exact same as the offset of the restart count data, which we already track via the `restart_bytes` data. Refactor the code so that we track the location of restart bytes not as a pointer, but instead as an offset. This allows us to trivially get rid of the `block_len` variable as described above. This avoids having the confusing `block_len` variable and allows us to do less bookkeeping overall. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:53:09 -07:00
Patrick Steinhardt	1ac4e5e83d	reftable/blocksource: consolidate code into a single file The code that implements block sources is distributed across a couple of files. Consolidate all of it into "reftable/blocksource.c" and its accompanying header so that it is easier to locate and more self contained. While at it, rename some of the functions to have properly scoped names. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:53:09 -07:00
Patrick Steinhardt	b648bd6549	reftable/reader: rename data structure to "table" The `struct reftable_reader` subsystem encapsulates a table that has been read from the disk. As such, the current name of that structure is somewhat hard to understand as it only talks about the fact that we read something from disk, without really giving an indicator _what_ that is. Furthermore, this naming schema doesn't really fit well into how the other structures are named: `reftable_merged_table`, `reftable_stack`, `reftable_block` and `reftable_record` are all named after what they encapsulate. Rename the subsystem to `reftable_table`, which directly gives a hint that the data structure is about handling the individual tables part of the stack. While this change results in a lot of churn, it prepares for us exposing the APIs to third-party callers now that the reftable library is a standalone library that can be linked against by other projects. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:53:09 -07:00
Patrick Steinhardt	6dcc05ffc3	reftable: fix formatting of the license header The license headers used across the reftable library doesn't follow our typical coding style for multi-line comments. Fix it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:53:09 -07:00
Junio C Hamano	c7c4e5e419	Merge branch 'ps/reftable-sans-compat-util' into ps/reftable-api-revamp * ps/reftable-sans-compat-util: Makefile: skip reftable library for Coccinelle reftable: decouple from Git codebase by pulling in "compat/posix.h" git-compat-util.h: split out POSIX-emulating bits compat/mingw: split out POSIX-related bits reftable/basics: introduce `REFTABLE_UNUSED` annotation reftable/basics: stop using `SWAP()` macro reftable/stack: stop using `sleep_millisec()` reftable/system: introduce `reftable_rand()` reftable/reader: stop using `ARRAY_SIZE()` macro reftable/basics: provide wrappers for big endian conversion reftable/basics: stop using `st_mult()` in array allocators reftable: stop using `BUG()` in trivial cases reftable/record: don't `BUG()` in `reftable_record_cmp()` reftable/record: stop using `BUG()` in `reftable_record_init()` reftable/record: stop using `COPY_ARRAY()` reftable/blocksource: stop using `xmmap()` reftable/stack: stop using `write_in_full()` reftable/stack: stop using `read_in_full()`	2025-04-01 19:05:13 +09:00
René Scharfe	bad7910399	reftable: release name on reftable_reader_new() error If block_source_read_block() or parse_footer() fail, we leak the "name" member of struct reftable_reader in reftable_reader_new(). Release it. Reported by: H Z <shiyuyuranzh@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-03-04 09:21:39 -08:00
Patrick Steinhardt	6af23ac66c	reftable: decouple from Git codebase by pulling in "compat/posix.h" The reftable library includes "git-compat-util.h" in order to get a POSIX-like programming environment that papers over various differences between platforms. The header also brings with it a couple of helpers specific to the Git codebase though, and over time we have started to use these helpers in the reftable library, as well. This makes it very hard to use the reftable library as a standalone library without the rest of the Git codebase, so other libraries like e.g. libgit2 cannot easily use it. But now that we have removed all calls to Git-specific functionality and have split out "compat/posix.h" as a separate header we can address this. Stop including "git-compat-util.h" and instead include "compat/posix.h" to finalize the decoupling of the reftable library from the rest of the Git codebase. The only bits which remain specific to Git are "system.h" and "system.c", which projects will have to provide. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:41 -08:00
Patrick Steinhardt	f93b2a0424	reftable/basics: introduce `REFTABLE_UNUSED` annotation Introduce the `REFTABLE_UNUSED` annotation and replace all existing users of `UNUSED` in the reftable library to use the new macro instead. Note that we unconditionally define `MAYBE_UNUSED` in the exact same way, so doing so unconditionally for `REFTABLE_UNUSED` should be fine, too. Suggested-by: Toon Claes <toon@iotcl.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:38 -08:00
Patrick Steinhardt	f8ed12dec4	reftable/basics: stop using `SWAP()` macro Stop using `SWAP()` macro in favor of an open-coded variant of it. Note that this also requires us to open-code the build assert that `SWAP()` itself uses to verify that the size of both variables matches. This is done to reduce our dependency on the Git codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:38 -08:00
Patrick Steinhardt	10f2935c7f	reftable/stack: stop using `sleep_millisec()` Refactor our use of `sleep_millisec()` by open-coding it with poll(3p), which is the current implementation of this function. Ideally, we'd use a more direct way to sleep, but there is no equivalent to sleep(3p) that would accept milliseconds as input. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:38 -08:00
Patrick Steinhardt	712f6cfe54	reftable/system: introduce `reftable_rand()` Introduce a new system-level `reftable_rand()` function that generates a single unsigned integer for us. The implementation of this function is to be provided by the calling codebase, which allows us to more easily hook into pre-seeded random number generators. Adapt the two callsites where we generated random data. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:38 -08:00
Patrick Steinhardt	01a587da8c	reftable/reader: stop using `ARRAY_SIZE()` macro We have a single user of the `ARRAY_SIZE()` macro in the reftable reader. Drop its use to reduce our dependence on the Git codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:37 -08:00
Patrick Steinhardt	e676694298	reftable/basics: provide wrappers for big endian conversion We're using a mixture of big endian conversion functions provided by both the reftable library, but also by the Git codebase. Refactor the code so that we exclusively use reftable-provided wrappers in order to untangle us from the Git codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:37 -08:00
Patrick Steinhardt	6e3ea71639	reftable/basics: stop using `st_mult()` in array allocators We're using `st_mult()` as part of our macro helpers that allocate arrays. This is bad due two two reasons: - `st_mult()` causes us to die in case the multiplication overflows. - `st_mult()` ties us to the Git codebase. Refactor the code to instead detect overflows manually and return an error in such cases. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:37 -08:00
Patrick Steinhardt	445f9f4f35	reftable: stop using `BUG()` in trivial cases Stop using `BUG()` in the remaining trivial cases that we still have in the reftable library. Instead of aborting the program, we'll now bubble up a `REFTABLE_API_ERROR` to indicate misuse of the calling conventions. Note that in both `reftable_reader_{inc,dec}ref()` we simply stop calling `BUG()` altogether. The only situation where the counter should be zero is when the structure has already been free'd anyway, so we would run into undefined behaviour regardless of whether we try to abort the program or not. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:36 -08:00
Patrick Steinhardt	6f6127decd	reftable/record: don't `BUG()` in `reftable_record_cmp()` The reftable library aborts with a bug in case `reftable_record_cmp()` is invoked with two records of differing types. This would cause the program to die without the caller being able to handle the error, which is not something we want in the context of library code. And it ties us to the Git codebase. Refactor the code such that `reftable_record_cmp()` returns an error code separate from the actual comparison result. This requires us to also adapt some callers up the callchain in a similar fashion. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:36 -08:00
Patrick Steinhardt	9d9fac0f34	reftable/record: stop using `BUG()` in `reftable_record_init()` We're aborting the program via `BUG()` in case `reftable_record_init()` was invoked with an unknown record type. This is bad because we may now die in library code, and because it makes us depend on the Git codebase. Refactor the code such that `reftable_record_init()` can return an error code to the caller. Adapt any callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:36 -08:00
Patrick Steinhardt	a967966432	reftable/record: stop using `COPY_ARRAY()` Drop our use of `COPY_ARRAY()`, replacing it with an open-coded variant thereof. This is done to reduce our dependency on the Git library. While at it, guard the whole array copy logic so that we only copy it in case there actually is anything to be copied. Otherwise, we may end up trying to allocate a zero-sized array, which will return a NULL pointer and thus cause us to return an `REFTABLE_OUT_OF_MEMORY_ERROR`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:35 -08:00
Patrick Steinhardt	70afa6fa31	reftable/blocksource: stop using `xmmap()` We use `xmmap()` to map reftables into memory. This function has two problems: - It causes us to die in case the mmap fails. - It ties us to the Git codebase. Refactor the code to use mmap(3p) instead with manual error checking. Note that this function may not be the system-provided mmap(3p), but may point to our `git_mmap()` wrapper that emulates the syscall on systems that do not have mmap(3p) available. Fix `reftable_block_source_from_file()` to properly bubble up the error code in case the map(3p) call fails. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:35 -08:00
Patrick Steinhardt	e31db89558	reftable/stack: stop using `write_in_full()` Similar to the preceding commit, drop our use of `write_in_full()` and implement a new wrapper `reftable_write_full()` that handles this logic for us. This is done to reduce our dependency on the Git library. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:35 -08:00
Patrick Steinhardt	cb3e368b69	reftable/stack: stop using `read_in_full()` There is a single callsite of `read_in_full()` in the reftable library. Open-code the function to reduce our dependency on the Git library. Note that we only partially port over the logic from `read_in_full()` and its underlying `xread()` helper. Most importantly, the latter also knows to handle `EWOULDBLOCK` via `handle_nonblock()`. This logic is irrelevant for us though because the reftable library never sets the `O_NONBLOCK` option in the first place. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-02-18 10:55:35 -08:00
Junio C Hamano	82522a9e2c	Merge branch 'kn/reflog-migration-fix-followup' Code clean-up. * kn/reflog-migration-fix-followup: reftable: prevent 'update_index' changes after adding records refs: use 'uint64_t' for 'ref_update.index' refs: mark `ref_transaction_update_reflog()` as static	2025-02-14 17:53:48 -08:00
Junio C Hamano	9d0e81e2ae	Merge branch 'ps/zlib-ng' The code paths to interact with zlib has been cleaned up in preparation for building with zlib-ng. * ps/zlib-ng: ci: make "linux-musl" job use zlib-ng ci: switch linux-musl to use Meson compat/zlib: allow use of zlib-ng as backend git-zlib: cast away potential constness of `next_in` pointer compat/zlib: provide stubs for `deflateSetHeader()` compat/zlib: provide `deflateBound()` shim centrally git-compat-util: move include of "compat/zlib.h" into "git-zlib.h" compat: introduce new "zlib.h" header git-compat-util: drop `z_const` define compat: drop `uncompress2()` compatibility shim	2025-02-06 14:56:45 -08:00
Patrick Steinhardt	41f1a8435a	git-compat-util: move include of "compat/zlib.h" into "git-zlib.h" We include "compat/zlib.h" in "git-compat-util.h", which is unnecessarily broad given that we only have a small handful of files that use the zlib library. Move the header into "git-zlib.h" instead and adapt users of zlib to include that header. One exception is the reftable library, as we don't want to use the Git-specific wrapper of zlib there, so we include "compat/zlib.h" instead. Furthermore, we move the include into "reftable/system.h" so that users of the library other than Git can wire up zlib themselves. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-28 13:03:22 -08:00
Patrick Steinhardt	629188ede7	compat: introduce new "zlib.h" header Introduce a new "compat/zlib-compat.h" header that we include instead of including <zlib.h> directly. This will allow us to wire up zlib-ng as an alternative backend for zlib compression in a subsequent commit. Note that we cannot just call the file "compat/zlib.h", as that may otherwise cause us to include that file instead of <zlib.h>. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-28 13:03:22 -08:00
Junio C Hamano	a17fd7dd3a	Merge branch 'ps/reftable-sign-compare' The reftable/ library code has been made -Wsign-compare clean. * ps/reftable-sign-compare: reftable: address trivial -Wsign-compare warnings reftable/blocksource: adjust `read_block()` to return `ssize_t` reftable/blocksource: adjust type of the block length reftable/block: adjust type of the restart length reftable/block: adapt header and footer size to return a `size_t` reftable/basics: adjust `hash_size()` to return `uint32_t` reftable/basics: adjust `common_prefix_size()` to return `size_t` reftable/record: handle overflows when decoding varints reftable/record: drop unused `print` function pointer meson: stop disabling -Wsign-compare	2025-01-28 13:02:24 -08:00
Karthik Nayak	017bd89239	reftable: prevent 'update_index' changes after adding records The function `reftable_writer_set_limits()` allows updating the 'min_update_index' and 'max_update_index' of a reftable writer. These values are written to both the writer's header and footer. Since the header is written during the first block write, any subsequent changes to the update index would create a mismatch between the header and footer values. The footer would contain the newer values while the header retained the original ones. To protect against this bug, prevent callers from updating these values after any record is written. To do this, modify the function to return an error whenever the limits are modified after any record adds. Check for record adds within `reftable_writer_set_limits()` by checking the `last_key` and `next` variable. The former is updated after each record added, but is reset at certain points. The latter is set after writing the first block. Modify all callers of the function to anticipate a return type and handle it accordingly. Add a unit test to also ensure the function returns the error as expected. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-22 09:51:36 -08:00
Patrick Steinhardt	33319b0976	reftable: address trivial -Wsign-compare warnings Address the last couple of trivial -Wsign-compare warnings in the reftable library and remove the DISABLE_SIGN_COMPARE_WARNINGS macro that we have in "reftable/system.h". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-21 14:20:30 -08:00
Patrick Steinhardt	7c4c1cbc0b	reftable/blocksource: adjust `read_block()` to return `ssize_t` The `block_source_read_block()` function and its implementations return an integer as a result that reflects either the number of bytes read, or an error. As such its return type, a signed integer, isn't wrong, but it doesn't give the reader a good hint what it actually returns. Refactor the function to return an `ssize_t` instead, which is typical for functions similar to read(3p) and should thus give readers a better signal what they can expect as a result. Adjust callers to better handle the returned value to avoid warnings with -Wsign-compare. One of these callers is `reader_get_block()`, whose return value is only ever used by its callers to figure out whether or not the read was successful. So instead of bubbling up the `ssize_t` there, too, we adapt it to only indicate success or errors. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-21 14:20:30 -08:00
Patrick Steinhardt	1f054af72f	reftable/blocksource: adjust type of the block length The block length is used to track the number of bytes available in a specific block. As such, it is never set to a negative value, but is still represented by a signed integer. Adjust the type of the variable to be `size_t`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-21 14:20:30 -08:00
Patrick Steinhardt	b1e4b6f4dc	reftable/block: adjust type of the restart length The restart length is tracked as a positive integer even though it cannot ever be negative. Furthermore, it is effectively capped via the MAX_RESTARTS variable. Adjust the type of the variable to be `uint32_t`. While this type is excessive given that MAX_RESTARTS fits into an `uint16_t`, other places already use 32 bit integers for restarts, so this type is being more consistent. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-21 14:20:30 -08:00
Patrick Steinhardt	ffe6643668	reftable/block: adapt header and footer size to return a `size_t` The functions `header_size()` and `footer_size()` return a positive integer representing the size of the header and footer, respectively, dependent on the version of the reftable format. Similar to the preceding commit, these functions return a signed integer though, which is nonsensical given that there is no way for these functions to return negative. Adapt the functions to return a `size_t` instead to fix a couple of sign comparison warnings. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-21 14:20:29 -08:00
Patrick Steinhardt	57adf71b93	reftable/basics: adjust `hash_size()` to return `uint32_t` The `hash_size()` function returns the number of bytes used by the hash function. Weirdly enough though, it returns a signed integer for its size even though the size obviously cannot ever be negative. The only case where it could be negative is if the function returned an error when asked for an unknown hash, but we assert(3p) instead. Adjust the type of `hash_size()` to be `uint32_t` and adapt all places that use signed integers for the hash size to follow suit. This also allows us to get rid of a couple asserts that we had which verified that the size was indeed positive, which further stresses the point that this refactoring makes sense. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-21 14:20:29 -08:00
Patrick Steinhardt	5ac65f0d6b	reftable/basics: adjust `common_prefix_size()` to return `size_t` The `common_prefix_size()` function computes the length of the common prefix between two buffers. As such its return value will always be an unsigned integer, as the length cannot be negative. Regardless of that, the function returns a signed integer, which is nonsensical and causes a couple of -Wsign-compare warnings all over the place. Adjust the function to return a `size_t` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-21 14:20:29 -08:00
Patrick Steinhardt	072e3aa3a5	reftable/record: handle overflows when decoding varints The logic to decode varints isn't able to detect integer overflows: as long as the buffer still has more data available, and as long as the current byte has its 0x80 bit set, we'll continue to add up these values to the result. This will eventually cause the `uint64_t` to overflow, at which point we'll return an invalid result. Refactor the function so that it is able to detect such overflows. The implementation is basically copied from Git's own `decode_varint()`, which already knows to handle overflows. The only adjustment is that we also take into account the string view's length in order to not overrun it. The reftable documentation explicitly notes that those two encoding schemas are supposed to be the same: Varint encoding ^^^^^^^^^^^^^^^ Varint encoding is identical to the ofs-delta encoding method used within pack files. Decoder works as follows: .... val = buf[ptr] & 0x7f while (buf[ptr] & 0x80) { ptr++ val = ((val + 1) << 7) \| (buf[ptr] & 0x7f) } .... While at it, refactor `put_var_int()` in the same way by copying over the implementation of `encode_varint()`. While `put_var_int()` doesn't have an issue with overflows, it generates warnings with -Wsign-compare. The implementation of `encode_varint()` doesn't, is battle-tested and at the same time way simpler than what we currently have. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-21 14:20:28 -08:00
Patrick Steinhardt	a204f92d1c	reftable/record: drop unused `print` function pointer In `42c424d69d` (t/helper: inline printing of reftable records, 2024-08-22) we stopped using the `print` function of the reftable record vtable and instead moved its implementation into the single user of it. We didn't remove the function itself from the vtable though. Drop it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-21 14:20:28 -08:00
Patrick Steinhardt	0b4f8afef6	reftable/stack: accept insecure random bytes The reftable library uses randomness in two call paths: - When reading a stack in case some of the referenced tables disappears. The randomness is used to delay the next read by a couple of milliseconds. - When writing a new table, where the randomness gets appended to the table name (e.g. "0x000000000001-0x000000000002-0b1d8ddf.ref"). In neither of these cases do we need strong randomness. Unfortunately though, we have observed test failures caused by the former case. In t0610 we have a test that spawns a 100 processes at once, all of which try to write a new table to the stack. And given that all of the processes will require randomness, it can happen that these processes make the entropy pool run dry, which will then cause us to die: + test_seq 100 + printf %s commit\trefs/heads/branch-%s\n 68d032e9edd3481ac96382786ececc37ec28709e 1 + printf %s commit\trefs/heads/branch-%s\n 68d032e9edd3481ac96382786ececc37ec28709e 2 ... + git update-ref refs/heads/branch-98 HEAD + git update-ref refs/heads/branch-97 HEAD + git update-ref refs/heads/branch-99 HEAD + git update-ref refs/heads/branch-100 HEAD fatal: unable to get random bytes fatal: unable to get random bytes fatal: unable to get random bytes fatal: unable to get random bytes fatal: unable to get random bytes fatal: unable to get random bytes fatal: unable to get random bytes The report was for NonStop, which uses OpenSSL as the backend for randomness. In the preceding commit we have adapted that backend to also return randomness in case the entropy pool is empty and the caller passes the `CSPRNG_BYTES_INSECURE` flag. Do so to fix the issue. Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-07 09:04:18 -08:00
Patrick Steinhardt	1568d1562e	wrapper: allow generating insecure random bytes The `csprng_bytes()` function generates randomness and writes it into a caller-provided buffer. It abstracts over a couple of implementations, where the exact one that is used depends on the platform. These implementations have different guarantees: while some guarantee to never fail (arc4random(3)), others may fail. There are two significant failures to distinguish from one another: - Systemic failure, where e.g. opening "/dev/urandom" fails or when OpenSSL doesn't have a provider configured. - Entropy failure, where the entropy pool is exhausted, and thus the function cannot guarantee strong cryptographic randomness. While we cannot do anything about the former, the latter failure can be acceptable in some situations where we don't care whether or not the randomness can be predicted. Introduce a new `CSPRNG_BYTES_INSECURE` flag that allows callers to opt into weak cryptographic randomness. The exact behaviour of the flag depends on the underlying implementation: - `arc4random_buf()` never returns an error, so it doesn't change. - `getrandom()` pulls from "/dev/urandom" by default, which never blocks on modern systems even when the entropy pool is empty. - `getentropy()` seems to block when there is not enough randomness available, and there is no way of changing that behaviour. - `GtlGenRandom()` doesn't mention anything about its specific failure mode. - The fallback reads from "/dev/urandom", which also returns bytes in case the entropy pool is drained in modern Linux systems. That only leaves OpenSSL with `RAND_bytes()`, which returns an error in case the returned data wouldn't be cryptographically safe. This function is replaced with a call to `RAND_pseudo_bytes()`, which can indicate whether or not the returned data is cryptographically secure via its return value. If it is insecure, and if the `CSPRNG_BYTES_INSECURE` flag is set, then we ignore the insecurity and return the data regardless. It is somewhat questionable whether we really need the flag in the first place, or whether we wouldn't just ignore the potentially-insecure data. But the risk of doing that is that we might have or grow callsites that aren't aware of the potential insecureness of the data in places where it really matters. So using a flag to opt-in to that behaviour feels like the more secure choice. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-07 09:04:18 -08:00
René Scharfe	e4981ed1e7	reftable: handle realloc error in parse_names() Check the final reallocation for adding the terminating NULL and handle it just like those in the loop. Simply use REFTABLE_ALLOC_GROW instead of keeping the REFTABLE_REALLOC_ARRAY call and adding code to preserve the original pointer value around it. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-28 08:00:44 -08:00
René Scharfe	2cca185e85	reftable: fix allocation count on realloc error When realloc(3) fails, it returns NULL and keeps the original allocation intact. REFTABLE_ALLOC_GROW overwrites both the original pointer and the allocation count variable in that case, simultaneously leaking the original allocation and misrepresenting the number of storable items. parse_names() avoids the leak by keeping the original pointer if reallocation fails, but still increase the allocation count in such a case as if it succeeded. That's OK, because the error handling code just frees everything and doesn't look at names_cap anymore. reftable_buf_add() does the same, but here it is a problem as it leaves the reftable_buf in a broken state, with ->alloc being roughly twice as big as the actually allocated memory, allowing out-of-bounds writes in subsequent calls. Reimplement REFTABLE_ALLOC_GROW to avoid leaks, keep allocation counts in sync and still signal failures to callers while avoiding code duplication in callers. Make it an expression that evaluates to 0 if no reallocation is needed or it succeeded and 1 on failure while keeping the original pointer and allocation counter values. Adjust REFTABLE_ALLOC_GROW_OR_NULL to the new calling convention for REFTABLE_ALLOC_GROW, but keep its support for non-size_t alloc variables for now. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-28 08:00:44 -08:00
René Scharfe	8db127d43f	reftable: avoid leaks on realloc error When realloc(3) fails, it returns NULL and keeps the original allocation intact. REFTABLE_ALLOC_GROW overwrites both the original pointer and the allocation count variable in that case, simultaneously leaking the original allocation and misrepresenting the number of storable items. parse_names() and reftable_buf_add() avoid leaking by restoring the original pointer value on failure, but all other callers seem to be OK with losing the old allocation. Add a new variant of the macro, REFTABLE_ALLOC_GROW_OR_NULL, which plugs the leak and zeros the allocation counter. Use it for those callers. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-28 08:00:44 -08:00
Junio C Hamano	4156b6a741	Merge branch 'ps/build-sign-compare' Start working to make the codebase buildable with -Wsign-compare. * ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings	2024-12-23 09:32:11 -08:00
Junio C Hamano	f7c607fac3	Merge branch 'kn/reftable-writer-log-write-verify' Reftable backend adds check for upper limit of log's update_index. * kn/reftable-writer-log-write-verify: reftable/writer: ensure valid range for log's update_index	2024-12-23 09:32:08 -08:00
Junio C Hamano	3151e6a121	Merge branch 'ps/reftable-alloc-failures-zalloc-fix' Recent reftable updates mistook a NULL return from a request for 0-byte allocation as OOM and died unnecessarily, which has been corrected. * ps/reftable-alloc-failures-zalloc-fix: reftable/basics: return NULL on zero-sized allocations reftable/stack: fix zero-sized allocation when there are no readers reftable/merged: fix zero-sized allocation when there are no readers reftable/stack: don't perform auto-compaction with less than two tables	2024-12-23 09:32:06 -08:00
Patrick Steinhardt	d7282891f5	reftable/basics: return NULL on zero-sized allocations In the preceding commits we have fixed a couple of issues when allocating zero-sized objects. These issues were masked by implementation-defined behaviour. Quoting malloc(3p): If size is 0, either: * A null pointer shall be returned and errno may be set to an implementation-defined value, or * A pointer to the allocated space shall be returned. The application shall ensure that the pointer is not used to access an object. So it is perfectly valid that implementations of this function may or may not return a NULL pointer in such a case. Adapt both `reftable_malloc()` and `reftable_realloc()` so that they return NULL pointers on zero-sized allocations. This should remove any implementation-defined behaviour in our allocators and thus allows us to detect such platform-specific issues more easily going forward. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-22 00:58:23 -08:00
Patrick Steinhardt	2d3cb4b4b5	reftable/stack: fix zero-sized allocation when there are no readers Similar as the preceding commit, we may try to do a zero-sized allocation when reloading a reftable stack that ain't got any tables. It is implementation-defined whether malloc(3p) returns a NULL pointer in that case or a zero-sized object. In case it does return a NULL pointer though it causes us to think we have run into an out-of-memory situation, and thus we return an error. Fix this by only allocating arrays when they have at least one entry. Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-22 00:58:23 -08:00
Patrick Steinhardt	5ab83521cf	reftable/merged: fix zero-sized allocation when there are no readers It was reported [1] that Git started to fail with an out-of-memory error when initializing repositories with the reftable backend on NonStop platforms. A bisect led to `802c0646ac` (reftable/merged: handle allocation failures in `merged_table_init_iter()`, 2024-10-02), which changed how we allocate memory when initializing a merged table. The root cause of this seems to be that NonStop returns a `NULL` pointer when doing a zero-sized allocation. This would've already happened before the above change, but we never noticed because we did not check the result. Now we do notice and thus return an out-of-memory error to the caller. Fix the issue by skipping the allocation altogether in case there are no readers. [1]: <00ad01db5017$aa9ce340$ffd6a9c0$@nexbridge.com> Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-22 00:58:23 -08:00
Patrick Steinhardt	8e27ee9220	reftable/stack: don't perform auto-compaction with less than two tables In order to compact tables we need at least two tables. Bail out early from `reftable_stack_auto_compact()` in case we have less than two tables. In the original, `stack_table_sizes_for_compaction()` yields an array that has the same length as the number of tables. This array is then passed on to `suggest_compaction_segment()`, which returns an empty segment in case we have less than two tables. The segment is then passed to `segment_size()`, which will return `0` because both start and end of the segment are `0`. And because we only call `stack_compact_range()` in case we have a positive segment size we don't perform auto-compaction at all. Consequently, this change does not result in a user-visible change in behaviour when called with a single table. But when called with no tables this protects us against a potential out-of-memory error: `stack_table_sizes_for_compaction()` would try to allocate a zero-byte object when there aren't any tables, and that may lead to a `NULL` pointer on some platforms like NonStop which causes us to bail out with an out-of-memory error. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-22 00:58:23 -08:00
Junio C Hamano	7041902dfa	Merge branch 'ps/reftable-iterator-reuse' Optimize reading random references out of the reftable backend by allowing reuse of iterator objects. * ps/reftable-iterator-reuse: refs/reftable: reuse iterators when reading refs reftable/merged: drain priority queue on reseek reftable/stack: add mechanism to notify callers on reload refs/reftable: refactor reflog expiry to use reftable backend refs/reftable: refactor reading symbolic refs to use reftable backend refs/reftable: read references via `struct reftable_backend` refs/reftable: figure out hash via `reftable_stack` reftable/stack: add accessor for the hash ID refs/reftable: handle reloading stacks in the reftable backend refs/reftable: encapsulate reftable stack	2024-12-10 10:04:58 +09:00
Junio C Hamano	de9278127e	Merge branch 'ps/reftable-detach' Isolates the reftable subsystem from the rest of Git's codebase by using fewer pieces of Git's infrastructure. * ps/reftable-detach: reftable/system: provide thin wrapper for lockfile subsystem reftable/stack: drop only use of `get_locked_file_path()` reftable/system: provide thin wrapper for tempfile subsystem reftable/stack: stop using `fsync_component()` directly reftable/system: stop depending on "hash.h" reftable: explicitly handle hash format IDs reftable/system: move "dir.h" to its only user	2024-12-10 10:04:56 +09:00

1 2 3 4 5 ...

378 Commits