After calling "parse_timestamp(p, &end, 10)", we complain if "p == end",
which would imply that we did not see any digits at all. But we know
this cannot be the case, since we would have bailed already if we did
not see any digits, courtesy of extra checks added by 8e4309038f (fsck:
do not assume NUL-termination of buffers, 2023-01-19). Since then,
checking "p == end" is redundant and we can drop it.
This will make our lives a little easier as we refactor further.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We may be operating on a buffer that is not NUL-terminated, but we use
strcspn() to parse it. This is OK in practice, as discussed in
8e4309038f (fsck: do not assume NUL-termination of buffers, 2023-01-19),
because we know there is at least a trailing newline in our buffer, and
we always pass "\n" to strcspn(). So we know it will stop before running
off the end of the buffer.
But this is a subtle point to hang our memory safety hat on. And it
confuses ASan's strict_string_checks mode, even though it is technically
a false positive (that mode complains that we have no NUL, which is
true, but it does not know that we have verified the presence of the
newline already).
Let's instead open-code the loop. As a bonus, this makes the logic more
obvious (to my mind, anyway). The current code skips forward with
strcspn until it hits "<", ">", or "\n". But then it must check which it
saw to decide if that was what we expected or not, duplicating some
logic between what's in the strcspn() and what's in the domain logic.
Instead, we can just check each character as we loop and act on it
immediately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fsck code purports to handle buffers that are not NUL-terminated,
but fsck_ident() uses some string functions. This works OK in practice,
as explained in 8e4309038f (fsck: do not assume NUL-termination of
buffers, 2023-01-19). Before calling fsck_ident() we'll have called
verify_headers(), which makes sure we have at least a trailing newline.
And none of our string-like functions will walk past that newline.
However, that makes this code at the top of fsck_ident() very confusing:
*ident = strchrnul(*ident, '\n');
if (**ident == '\n')
(*ident)++;
We should always see that newline, or our memory safety assumptions have
been violated! Further, using strchrnul() is weird, since the whole
point is that if the newline is not there, we don't necessarily have a
NUL at all, and might read off the end of the buffer.
So let's have callers pass in the boundary of our buffer, which lets us
safely find the newline with memchr(). And if it is not there, this is a
BUG(), because it means our caller did not validate the input with
verify_headers() as it was supposed to (and we are better off bailing
rather than having memory-safety problems).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A cache-tree extension entry in the index looks like this:
<name> NUL <entry_nr> SPACE <subtree_nr> NEWLINE <binary_oid>
where the "_nr" items are human-readable base-10 ASCII. We parse them
with strtol(), even though we do not have a NUL-terminated string (we'd
generally have an mmap() of the on-disk index file). For a well-formed
entry, this is not a problem; strtol() will stop when it sees the
newline. But there are two problems:
1. A corrupted entry could omit the newline, causing us to read
further. You'd mostly get stopped by seeing non-digits in the oid
field (and if it is likewise truncated, there will still be 20 or
more bytes of the index checksum). So it's possible, though
unlikely, to read off the end of the mmap'd buffer. Of course a
malicious index file can fake the oid and the index checksum to all
(ASCII) 0's.
This is further complicated by the fact that mmap'd buffers tend to
be zero-padded up to the page boundary. So to run off the end, the
index size also has to be a multiple of the page size. This is also
unlikely, though you can construct a malicious index file that
matches this.
The security implications aren't too interesting. The index file is
a local file anyway (so you can't attack somebody by cloning, but
only if you convince them to operate in a .git directory you made,
at which point attacking .git/config is much easier). And it's just
a read overflow via strtol(), which is unlikely to buy you much
beyond a crash.
2. ASan has a strict_string_checks option, which tells it to make sure
that options to string functions (like strtol) have some eventual
NUL, without regard to what the function would actually do (like
stopping at a newline here). This option sometimes has false
positives, but it can point to sketchy areas (like this one) where
the input we use doesn't exhibit a problem, but different input
_could_ cause us to misbehave.
Let's fix it by just parsing the values ourselves with a helper function
that is careful not to go past the end of the buffer. There are a few
behavior changes here that should not matter:
- We do not consider overflow, as strtol() would. But nor did the
original code. However, we don't trust the value we get from the
on-disk file, and if it says to read 2^30 entries, we would notice
that we do not have that many and bail before reading off the end of
the buffer.
- Our helper does not skip past extra leading whitespace as strtol()
would, but according to gitformat-index(5) there should not be any.
- The original quit parsing at a newline or a NUL byte, but now we
insist on a newline (which is what the documentation says, and what
Git has always produced).
Since we are providing our own helper function, we can tweak the
interface a bit to make our lives easier. The original code does not use
strtol's "end" pointer to find the end of the parsed data, but rather
uses a separate loop to advance our "buf" pointer to the trailing
newline. We can instead provide a helper that advances "buf" as it
parses, letting us read strictly left-to-right through the buffer.
I didn't add a new test here. It's surprisingly difficult to construct
an index of exactly the right size due to the way we pad entries. But it
is easy to trigger the problem in existing tests when using ASan's
strict string checking, coupled with a recent change to use NO_MMAP with
ASan builds. So:
make SANITIZE=address
cd t
ASAN_OPTIONS=strict_string_checks=1 ./t0090-cache-tree.sh
triggers it reliably. Technically it is not deterministic because there
is ~8% chance (it's 1-(255/256)^20, or ^32 for sha256) that the trailing
checksum hash has a NUL byte in it. But we compute enough cache-trees in
the course of that script that we are very likely to hit the problem in
one of them.
We can look at making strict_string_checks the default for ASan builds,
but there are some other cases we'd want to fix first.
Reported-by: correctmost <cmlists@sent.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git often uses mmap() to access on-disk files. This leaves a blind spot
in our SANITIZE=address builds, since ASan does not seem to handle mmap
at all. Nor does the OS notice most out-of-bounds access, since it tends
to round up to the nearest page size (so depending on how big the map
is, you might have to overrun it by up to 4095 bytes to trigger a
segfault).
The previous commit demonstrates a memory bug that we missed. We could
have made a new test where the out-of-bounds access was much larger, or
where the mapped file ended closer to a page boundary. But the point of
running the test suite with sanitizers is to catch these problems
without having to construct specific tests.
Let's enable NO_MMAP for our ASan builds by default, which should give
us better coverage. This does increase the memory usage of Git, since
we're copying from the filesystem into heap. But the repositories in the
test suite tend to be small, so the overhead isn't really noticeable
(and ASan already has quite a performance penalty).
There are a few other known bugs that this patch will help flush out.
However, they aren't directly triggered in the test suite (yet). So
it's safe to turn this on now without breaking the test suite, which
will help us add new tests to demonstrate those other bugs as we fix
them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a bitmap has a name-hash cache, it is an array of 32-bit integers,
one per entry in the bitmap, which we've mmap'd from the .bitmap file.
We access it directly like this:
if (bitmap_git->hashes)
hash = get_be32(bitmap_git->hashes + index_pos);
That works for both regular pack bitmaps and for non-incremental midx
bitmaps. There is one bitmap_index with one "hashes" array, and
index_pos is within its bounds (we do the bounds-checking when we load
the bitmap).
But for an incremental midx bitmap, we have a linked list of
bitmap_index structs, and each one has only its own small slice of the
name-hash array. If index_pos refers to an object that is not in the
first bitmap_git of the chain, then we'll access memory outside of the
bounds of its "hashes" array, and often outside of the mmap.
Instead, we should walk through the list until we find the bitmap_index
which serves our index_pos, and use its hash (after adjusting index_pos
to make it relative to the slice we found). This is exactly what we do
elsewhere for incremental midx lookups (like the pack_pos_to_midx() call
a few lines above). But we can't use existing helpers like
midx_for_object() here, because we're walking through the chain of
bitmap_index structs (each of which refers to a midx), not the chain of
incremental multi_pack_index structs themselves.
The problem is triggered in the test suite, but we don't get a segfault
because the out-of-bounds index is too small. The OS typically rounds
our mmap up to the nearest page size, so we just end up accessing some
extra zero'd memory. Nor do we catch it with ASan, since it doesn't seem
to instrument mmaps at all. But if we build with NO_MMAP, then our maps
are replaced with heap allocations, which ASan does check. And so:
make NO_MMAP=1 SANITIZE=address
cd t
./t5334-incremental-multi-pack-index.sh
does show the problem (and this patch makes it go away).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our mmap compat code emulates mapping by using malloc/free. Our
git_munmap() must take a "length" parameter to match the interface of
munmap(), but we don't use it (it is up to the allocator to know how big
the block is in free()).
Let's mark it as UNUSED to avoid complaints from -Wunused-parameter.
Otherwise you cannot build with "make DEVELOPER=1 NO_MMAP=1".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This turns into a no-op merge, since more recent versions of Git
newer than 2.46 track do support the newer "git config" syntax.
* maint-2.45:
t: avoid git config syntax from newer releases
In a recent security release, 05e9cd64ee (config: quote values
containing CR character, 2025-05-19) added calls to `git config get`,
`git config set`, and `git config unset` which are not present on the
maint-2.43 branch.
These subcommands were added in the following commits, released in
git-2.46.0:
4e51389000 (builtin/config: introduce "get" subcommand, 2024-05-06),
00bbdde141 (builtin/config: introduce "set" subcommand, 2024-05-06),
95ea69c67b (builtin/config: introduce "unset" subcommand, 2024-05-06)
Revert to the previous `git config` syntax for older maintenance
branches.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When preparing the latest round of security fixes, we wrote release
notes in v2.43.7, and then successively merged those up through to the
various 'maint' branches.
However, the 2.49 release series is the first to have commit 1f010d6bdf
(doc: use .adoc extension for AsciiDoc files, 2025-01-20). This means
that we should have renamed the new-but-historical release notes from
*.txt to *.adoc during the merge into the 'maint-2.49' branch, but
neglected to do so.
Rename them accordingly to match the convention introduced by
1f010d6bdf. Since the release materials in question here were prepared
before v2.50.0 was tagged, the 'maint' track for that release series is
OK as is.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixes for GitHub Actions Coverity job.
* js/github-ci-win-coverity-fix:
ci(coverity): output the build log upon error
ci(coverity): fix building on Windows
Revert a botched bswap.h change that broke ntohll() functions on
big-endian systems with __builtin_bswap32/64().
* ss/revert-builtin-bswap-stuff:
Revert "bswap.h: add support for built-in bswap functions"
* maint-2.48:
Git 2.48.2
Git 2.47.3
Git 2.46.4
Git 2.45.4
Git 2.44.4
Git 2.43.7
wincred: avoid buffer overflow in wcsncat()
bundle-uri: fix arbitrary file writes via parameter injection
config: quote values containing CR character
git-gui: sanitize 'exec' arguments: convert new 'cygpath' calls
git-gui: do not mistake command arguments as redirection operators
git-gui: introduce function git_redir for git calls with redirections
git-gui: pass redirections as separate argument to git_read
git-gui: pass redirections as separate argument to _open_stdout_stderr
git-gui: convert git_read*, git_write to be non-variadic
git-gui: override exec and open only on Windows
gitk: sanitize 'open' arguments: revisit recently updated 'open' calls
git-gui: use git_read in githook_read
git-gui: sanitize $PATH on all platforms
git-gui: break out a separate function git_read_nice
git-gui: assure PATH has only absolute elements.
git-gui: remove option --stderr from git_read
git-gui: cleanup git-bash menu item
git-gui: sanitize 'exec' arguments: background
git-gui: avoid auto_execok in do_windows_shortcut
git-gui: sanitize 'exec' arguments: simple cases
git-gui: avoid auto_execok for git-bash menu item
git-gui: treat file names beginning with "|" as relative paths
git-gui: remove unused proc is_shellscript
git-gui: remove git config --list handling for git < 1.5.3
git-gui: remove special treatment of Windows from open_cmd_pipe
git-gui: remove HEAD detachment implementation for git < 1.5.3
git-gui: use only the configured shell
git-gui: remove Tcl 8.4 workaround on 2>@1 redirection
git-gui: make _shellpath usable on startup
git-gui: use [is_Windows], not bad _shellpath
git-gui: _which, only add .exe suffix if not present
gitk: encode arguments correctly with "open"
gitk: sanitize 'open' arguments: command pipeline
gitk: collect construction of blameargs into a single conditional
gitk: sanitize 'open' arguments: simple commands, readable and writable
gitk: sanitize 'open' arguments: simple commands with redirections
gitk: sanitize 'open' arguments: simple commands
gitk: sanitize 'exec' arguments: redirect to process
gitk: sanitize 'exec' arguments: redirections and background
gitk: sanitize 'exec' arguments: redirections
gitk: sanitize 'exec' arguments: 'eval exec'
gitk: sanitize 'exec' arguments: simple cases
gitk: have callers of diffcmd supply pipe symbol when necessary
gitk: treat file names beginning with "|" as relative paths
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recently generating the version-def.h file and the config-list.h
file have been updated, which broke versions of "sed" that do not
want to be fed a file that ends with an incomplete line, and/or that
do not understand the more recent "-E" option to use extended
regular expression.
Fix them in response to a build-failure reported on Solaris boxes.
cf. https://lore.kernel.org/git/09f954b8-d9c3-418f-ad4b-9cb9b063f4ae@comstyle.com/
Reported-by: Brad Smith <brad@comstyle.com>
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 6547d1c9 (bswap.h: add support for built-in bswap
functions, 2025-04-23) tweaked the way the bswap32/64 macros are
defined, on platforms with __builtin_bswap32/64 supported, the
bswap32/64 macros are defined even on big endian platforms.
However the rest of this file assumes that bswap32/64() are defined
ONLY on little endian machines and uses that assumption to redefine
ntohl/ntohll macros. The said commit broke t4014-format-patch.sh test,
among many others on s390x.
Revert the commit.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
• Replace with phrases that are more standard (“all-or-nothing”
instead of “-none”)
• Add coordinating words that make it less likely for you to trip
over the sentence (“*that* "gc" can do”)
• Use “SMTP” instead of both SMTP and smtp
• Don’t mention `git fsck --reference` since the previous release
was not affected by this minor bug. Also say “errored out” since
the git-refs(1) bug was there in v2.48.0 as well
• Use the more widespread “linked” instead of “secondary worktree”
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is quite helpful to know what Coverity said, exactly, in case it
fails to analyze the code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When I added the Coverity workflow in a56b6230d0 (ci: add a GitHub
workflow to submit Coverity scans, 2023-09-25), I merely converted an
Azure Pipeline definition that had been running successfully for ages.
In the meantime, the current Coverity documentation describes a very
different way to install the analysis tool, recommending to add the
`bin/` directory to the _end_ of `PATH` (when originally, IIRC, it was
recommended to add it to the _beginning_ of the `PATH`).
This is crucial! The reason is that the current incarnation of the
Windows variant of Coverity's analysis tools come with a _lot_ of DLL
files in their `bin/` directory, some of them interferring rather badly
with the `gcc.exe` in Git for Windows' SDK that we use to run the
Coverity build. The symptom is a cryptic error message:
make: *** [Makefile:2960: headless-git.o] Error 1
make: *** Waiting for unfinished jobs....
D:\git-sdk-64-minimal\mingw64\bin\windres.exe: preprocessing failed.
make: *** [Makefile:2679: git.res] Error 1
make: *** [Makefile:2893: git.o] Error 1
make: *** [Makefile:2893: builtin/add.o] Error 1
Attempting to detect unconfigured compilers in build
|0----------25-----------50----------75---------100|
****************************************************
Warning: Build command make.exe exited with code 2. Please verify that the build completed successfully.
Warning: Emitted 0 C/C++ compilation units (0%) successfully
0 C/C++ compilation units (0%) are ready for analysis
For more details, please look at:
D:/a/git/git/cov-int/build-log.txt
The log (which the workflow is currently not configured to reveal) then
points out that the `windows.h` header cannot be found, which is _still_
not very helpful. The underlying root cause is that the `gcc.exe` in Git
for Windows' SDK determines the location of the header files via the
location of certain DLL files, and finding the "wrong" ones first on the
`PATH` misleads that logic.
Let's fix this problem by following Coverity's current recommendation
and append the `bin/` directory in which `cov-int` can be found to the
_end_ of `PATH`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests that compare $HOME and $(pwd), which should be the same
directory unless the tests chdir's around, would fail when the user
enters the test directory via symbolic links, which has been
corrected.
* mm/test-in-absolute-home:
t: run tests from a normalized working directory