In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.
Note: The original code did not try to skip unnecessary picks of root
commits but punts instead (probably --root was not considered common
enough of a use case to bother optimizing). We do the same, for now.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These tests were a bit anal about the *exact* warning/error message
printed by git rebase. But those messages are intended for the *end
user*, therefore it does not make sense to test so rigidly for the
*exact* wording.
In the following, we will reimplement the missing commits check in
the sequencer, with slightly different words.
So let's just test for the parts in the warning/error message that
we *really* care about, nothing more, nothing less.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is crucial to improve performance on Windows, as the speed is now
mostly dominated by the SHA-1 transformation (because it spawns a new
rev-parse process for *every* line, and spawning processes is pretty
slow from Git for Windows' MSYS2 Bash).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
To avoid problems with short SHA-1s that become non-unique during the
rebase, we rewrite the todo script with short/long SHA-1s before and
after letting the user edit the script. Since SHA-1s are not intuitive
for humans, rebase -i also provides the onelines (commit message
subjects) in the script, purely for the user's convenience.
It is very possible to generate a todo script via different means than
rebase -i and then to let rebase -i run with it; In this case, these
onelines are not required.
And this is where the expand/collapse machinery has a bug: it *expects*
that oneline, and failing to find one reuses the previous SHA-1 as
"oneline".
It was most likely an oversight, and made implementation in the (quite
limiting) shell script language less convoluted. However, we are about
to reimplement performance-critical parts in C (and due to spawning a
git.exe process for every single line of the todo script, the
expansion/collapsing of the SHA-1s *is* performance-hampering on
Windows), therefore let's fix this bug to make cross-validation with the
C version of that functionality possible.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The commands used to be indented, and it is nice to look at, but when we
transform the SHA-1s, the indentation is removed. So let's do away with it.
For the moment, at least: when we will use the upcoming rebase--helper
to transform the SHA-1s, we *will* keep the indentation and can
reintroduce it. Yet, to be able to validate the rebase--helper against
the output of the current shell script version, we need to remove the
extra indentation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The first step of an interactive rebase is to generate the so-called "todo
script", to be stored in the state directory as "git-rebase-todo" and to
be edited by the user.
Originally, we adjusted the output of `git log <options>` using a simple
sed script. Over the course of the years, the code became more
complicated. We now use shell scripting to edit the output of `git log`
conditionally, depending whether to keep "empty" commits (i.e. commits
that do not change any files).
On platforms where shell scripting is not native, this can be a serious
drag. And it opens the door for incompatibilities between platforms when
it comes to shell scripting or to Unix-y commands.
Let's just re-implement the todo script generation in plain C, using the
revision machinery directly.
This is substantially faster, improving the speed relative to the
shell script version of the interactive rebase from 2x to 3x on Windows.
Note that the rearrange_squash() function in git-rebase--interactive
relied on the fact that we set the "format" variable to the config setting
rebase.instructionFormat. Relying on a side effect like this is no good,
hence we explicitly perform that assignment (possibly again) in
rearrange_squash().
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Now that the sequencer learned to process a "normal" interactive rebase,
we use it. The original shell script is still used for "non-normal"
interactive rebases, i.e. when --root or --preserve-merges was passed.
Please note that the --root option (via the $squash_onto variable) needs
special handling only for the very first command, hence it is still okay
to use the helper upon continue/skip.
Also please note that the --no-ff setting is volatile, i.e. when the
interactive rebase is interrupted at any stage, there is no record of
it. Therefore, we have to pass it from the shell script to the
rebase--helper.
Note: the test t3404 had to be adjusted because the the error messages
produced by the sequencer comply with our current convention to start with
a lower-case letter.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git's interactive rebase is still implemented as a shell script, despite
its complexity. This implies that it suffers from the portability point
of view, from lack of expressibility, and of course also from
performance. The latter issue is particularly serious on Windows, where
we pay a hefty price for relying so much on POSIX.
Unfortunately, being such a huge shell script also means that we missed
the train when it would have been relatively easy to port it to C, and
instead piled feature upon feature onto that poor script that originally
never intended to be more than a slightly pimped cherry-pick in a loop.
To open the road toward better performance (in addition to all the other
benefits of C over shell scripts), let's just start *somewhere*.
The approach taken here is to add a builtin helper that at first intends
to take care of the parts of the interactive rebase that are most
affected by the performance penalties mentioned above.
In particular, after we spent all those efforts on preparing the sequencer
to process rebase -i's git-rebase-todo scripts, we implement the `git
rebase -i --continue` functionality as a new builtin, git-rebase--helper.
Once that is in place, we can work gradually on tackling the rest of the
technical debt.
Note that the rebase--helper needs to learn about the transient
--ff/--no-ff options of git-rebase, as the corresponding flag is not
persisted to, and re-read from, the state directory.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This commit starts the rebase of b213911f46 to 49800c9407
This merging rebase was started not because upstream's `maint` branch
advanced (it did not), but to replace a couple of patches with newer
iterations, as sent to the upstream Git project independently. In
particular, those are:
- The patch series merged by fe73baf344
(jeffhostetler/jeffhostetler/quick_add_index_entry) was replaced by
the most recent iteration of 'jh/add-index-entry-optim' (b986df5c35)
- The Pull Request jeffhostetler/jeffhostetler/string_list_realloc
merged via c775bdd100 was replaced by the latest iteration of
'jh/string-list-micro-optim' (950a234cbd)
- The jh/memihash-opt patches merged by 2862058e9d were replaced by the
newest iteration (41b3eb4a6b)
- The bug fix where difftool used a buffer after freeing it
(d33e487771) was replaced by the one that made it into upstream
(882add136f)
- The patch in the `coverity` series that tried to fix a resource leak
in git-am (a5208164e2) was replaced by a better patch submitted by
Ren_ Scharfe (ac8ce18d89)
- The patch in the `coverity` series that tried to fix a resource leak
in the `handle_ssh_variant()` function (f07be76f51) has been dropped,
as a different patch had been accepted into `pu` already
- The rebase-i-extra patches (e1be548aaf) were replaced by the latest
iteration (1d3d10b9e15)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Coverity is a tool to analyze code statically, trying to find common (or
not so common) problems before they occur in production.
Coverity offers its services to Open Source software, and just like
upstream Git, Git for Windows applied and was granted the use.
While Coverity reports a lot of false positives due to Git's (ab-)use of
the FLEX_ARRAY feature (where it declares a 0-byte or 1-byte array at the
end of a struct, and then allocates a variable-length data structure
holding a variable-length string at the end, so that the struct as well as
the string can be released with a single free()), there were a few issues
reported that are true positives, and not all of them were resource leaks
in builtins (for which it is considered kind of okay to not release memory
just before exit() is called anyway).
This topic branch tries to address a couple of those issues.
Note: there are a couple more issues left, either because they are tricky
to resolve (in some cases, the custody of occasionally-allocated memory is
very unclear) or because it is unclear whether they are false positives
(due to the hard-to-reason-about nature of the code). It's a start,
though.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch allows us to specify absolute paths without the drive
prefix e.g. when cloning.
Example:
C:\Users\me> git clone https://github.com/git/git \upstream-git
This will clone into a new directory C:\upstream-git, in line with how
Windows interprets absolute paths.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The split_cmdline() function either assigns an allocated array to the argv
parameter or NULL. In the first case, we have to free() it, in the latter
it does no harm.
Reported via Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
There is really no reason why we would need to hold onto the allocated
string longer than necessary.
Reported by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The buffer allocated by shorten_unambiguous_ref() needs to be released.
Discovered by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the `name_rev()` function is asked to dereference the tip name, it
allocates memory. But when it turns out that another tip already
described the commit better than the current one, we forgot to release
the memory.
Pointed out by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `guess_ref()` returns an allocated buffer of which `make_linked_ref()`
does not take custody (`alloc_ref()` makes a copy), therefore we need to
release the buffer afterwards.
Noticed via Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We free()d the `log` buffer when dwim_log() returned 1, but not when it
returned a larger value (which meant that it still allocated the buffer
but we simply ignored it).
Identified by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When we fail to read, or parse, the file, we still want to close the file
descriptor and release the strbuf.
Reported via Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In case of errors, we really want the file descriptor to be closed.
Discovered by a Coverity scan.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It would appear that we allocate (and forget to release) memory if the
patch ID is not even defined.
Reported by the Coverity tool.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The resource leak only happens in case of an error writing or truncating
the file, therefore it seems less critical, but we should still fix it
nonetheless.
Discovered by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When we could not convert the UTF-8 sequence into Unicode for writing to
the Console, we should not try to write an insanely-long sequence of
invalid wide characters (mistaking the negative return value for an
unsigned length).
Reported by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When stdout is not connected to a Win32 console, we incorrectly used an
uninitialized value for the "plain" character attributes.
Detected by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In the (admittedly, concocted) case that PATH consists only of colons, we
would leak the duplicated string.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When specifying an absolute path without a drive prefix, we convert that
path internally. Let's make sure that we handle that case properly, too
;-)
This fixes the command
git clone https://github.com/git-for-windows/git \G4W
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, there are several categories of absolute paths. One such
category starts with a backslash and is implicitly relative to the
drive associated with the current working directory. Example:
c:
git clone https://github.com/git-for-windows/git \G4W
should clone into C:\G4W.
There is currently a problem with that, in that mingw_mktemp() does not
expect the _wmktemp() function to prefix the absolute path with the
drive prefix, and as a consequence, the resulting path does not fit into
the originally-passed string buffer. The symptom is a "Result too large"
error.
Reported by Juan Carlos Arevalo Baeza.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach do_write_index() to close the index.lock file
before getting the mtime and updating the istate.timestamp
fields.
On Windows, a file's mtime is not updated until the file is
closed. On Linux, the mtime is set after the last flush.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add check for the end of the entries for the thread partition.
Add test for lazy init name hash with specific directory structure
The lazy init hash name was causing a buffer overflow when the last
entry in the index was multiple folder deep with parent folders that
did not have any files in them.
This adds a test for the boundary condition of the thread partitions
with the folder structure that was triggering the buffer overflow.
The fix was to check if it is the last entry for the thread partition
in the handle_range_dir and not try to use the next entry in the cache.
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>