The --filters option applies the convert_to_working_tree() filter for
the path when showing the contents of a regular file blob object.
This feature comes in handy when a 3rd-party tool wants to work with
the contents of files from past revisions as if they had been checked
out, but without detouring via temporary files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The new regexec_buf() function operates on buffers with an explicitly
specified length, rather than NUL-terminated strings.
We need to use this function whenever the buffer we want to pass to
regexec() may have been mmap()ed (and is hence not NUL-terminated).
Note: the original motivation for this patch was to fix a bug where
`git diff -G <regex>` would crash. This patch converts more callers,
though, some of which explicitly allocated and constructed
NUL-terminated strings (or worse: modified read-only buffers to insert
NULs).
Some of the buffers actually may be NUL-terminated. As regexec_buf()
uses REG_STARTEND where available, but has to fall back to allocating
and constructing NUL-terminated strings where REG_STARTEND is not
available, this makes the code less efficient in the latter case.
However, given the widespread support for REG_STARTEND, combined with
the improved ease of code maintenance, we strike the balance in favor
of REG_STARTEND.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We just introduced a test that demonstrates that our sloppy use of
regexec() on a mmap()ed area can result in incorrect results or even
hard crashes.
So what we need to fix this is a function that calls regexec() on a
length-delimited, rather than a NUL-terminated, string.
Happily, there is an extension to regexec() introduced by the NetBSD
project and present in all major regex implementation including
Linux', MacOSX' and the one Git includes in compat/regex/: by using
the (non-POSIX) REG_STARTEND flag, it is possible to tell the
regexec() function that it should only look at the offsets between
pmatch[0].rm_so and pmatch[0].rm_eo.
That is exactly what we need.
Since support for REG_STARTEND is so widespread by now, let's just
introduce a helper function that uses it, and fall back to allocating
and constructing a NUL-terminated when REG_STARTEND is not available.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When our pickaxe code feeds file contents to regexec(), it implicitly
assumes that the file contents are read into implicitly NUL-terminated
buffers (i.e. that we overallocate by 1, appending a single '\0').
This is not so.
In particular when the file contents are simply mmap()ed, we can be
virtually certain that the buffer is preceding uninitialized bytes, or
invalid pages.
Note that the test we add here is known to be flakey: we simply cannot
know whether the byte following the mmap()ed ones is a NUL or not.
Typically, on Linux the test passes. On Windows, it fails virtually
every time due to an access violation (that's a segmentation fault for
you Unix-y people out there). And Windows would be correct: the
regexec() call wants to operate on a regular, NUL-terminated string,
there is no NUL in the mmap()ed memory range, and it is undefined
whether the next byte is even legal to access.
When run with --valgrind it demonstrates quite clearly the breakage, of
course.
Being marked with `test_expect_failure`, this test will sometimes be
declare "TODO fixed", even if it only passes by mistake.
This test case represents a Minimal, Complete and Verifiable Example of
a breakage reported by Chris Sidi.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This operation has quadratic complexity, which is especially painful
on Windows, where shell scripts are *already* slow (mainly due to the
overhead of the POSIX emulation layer).
Let's reimplement this with linear complexity (using a hash map to
match the commits' subject lines) for the common case; Sadly, the
fixup/squash feature's design neglected performance considerations,
allowing arbitrary prefixes (read: `fixup! hell` will match the
commit subject `hello world`), which means that we are stuck with
quadratic performance in the worst case.
The reimplemented logic also happens to fix a bug where commented-out
lines (representing empty patches) were dropped by the previous code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `git commit --fixup` command unwraps wrapped onelines when
constructing the commit message, without wrapping the result.
We need to make sure that `git rebase --autosquash` keeps handling such
cases correctly, in particular since we are about to move the autosquash
handling into the rebase--helper.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.
Note: The original code did not try to skip unnecessary picks of root
commits but punts instead (probably --root was not considered common
enough of a use case to bother optimizing). We do the same, for now.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These tests were a bit anal about the *exact* warning/error message
printed by git rebase. But those messages are intended for the *end
user*, therefore it does not make sense to test so rigidly for the
*exact* wording.
In the following, we will reimplement the missing commits check in
the sequencer, with slightly different words.
So let's just test for the parts in the warning/error message that
we *really* care about, nothing more, nothing less.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is crucial to improve performance on Windows, as the speed is now
mostly dominated by the SHA-1 transformation (because it spawns a new
rev-parse process for *every* line, and spawning processes is pretty
slow from Git for Windows' MSYS2 Bash).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
To avoid problems with short SHA-1s that become non-unique during the
rebase, we rewrite the todo script with short/long SHA-1s before and
after letting the user edit the script. Since SHA-1s are not intuitive
for humans, rebase -i also provides the onelines (commit message
subjects) in the script, purely for the user's convenience.
It is very possible to generate a todo script via different means than
rebase -i and then to let rebase -i run with it; In this case, these
onelines are not required.
And this is where the expand/collapse machinery has a bug: it *expects*
that oneline, and failing to find one reuses the previous SHA-1 as
"oneline".
It was most likely an oversight, and made implementation in the (quite
limiting) shell script language less convoluted. However, we are about
to reimplement performance-critical parts in C (and due to spawning a
git.exe process for every single line of the todo script, the
expansion/collapsing of the SHA-1s *is* performance-hampering on
Windows), therefore let's fix this bug to make cross-validation with the
C version of that functionality possible.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The commands used to be indented, and it is nice to look at, but when we
transform the SHA-1s, the indentation is removed. So let's do away with it.
For the moment, at least: when we will use the upcoming rebase--helper
to transform the SHA-1s, we *will* keep the indentation and can
reintroduce it. Yet, to be able to validate the rebase--helper against
the output of the current shell script version, we need to remove the
extra indentation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The first step of an interactive rebase is to generate the so-called "todo
script", to be stored in the state directory as "git-rebase-todo" and to
be edited by the user.
Originally, we adjusted the output of `git log <options>` using a simple
sed script. Over the course of the years, the code became more
complicated. We now use shell scripting to edit the output of `git log`
conditionally, depending whether to keep "empty" commits (i.e. commits
that do not change any files).
On platforms where shell scripting is not native, this can be a serious
drag. And it opens the door for incompatibilities between platforms when
it comes to shell scripting or to Unix-y commands.
Let's just re-implement the todo script generation in plain C, using the
revision machinery directly.
This is substantially faster, improving the speed relative to the
shell script version of the interactive rebase from 2x to 3x on Windows.
Note that the rearrange_squash() function in git-rebase--interactive
relied on the fact that we set the "format" variable to the config setting
rebase.instructionFormat. Relying on a side effect like this is no good,
hence we explicitly perform that assignment (possibly again) in
rearrange_squash().
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Now that the sequencer learned to process a "normal" interactive rebase,
we use it. The original shell script is still used for "non-normal"
interactive rebases, i.e. when --root or --preserve-merges was passed.
Please note that the --root option (via the $squash_onto variable) needs
special handling only for the very first command, hence it is still okay
to use the helper upon continue/skip.
Also please note that the --no-ff setting is volatile, i.e. when the
interactive rebase is interrupted at any stage, there is no record of
it. Therefore, we have to pass it from the shell script to the
rebase--helper.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git's interactive rebase is still implemented as a shell script, despite
its complexity. This implies that it suffers from the portability point
of view, from lack of expressibility, and of course also from
performance. The latter issue is particularly serious on Windows, where
we pay a hefty price for relying so much on POSIX.
Unfortunately, being such a huge shell script also means that we missed
the train when it would have been relatively easy to port it to C, and
instead piled feature upon feature onto that poor script that originally
never intended to be more than a slightly pimped cherry-pick in a loop.
To open the road toward better performance (in addition to all the other
benefits of C over shell scripts), let's just start *somewhere*.
The approach taken here is to add a builtin helper that at first intends
to take care of the parts of the interactive rebase that are most
affected by the performance penalties mentioned above.
In particular, after we spent all those efforts on preparing the sequencer
to process rebase -i's git-rebase-todo scripts, we implement the `git
rebase -i --continue` functionality as a new builtin, git-rebase--helper.
Once that is in place, we can work gradually on tackling the rest of the
technical debt.
Note that the rebase--helper needs to learn about the transient
--ff/--no-ff options of git-rebase, as the corresponding flag is not
persisted to, and re-read from, the state directory.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The shell script version of the interactive rebase has a very specific
final message. Teach the sequencer to print the same.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
For the benefit of e.g. the shell prompt, the interactive rebase not
only displays the progress for the user to see, but also writes it into
the msgnum/end files in the state directory.
Teach the sequencer this new trick.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The interactive rebase keeps the user informed about its progress.
If the sequencer wants to do the grunt work of the interactive
rebase, it also needs to show that progress.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is the behavior of the shell script version of the interactive
rebase, by using the `output` function defined in `git-rebase.sh`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is the behavior of the shell script version of the interactive
rebase, by using the `output` function defined in `git-rebase.sh`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This will be needed to hide the output of `git commit` when the
sequencer handles an interactive rebase's script.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In the upcoming patch, we will support rebase -i's progress
reporting. The progress skips comments but counts 'noop's.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The parsing part of a 'drop' command is almost identical to parsing a
'pick', while the operation is the same as that of a 'noop'.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The interactive rebase has the very special magic that a cherry-pick
that exits with a status different from 0 and 1 signifies a failure to
even record that a cherry-pick was started.
This can happen e.g. when a fast-forward fails because it would
overwrite untracked files.
In that case, we must reschedule the command that we thought we already
had at least started successfully.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The sequencer already has an idea about using different merge
strategies. We just piggy-back on top of that, using rebase -i's
own settings, when running the sequencer in interactive rebase mode.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git's `rebase` command inspects the `rebase.autostash` config setting
to determine whether it should stash any uncommitted changes before
rebasing and re-apply them afterwards.
As we introduce more bits and pieces to let the sequencer act as
interactive rebase's backend, here is the part that adds support for
the autostash feature.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When continuing after a `pick` command failed, we want that commit
to show up in the rewritten-list (and its notes to be rewritten), too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When rebasing commits that have commit notes attached, the interactive
rebase rewrites those notes faithfully at the end. The sequencer must
do this, too, if it wishes to do interactive rebase's job.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We already used the same reflog message as the scripted version of rebase
-i when finishing. With this commit, we do that also for all the commands
before that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This makes the code DRYer, with the obvious benefit that we can enhance
the code further in a single place.
We can also reuse the functionality elsewhere by calling this new
function.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The sequencer already knew how to fast-forward instead of
cherry-picking, if possible.
We want to continue to do this, of course, but in case of the 'reword'
command, we will need to call `git commit` after fast-forwarding.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is now trivial, as all the building blocks are in place: all we need
to do is to flip the "edit" switch when committing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When doing an interactive rebase, we want to leave a 'patch' file for
further inspection by the user (even if we never tried to actually apply
that patch, since we're cherry-picking instead).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
An interactive rebase operates on a detached HEAD (to keep the reflog
of the original branch relatively clean), and updates the branch only
at the end.
Now that the sequencer learns to perform interactive rebases, it also
needs to learn the trick to update the branch before removing the
directory containing the state of the interactive rebase.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the last command of an interactive rebase fails, the user needs to
resolve the problem and then continue the interactive rebase. Naturally,
the todo script is empty by then. So let's not complain about that!
To that end, let's move that test out of the function that parses the
todo script, and into the more high-level function read_populate_todo().
This is also necessary by now because the lower-level parse_insn_buffer()
has no idea whether we are performing an interactive rebase or not.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When a cherry-pick continues without a "todo script", the intention is
simply to pick a single commit.
However, when an interactive rebase is continued without a "todo
script", it means that the last command has been completed and that we
now need to clean up.
This commit guards the revert/cherry-pick specific steps so that they
are not executed in rebase -i mode.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When an interactive rebase is interrupted, the user may stage changes
before continuing, and we need to commit those changes in that case.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the interactive rebase aborts, it writes out an author-script file
to record the author information for the current commit. As we are about
to teach the sequencer how to perform the actions behind an interactive
rebase, it needs to write those author-script files, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
For users' convenience, most rebase commands can be abbreviated, e.g.
'p' instead of 'pick' and 'x' instead of 'exec'. Let's teach the
sequencer to handle those abbreviated commands just fine.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is a huge patch, and at the same time a huge step forward to
execute the performance-critical parts of the interactive rebase in a
builtin command.
Since 'fixup' and 'squash' are not only similar, but also need to know
about each other (we want to reduce a series of fixups/squashes into a
single, final commit message edit, from the user's point of view), we
really have to implement them both at the same time.
Most of the actual work is done by the existing code path that already
handles the "pick" and the "edit" commands; We added support for other
features (e.g. to amend the commit message) in the patches leading up to
this one, yet there are still quite a few bits in this patch that simply
would not make sense as individual patches (such as: determining whether
there was anything to "fix up" in the "todo" script, etc).
In theory, it would be possible to reuse the fast-forward code path also
for the fixup and the squash code paths, but in practice this would make
the code less readable. The end result cannot be fast-forwarded anyway,
therefore let's just extend the cherry-picking code path for now.
Since the sequencer parses the entire `git-rebase-todo` script in one go,
fixup or squash commands without a preceding pick can be reported early
(in git-rebase--interactive, we could only report such errors just before
executing the fixup/squash).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In the interactive rebase, commands that were successfully processed are
not simply discarded, but appended to the 'done' file instead. This is
used e.g. to display the current state to the user in the output of
`git status` or the progress.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When calling `git rebase -i -v`, the user wants to see some statistics
after the commits were rebased. Let's show some.
The strbuf we use to perform that task will be used for other things
in subsequent commits, hence it is declared and initialized in a wider
scope than strictly needed here.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The 'exec' command is a little special among rebase -i's commands, as it
does *not* have a SHA-1 as first parameter. Instead, everything after the
`exec` command is treated as command-line to execute.
Let's reuse the arg/arg_len fields of the todo_item structure (which hold
the oneline for pick/edit commands) to point to the command-line.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>