For users' convenience, most rebase commands can be abbreviated, e.g.
'p' instead of 'pick' and 'x' instead of 'exec'. Let's teach the
sequencer to handle those abbreviated commands just fine.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is a huge patch, and at the same time a huge step forward to
execute the performance-critical parts of the interactive rebase in a
builtin command.
Since 'fixup' and 'squash' are not only similar, but also need to know
about each other (we want to reduce a series of fixups/squashes into a
single, final commit message edit, from the user's point of view), we
really have to implement them both at the same time.
Most of the actual work is done by the existing code path that already
handles the "pick" and the "edit" commands; We added support for other
features (e.g. to amend the commit message) in the patches leading up to
this one, yet there are still quite a few bits in this patch that simply
would not make sense as individual patches (such as: determining whether
there was anything to "fix up" in the "todo" script, etc).
In theory, it would be possible to reuse the fast-forward code path also
for the fixup and the squash code paths, but in practice this would make
the code less readable. The end result cannot be fast-forwarded anyway,
therefore let's just extend the cherry-picking code path for now.
Since the sequencer parses the entire `git-rebase-todo` script in one go,
fixup or squash commands without a preceding pick can be reported early
(in git-rebase--interactive, we could only report such errors just before
executing the fixup/squash).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In the interactive rebase, commands that were successfully processed are
not simply discarded, but appended to the 'done' file instead. This is
used e.g. to display the current state to the user in the output of
`git status` or the progress.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When calling `git rebase -i -v`, the user wants to see some statistics
after the commits were rebased. Let's show some.
The strbuf we use to perform that task will be used for other things
in subsequent commits, hence it is declared and initialized in a wider
scope than strictly needed here.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The 'exec' command is a little special among rebase -i's commands, as it
does *not* have a SHA-1 as first parameter. Instead, everything after the
`exec` command is treated as command-line to execute.
Let's reuse the arg/arg_len fields of the todo_item structure (which hold
the oneline for pick/edit commands) to point to the command-line.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch is a straight-forward reimplementation of the `edit`
operation of the interactive rebase command.
Well, not *quite* straight-forward: when stopping, the `edit`
command wants to write the `patch` file (which is not only the
patch, but includes the commit message and author information). To
that end, this patch requires the earlier work that taught the
log-tree machinery to respect the `file` setting of
rev_info->diffopt to write to a file stream different than stdout.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The 'noop' command is probably the most boring of all rebase -i commands
to support in the sequencer.
Which makes it an excellent candidate for this first stab to add support
for rebase -i's commands to the sequencer.
For the moment, let's also treat empty lines and commented-out lines as
'noop'; We will refine that handling later in this patch series.
To make it easier to identify "classes" of todo_commands (such as:
determine whether a command is pick-like, i.e. handles a single commit),
let's enforce a certain order of said commands.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch introduces a new action for the sequencer. It really does not
do a whole lot of its own right now, but lays the ground work for
patches to come. The intention, of course, is to finally make the
sequencer the work horse of the interactive rebase (the original idea
behind the "sequencer" concept).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It is actually not safe to look for a commit message by looking for the
first empty line and skipping it.
The find_commit_subject() function looks more carefully, so let's use
it. Since we are interested in the entire commit message, we re-compute
the string length after verifying that the commit subject is not empty
(in which case the entire commit message would be empty, something that
should not happen but that we want to handle gracefully).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It is the current coding style of the Git project to write
if (...) {
...
} else {
...
}
instead of putting the closing brace and the "else" keyword on separate
lines.
Pointed out by Junio Hamano.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This was noticed while addressing Junio Hamano's concern that some
"else" operators were on separate lines than the preceding closing
brace.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is uglier than a simple
touch "$GIT_EXEC_PATH/use-builtin-difftool"
of course. But oh well.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch gives life to the skeleton added in the previous patch.
The motivation for converting the difftool is that Perl scripts are not at
all native on Windows, and that `git difftool` therefore is pretty slow on
that platform, when there is no good reason for it to be slow.
In addition, Perl does not really have access to Git's internals. That
means that any script will always have to jump through unnecessary
hoops.
The current version of the builtin difftool does not, however, make full
use of the internals but instead chooses to spawn a couple of Git
processes, still, to make for an easier conversion. There remains a lot
of room for improvement, left for a later date.
Note: to play it safe, the original difftool is still called unless the
config setting difftool.useBuiltin is set to true.
The reason: this new, experimental, builtin difftool will be shipped as
part of Git for Windows v2.11.0, to allow for easier large-scale
testing, but of course as an opt-in feature.
Sadly, the speedup is more noticable on Linux than on Windows: a quick
test shows that t7800-difftool.sh runs in (2.183s/0.052s/0.108s)
(real/user/sys) in a Linux VM, down from (6.529s/3.112s/0.644s), while
on Windows, it is (36.064s/2.730s/7.194s), down from
(47.637s/2.407s/6.863s). The culprit is most likely the overhead
incurred from *still* having to shell out to mergetool-lib.sh and
difftool--helper.sh.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This adds a builtin difftool that still falls back to the legacy Perl
version, which has been renamed to `legacy-difftool`.
The idea is that the new, experimental, builtin difftool immediately hands
off to the legacy difftool for now, unless the config variable
difftool.useBuiltin is set to true.
This feature flag will be used in the upcoming Git for Windows v2.11.0
release, to allow early testers to opt-in to use the builtin difftool and
flesh out any bugs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It is common in corporate setups to have permissions managed via a
domain account. That means that the user does not really have to log in
when accessing a central repository via https://, but that the login
credentials are used to authenticate with that repository.
The common way to do that used to require empty credentials, i.e. hitting
Enter twice when being asked for user name and password, or by using the
very funny notation https://:@server/repository
A recent commit (5275c3081c (http: http.emptyauth should allow empty (not
just NULL) usernames, 2016-10-04)) broke that usage, though, all of a
sudden requiring users to set http.emptyAuth = true.
Which brings us to the bigger question why http.emptyAuth defaults to
false, to begin with.
It would be one thing if cURL would not let the user specify credentials
interactively after attempting NTLM authentication (i.e. login
credentials), but that is not the case.
It would be another thing if attempting NTLM authentication was not
usually what users need to do when trying to authenticate via https://.
But that is also not the case.
So let's just go ahead and change the default, and unbreak the NTLM
authentication. As a bonus, this also makes the "you need to hit Enter
twice" (which is hard to explain: why enter empty credentials when you
want to authenticate with your login credentials?) and the ":@" hack
(which is also pretty, pretty hard to explain to users) obsolete.
This fixes https://github.com/git-for-windows/git/issues/987
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach add_index_entry_with_check() and has_dir_name()
to see if the path of the new item is greater than the
last path in the index array before attempting to search
for it.
This is a performance optimization.
During checkout, merge_working_tree() populates the new
index in sorted order, so this change saves at least 2
lookups per file.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
This is a performance optimization.
Teach do_read_index() to call verify_hdr() using a thread
and allow SHA1 verification to run concurrently with the
parsing of index-entries and extensions.
For large index files, this cuts the startup time in half.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
In the years-old effort to clean up the PATH a bit, we deprecated dashed
invocations.
But we did not hold ourselves to that. Until now.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The original idea of deprecating invocations of Git subcommands via
their dashed form was to be able to ship without having to hard-link
each and every builtin to its dashed form.
We need to follow this plan ourselves, by patching our very own scripts
accordingly.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A long time ago, in turn a long time after introducing builtins and
moving Git's scripts to libexec/git-core/, we deprecated the invocation
of dashed commands.
The merge-octopus script is a holdover from the time before that, as it
still tries to invoke a builtin by its dashed form. Let's just not.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The invocation of dashed Git commands was rightfully deprecated a long
time ago. We failed to heed that deprecation ourselves, but it is never
too late...
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We deprecated the dashed invocations ages ago. Might just as well hold
ourselves to that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We started deprecating the dashed form a long time ago, advertising the
fact that dashed invocations of Git commands are deprecated for several
major versions.
Yet we still used them ourselves.
With ashes on our heads, we now start to finally get rid of those calls.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is one of the few places where Git violates its own deprecation of
the dashed form. It is not necessary, either.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach "add" to use preload-index and fscache features
to improve performance on very large repositories.
During an "add", a call is made to run_diff_files()
which calls check_remove() for each index-entry. This
calls lstat(). On Windows, the fscache code intercepts
the lstat() calls and builds a private cache using the
FindFirst/FindNext routines, which are much faster.
Somewhat independent of this, is the preload-index code
which distributes some of the start-up costs across
multiple threads.
We need to keep the call to read_cache() before parsing the
pathspecs (and hence cannot use the pathspecs to limit any preload)
because parse_pathspec() is using the index to determine whether a
pathspec is, in fact, in a submodule. If we would not read the index
first, parse_pathspec() would not error out on a path that is inside
a submodule, and t7400-submodule-basic.sh would fail with
not ok 47 - do not add files from a submodule
We still want the nice preload performance boost, though, so we simply
call read_cache_preload(&pathspecs) after parsing the pathspecs.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach hash_dir_entry() to remember the previously found dir_entry
during lazy_init_name_hash() iteration. This is a performance
optimization. Since items in the index array are sorted by full
pathname, adjacent items are likely to be in the same directory.
This can save memihash() computations and HashMap lookups.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
The purpose of this function is to stat() the files listed in the index
in a multi-threaded fashion. It is called directly after reading the
index in the read_index_preloaded() function.
However, in some cases we may want to separate the index reading from
the preloading step, e.g. in builtin/add.c, where we need to load the
index before we parse the pathspecs (which needs to error out if one of
the pathspecs refers to a path within a submodule, for which the index
must have been read already), and only then will we want to preload,
possibly limited by the just-parsed pathspecs.
So let's just export that function to allow calling it separately.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Specify an initial size for the istate.dir_hash HashMap matching
the size of the istate.name_hash.
Previously hashmap_init() was given 0, causing a 64 bucket
hashmap to be created. When working with very large
repositories, this would cause numerous rehash() calls to
realloc and rebalance the hashmap. This is especially true
when the worktree is deep, with many directories containing
a few files.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Precompute the istate.name_hash and istate.dir_hash values
for each cache-entry during the preload-index phase.
Move the expensive memihash() calculations from lazy_init_name_hash()
to the multi-threaded preload-index phase.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add variant of memihash() to allow the hash computation to
be continued. There are times when we compute the hash on
a full path and then the hash on just the path to the parent
directory. This can be expensive on large repositories.
With this, we can hash the parent directory first. And then
continue the computation to include the "/filename".
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Remove duplicate memihash() call in hash_dir_entry().
The existing code called memihash() to do the find_dir_entry()
and it not found, called memihash() again to do the hashmap_add().
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach preload-index to avoid lstat() calls for index-entries
with skip-worktree bit set. This is a performance optimization.
During a sparse-checkout, the skip-worktree bit is set on items
that were not populated and therefore are not present in the
worktree. The per-thread preload-index loop performs a series
of tests on each index-entry as it attempts to compare the
worktree version with the index and mark them up-to-date.
This patch short-cuts that work.
On a Windows 10 system with a very large repo (450MB index)
and various levels of sparseness, performance was improved
in the {preloadindex=true, fscache=false} case by 80% and
in the {preloadindex=true, fscache=true} case by 20% for various
commands.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
When using cvsnt + msys + git, it seems like the output of cvs status
had \r\n in it, and caused the command to fail.
This fixes that.
Signed-off-by: Dustin Spicuzza <dustin@virtualroadside.com>
Just like with other Git commands, this option makes it read the paths
from the standard input. It comes in handy when resetting many, many
paths at once and wildcards are not an option (e.g. when the paths are
generated by a tool).
Note: we first parse the entire list and perform the actual reset action
only in a second phase. Not only does this make things simpler, it also
helps performance, as do_diff_cache() traverses the index and the
(sorted) pathspecs in simultaneously to avoid unnecessary lookups.
This feature is marked experimental because it is still under review in
the upstream Git project.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, strftime() does not silently ignore invalid formats, but
warns about them and then returns 0 and sets errno to EINVAL.
Unfortunately, Git does not expect such a behavior, as it disagrees
with strftime()'s semantics on Linux. As a consequence, Git
misinterprets the return value 0 as "I need more space" and grows the
buffer. As the larger buffer does not fix the format, the buffer grows
and grows and grows until we are out of memory and abort.
Ideally, we would switch off the parameter validation just for
strftime(), but we cannot even override the invalid parameter handler
via _set_thread_local_invalid_parameter_handler() using MINGW because
that function is not declared. Even _set_invalid_parameter_handler(),
which *is* declared, does not help, as it simply does... nothing.
So let's just bite the bullet and override strftime() for MINGW and
abort on an invalid format string. While this does not provide the
best user experience, it is the best we can do.
See https://msdn.microsoft.com/en-us/library/fe06s4ak.aspx for more
details.
This fixes https://github.com/git-for-windows/git/issues/863
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In particular when local tags are used (or tags that are pushed to some
fork) to build Git, it is very hard to figure out from which particular
revision a particular Git executable was built.
Let's just report that in our build options.
We need to be careful, though, to report when the current commit cannot be
determined, e.g. when building from a tarball without any associated Git
repository.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add preliminary support for detection of the build plaform, and reporting
of same with the `git version --build-options' command. This can be useful
for bug reporting, to distinguish between 32 and 64-bit builds for
example.
The current implementation can only distinguish between x86 and x86_64.
This will be extended in future patches. In addition, all 32-bit variants
(i686, i586, etc.) are collapsed into `x86'. An example of the output is:
$ git version --build-options
git version 2.9.3.windows.2.826.g06c0f2f
sizeof-long: 4
machine: x86_64
The label of `machine' was chosen so the new information will approximate
the output of `uname -m'.
Signed-off-by: Adric Norris <landstander668@gmail.com>
This is a brown paper bag. When adding the tests, we actually failed
to verify that the config variable is heeded in git-init at all. And
when changing the original patch that marked the .git/ directory as
hidden after reading the config, it was lost on this developer that
the new code would use the hide_dotfiles variable before the config
was read.
The fix is obvious: read the (limited, pre-init) config *before*
creating the .git/ directory.
This fixes https://github.com/git-for-windows/git/issues/789
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch introduces a highly-experimental feature allowing to
override stdin/stdout/stderr by setting environment variables e.g. to
named pipes, solving a problem in highly multi-threaded applications
where inheritable handles could cause blocked Git operations.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When a third-party tool periodically runs `git status` in order to keep
track of the state of the working tree, it is a bad idea to lock the
index: it might interfere with interactive commands executed by the
user, e.g. when the user wants to commit files.
Let's introduce the option `--no-lock-index` to prevent such problems.
The idea is that the third-party tool calls `git status` with this
option, preventing it from ever updating the index.
The downside is that the periodic `git status` calls will be a little
bit more wasteful because they may have to refresh the index repeatedly,
only to throw away the updates when it exits. This cannot really be
helped, though, as tools wanting to get a periodic update of the status
have no way to predict when the user may want to lock the index herself.
Note that the regression test added in this commit does not *really*
verify that no index.lock file was written; that test is not possible in
a portable way. Instead, we verify that .git/index is rewritten *only*
when `git status` is run without `--no-lock-index`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
issue #790: git bundle create does not close handle to *.lock file
This problem happens when an user tries to create an empty bundle, using the
following command: `git bundle create <bundle> <revlist>` and when <revlist>
resolve to an empty list (for example, like `master..master`), `git bundle` fails
and warn the user about how it don't want to create empty bundle.
In that case, git tries to delete the `<bundle>.lock` file, and since there's still
an open file handle, fails to do so and ask the user if it should retry (which will
fail again).
The lock can still be deleted manually by the user (and it is required if the user
want to create a bundle after revising his rev-list).
Signed-off-by: Gaël Lhez <gael.lhez@gmail.com>
On Windows, files cannot be removed nor renamed if there are still
handles held by a process. To remedy that, we introduced the
close_all_packs() function.
Earlier, we made sure that the packs are released just before `git gc`
is spawned, in case that gc wants to remove no-longer needed packs.
But this developer forgot that gc itself also needs to let go of packs,
e.g. when consolidating all packs via the --aggressive option.
Likewise, `git repack -d` wants to delete obsolete packs and therefore
needs to close all pack handles, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This branch introduces support for reading the "Windows-wide" Git
configuration from `%PROGRAMDATA%\Git\config`. As these settings are
intended to be shared between *all* Git-related software, that config
file takes an even lower precedence than `$(prefix)/etc/gitconfig`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>