Commit Graph

71042 Commits

Author SHA1 Message Date
Johannes Schindelin
328dc9a1fb t9001: work around hard-to-debug hangs
Just like the workaround we added for t9116, t9001.83 hangs sometimes --
but not always! -- when being run in the Git for Windows SDK.

The issue seems to be related to redirection via a pipe, but it is really
hard to diagnose, what with git.exe (a non-MSYS2 program) calling a Perl
script (which is executed by an MSYS2 Perl), piping into another MSYS2
program.

As hunting time is scarce these days, simply work around this for now and
leave the real diagnosis and resolution for later.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:39:39 +02:00
Johannes Schindelin
57c1ac75c9 t9116: work around hard-to-debug hangs
As of a couple of weeks ago, t9116 hangs sometimes -- but not always! --
when being run in the Git for Windows SDK.

The issue seems to be related to redirection via a pipe, but it is really
hard to diagnose, what with git.exe (a non-MSYS2 program) calling a Perl
script (which is executed by an MSYS2 Perl), piping into another MSYS2
program.

As hunting time is scarce these days, simply work around this for now and
leave the real diagnosis and resolution for later.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:39:39 +02:00
Johannes Schindelin
0626acfdd7 Merge branch 'interactive-rebase-current'
This series of branches introduces the git-rebase--helper, a builtin
helping to accelerate the interactive rebase dramatically.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:39:39 +02:00
Johannes Schindelin
6645271595 Merge pull request #1006 from segevfiner/git-ssh-command-putty
connect: recognize [tortoise]plink in GIT_SSH_COMMAND
2017-04-26 17:39:38 +02:00
Johannes Schindelin
45d791b950 Merge branch 'rebase-i-extra-v3'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:39:22 +02:00
Segev Finer
155bd38f1f connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
This environment variable and configuration value allow to
override the autodetection of plink/tortoiseplink in case that
Git gets it wrong.

[jes: wrapped overly-long lines, factored out and changed
get_ssh_variant() to handle_ssh_variant() to accomodate the
change from the putty/tortoiseplink variables to
port_option/needs_batch, adjusted the documentation, free()d
value obtained from the config.]

Signed-off-by: Segev Finer <segev208@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:39:17 +02:00
Johannes Schindelin
615c64f028 git_connect(): factor out SSH variant handling
We handle plink and tortoiseplink as OpenSSH replacements, by passing
the correct command-line options when detecting that they are used.

To let users override that auto-detection (in case Git gets it wrong),
we need to introduce new code to that end.

In preparation for this code, let's factor out the SSH variant handling
into its own function, handle_ssh_variant().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:39:17 +02:00
Johannes Schindelin
1fb0047424 Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
diffcore-rename: speed up register_rename_src
2017-04-26 17:39:16 +02:00
Junio C Hamano
4ef8663f61 connect: rename tortoiseplink and putty variables
One of these two may have originally been named after "what exact
SSH implementation do we have" so that we can tweak the command line
options, but these days "putty=1" no longer means "We are using the
plink SSH implementation that comes with PuTTY".  It is set when we
guess that either PuTTY plink or Tortoiseplink is in use.

Rename them after what effect is desired.  The current "putty"
option is about using "-P <port>" when OpenSSH would use "-p <port>",
so rename it to port_option whose value is either 'p' or 'P".  The
other one is about passing an extra command line option "-batch",
so rename it needs_batch.

[jes: wrapped overly-long line]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:39:16 +02:00
Segev Finer
ca55277637 connect: handle putty/plink also in GIT_SSH_COMMAND
Git for Windows has special support for the popular SSH client PuTTY:
when using PuTTY's non-interactive version ("plink.exe"), we use the -P
option to specify the port rather than OpenSSH's -p option. TortoiseGit
ships with its own, forked version of plink.exe, that adds support for
the -batch option, and for good measure we special-case that, too.

However, this special-casing of PuTTY only covers the case where the
user overrides the SSH command via the environment variable GIT_SSH
(which allows specifying the name of the executable), not
GIT_SSH_COMMAND (which allows specifying a full command, including
additional command-line options).

When users want to pass any additional arguments to (Tortoise-)Plink,
such as setting a private key, they are required to either use a shell
script named plink or tortoiseplink or duplicate the logic that is
already in Git for passing the correct style of command line arguments,
which can be difficult, error prone and annoying to get right.

This patch simply reuses the existing logic and expands it to cover
GIT_SSH_COMMAND, too.

Note: it may look a little heavy-handed to duplicate the entire
command-line and then split it, only to extract the name of the
executable. However, this is not a performance-critical code path, and
the code is much more readable this way.

Signed-off-by: Segev Finer <segev208@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:39:16 +02:00
Johannes Schindelin
084d35cb63 Merge pull request #1004 from whoisj/nolock-env
Carry non-locking status value in the environment.
2017-04-26 17:39:15 +02:00
Jeff Hostetler
29bf1a4a1d diffcore-rename: speed up register_rename_src
Teach register_rename_src() to see if new file pair
can simply be appended to the rename_src[] array before
performing the binary search to find the proper insertion
point.

This is a performance optimization.  This routine is called
during run_diff_files in status and the caller is iterating
over the sorted index, so we should expect to be able to
append in the normal case.  The existing insert logic is
preserved so we don't have to assume that, but simply take
advantage of it if possible.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2017-04-26 17:39:14 +02:00
J Wyman
4db79580fc Carry non-locking status value in the environment.
If the user has specified '--no-lock-index' when calling git-status, it only seems reasonable that the user intends that option to be carried through to any child forks/procs as well. Currently, the '--no-lock-status' call is lost when submodules are checked. This change places the desired option into the environment, which is in turn passed down to all subsequent children.

With cmd_status checking for '--no-lock--status' first from args then from environment, we're able to keep the option set in all children.

Signed-off-by: J Wyman <jeremy.wyman@microsoft.com>
2017-04-26 17:39:13 +02:00
Johannes Schindelin
9a2e4defb8 Merge pull request #1003 from shoelzer/master
poll: Use GetTickCount64 to avoid wraparound issues
2017-04-26 17:39:11 +02:00
Johannes Schindelin
c3792db197 poll: lazy-load GetTickCount64()
This fixes the compilation, actually, as we still did not make the jump to
post-Windows XP completely: we still compile with _WIN32_WINNT set to
0x0502 (which corresponds to Windows Server 2003 and is technically
greater than Windows XP's 0x0501).

However, GetTickCount64() is only available starting with Windows
Vista/Windows Server 2008.

Let's just lazy-load the function, which should also help Git for Windows
contributors who want to reinstate Windows XP support.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:39:10 +02:00
Steve Hoelzer
0502ba1478 poll: Use GetTickCount64 to avoid wraparound issues
From Visual Studio 2015 Code Analysis: Warning C28159 Consider using
'GetTickCount64' instead of 'GetTickCount'.

Reason: GetTickCount overflows roughly every 49 days. Code that does not
take that into account can loop indefinitely. GetTickCount64 operates on
64 bit values and does not have that problem.

Signed-off-by: Steve Hoelzer <shoelzer@gmail.com>
2017-04-26 17:39:10 +02:00
Johannes Schindelin
8b6d41da75 Merge pull request #159 from dscho/vagrant
Add Vagrant support (easy Linux VM setup)
2017-04-26 17:39:09 +02:00
Johannes Schindelin
c1dca8fe87 Merge branch 'jh/string-list-micro-optim'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:37:58 +02:00
Johannes Schindelin
c59781de15 Merge branch 'jh/add-index-entry-optim'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 17:36:49 +02:00
Johannes Schindelin
68f9bb0639 Merge 'rebase--helper' into HEAD 2017-04-26 16:58:55 +02:00
Johannes Schindelin
730f6138e2 rebase -i: rearrange fixup/squash lines using the rebase--helper
This operation has quadratic complexity, which is especially painful
on Windows, where shell scripts are *already* slow (mainly due to the
overhead of the POSIX emulation layer).

Let's reimplement this with linear complexity (using a hash map to
match the commits' subject lines) for the common case; Sadly, the
fixup/squash feature's design neglected performance considerations,
allowing arbitrary prefixes (read: `fixup! hell` will match the
commit subject `hello world`), which means that we are stuck with
quadratic performance in the worst case.

The reimplemented logic also happens to fix a bug where commented-out
lines (representing empty patches) were dropped by the previous code.

While at it, clarify how the fixup/squash feature works in `git rebase
-i`'s man page.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:43 +02:00
Johannes Schindelin
e36215d28e t3415: test fixup with wrapped oneline
The `git commit --fixup` command unwraps wrapped onelines when
constructing the commit message, without wrapping the result.

We need to make sure that `git rebase --autosquash` keeps handling such
cases correctly, in particular since we are about to move the autosquash
handling into the rebase--helper.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:43 +02:00
Johannes Schindelin
278175f8bf rebase -i: skip unnecessary picks using the rebase--helper
In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Note: The original code did not try to skip unnecessary picks of root
commits but punts instead (probably --root was not considered common
enough of a use case to bother optimizing). We do the same, for now.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:43 +02:00
Johannes Schindelin
e41434c487 rebase -i: check for missing commits in the rebase--helper
In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:43 +02:00
Johannes Schindelin
6969ef475f t3404: relax rebase.missingCommitsCheck tests
These tests were a bit anal about the *exact* warning/error message
printed by git rebase. But those messages are intended for the *end
user*, therefore it does not make sense to test so rigidly for the
*exact* wording.

In the following, we will reimplement the missing commits check in
the sequencer, with slightly different words.

So let's just test for the parts in the warning/error message that
we *really* care about, nothing more, nothing less.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:42 +02:00
Johannes Schindelin
7ec5013a00 rebase -i: also expand/collapse the SHA-1s via the rebase--helper
This is crucial to improve performance on Windows, as the speed is now
mostly dominated by the SHA-1 transformation (because it spawns a new
rev-parse process for *every* line, and spawning processes is pretty
slow from Git for Windows' MSYS2 Bash).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:42 +02:00
Johannes Schindelin
0ab16fa130 rebase -i: do not invent onelines when expanding/collapsing SHA-1s
To avoid problems with short SHA-1s that become non-unique during the
rebase, we rewrite the todo script with short/long SHA-1s before and
after letting the user edit the script. Since SHA-1s are not intuitive
for humans, rebase -i also provides the onelines (commit message
subjects) in the script, purely for the user's convenience.

It is very possible to generate a todo script via different means than
rebase -i and then to let rebase -i run with it; In this case, these
onelines are not required.

And this is where the expand/collapse machinery has a bug: it *expects*
that oneline, and failing to find one reuses the previous SHA-1 as
"oneline".

It was most likely an oversight, and made implementation in the (quite
limiting) shell script language less convoluted. However, we are about
to reimplement performance-critical parts in C (and due to spawning a
git.exe process for every single line of the todo script, the
expansion/collapsing of the SHA-1s *is* performance-hampering on
Windows), therefore let's fix this bug to make cross-validation with the
C version of that functionality possible.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:42 +02:00
Johannes Schindelin
b3692471f0 rebase -i: remove useless indentation
The commands used to be indented, and it is nice to look at, but when we
transform the SHA-1s, the indentation is removed. So let's do away with it.

For the moment, at least: when we will use the upcoming rebase--helper
to transform the SHA-1s, we *will* keep the indentation and can
reintroduce it. Yet, to be able to validate the rebase--helper against
the output of the current shell script version, we need to remove the
extra indentation.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:42 +02:00
Johannes Schindelin
d6291564e9 rebase -i: generate the script via rebase--helper
The first step of an interactive rebase is to generate the so-called "todo
script", to be stored in the state directory as "git-rebase-todo" and to
be edited by the user.

Originally, we adjusted the output of `git log <options>` using a simple
sed script. Over the course of the years, the code became more
complicated. We now use shell scripting to edit the output of `git log`
conditionally, depending whether to keep "empty" commits (i.e. commits
that do not change any files).

On platforms where shell scripting is not native, this can be a serious
drag. And it opens the door for incompatibilities between platforms when
it comes to shell scripting or to Unix-y commands.

Let's just re-implement the todo script generation in plain C, using the
revision machinery directly.

This is substantially faster, improving the speed relative to the
shell script version of the interactive rebase from 2x to 3x on Windows.

Note that the rearrange_squash() function in git-rebase--interactive
relied on the fact that we set the "format" variable to the config setting
rebase.instructionFormat. Relying on a side effect like this is no good,
hence we explicitly perform that assignment (possibly again) in
rearrange_squash().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:42 +02:00
Johannes Schindelin
97d286af36 rebase -i: use the rebase--helper builtin
Now that the sequencer learned to process a "normal" interactive rebase,
we use it. The original shell script is still used for "non-normal"
interactive rebases, i.e. when --root or --preserve-merges was passed.

Please note that the --root option (via the $squash_onto variable) needs
special handling only for the very first command, hence it is still okay
to use the helper upon continue/skip.

Also please note that the --no-ff setting is volatile, i.e. when the
interactive rebase is interrupted at any stage, there is no record of
it. Therefore, we have to pass it from the shell script to the
rebase--helper.

Note: the test t3404 had to be adjusted because the the error messages
produced by the sequencer comply with our current convention to start with
a lower-case letter.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:41 +02:00
Johannes Schindelin
bed484b660 Add a builtin helper for interactive rebases
Git's interactive rebase is still implemented as a shell script, despite
its complexity. This implies that it suffers from the portability point
of view, from lack of expressibility, and of course also from
performance. The latter issue is particularly serious on Windows, where
we pay a hefty price for relying so much on POSIX.

Unfortunately, being such a huge shell script also means that we missed
the train when it would have been relatively easy to port it to C, and
instead piled feature upon feature onto that poor script that originally
never intended to be more than a slightly pimped cherry-pick in a loop.

To open the road toward better performance (in addition to all the other
benefits of C over shell scripts), let's just start *somewhere*.

The approach taken here is to add a builtin helper that at first intends
to take care of the parts of the interactive rebase that are most
affected by the performance penalties mentioned above.

In particular, after we spent all those efforts on preparing the sequencer
to process rebase -i's git-rebase-todo scripts, we implement the `git
rebase -i --continue` functionality as a new builtin, git-rebase--helper.

Once that is in place, we can work gradually on tackling the rest of the
technical debt.

Note that the rebase--helper needs to learn about the transient
--ff/--no-ff options of git-rebase, as the corresponding flag is not
persisted to, and re-read from, the state directory.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:41 +02:00
Johannes Schindelin
72f15f0009 Support Vagrant: quick & easy Linux virtual machine setup
When developing Git for Windows, we always have to ensure that we do not
break any non-Windows platforms, e.g. by introducing Windows-specific code
into the platform-independent source code.

At other times, it is necessary to test whether a bug is Windows-specific
or not, in order to send the bug report to the correct place. Having
access to a Linux-based Git comes in really handy in such a situation.

Vagrant offers a painless way to install and use a defined Linux
development environment on Windows (and other Operating Systems). We offer
a Vagrantfile to that end for two reasons:

1) To allow Windows users to gain the full power of Linux' Git

2) To offer users an easy path to verify that the issue they are about
   to report is really a Windows-specific issue; otherwise they would
   need to report it to git@vger.kernel.org instead.

Using it is easy: Download and install https://www.virtualbox.org/, then
download and install https://www.vagrantup.com/, then direct your
command-line window to the Git source directory containing the Vagrantfile
and run the commands:

	vagrant up
	vagrant ssh

See https://github.com/git-for-windows/git/wiki/Vagrant for details.

As part of switching Git for Windows' development environment from msysGit
to the MSys2-based Git SDK, this Vagrantfile was copy-edited from msysGit:

	https://github.com/msysgit/msysgit/blob/0be8f2208/Vagrantfile

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:31 +02:00
Jeff Hostetler
c4ad680d66 string-list: use ALLOC_GROW macro when reallocing string_list
Use ALLOC_GROW() macro when reallocing a string_list array
rather than simply increasing it by 32.  This is a performance
optimization.

During status on a very large repo and there are many changes,
a significant percentage of the total run time is spent
reallocing the wt_status.changes array.

This change decreases the time in wt_status_collect_changes_worktree()
from 125 seconds to 45 seconds on my very large repository.

This produced a modest gain on my 1M file artificial repo, but
broke even on linux.git.

Test                                            HEAD^^            HEAD
---------------------------------------------------------------------------------------
0005.2: read-tree status br_ballast (1000001)   8.29(5.62+2.62)   8.22(5.57+2.63) -0.8%

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-26 16:58:30 +02:00
Johannes Schindelin
87fd11281a Merge pull request #978 from jeffhostetler/jeffhostetler/thread_verify_hdr
read-cache: run verify_hdr() in background thread
2017-04-26 16:58:29 +02:00
Johannes Schindelin
4669599004 Merge 'misc-vs-fixes-extra' into HEAD
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:28 +02:00
Johannes Schindelin
c14b80f64d Merge branch 'visual-studio'
This topic branch teaches the project generator to generate a Visual
Studio solution, ready to be opened in Visual Studio 2010 or later.

The idea, of course, is to let some automatic build job generate and
commit the project files with

	make MSVC=1 vcxproj

and then (force-)push to a special-purpose branch.

The major part of this branch thicket concerns itself not only with
generating the Visual Studio project files, but making sure that the
user can then run the test suite from a regular Git Bash (i.e. *not*
requiring a Git for Windows SDK), e.g. by running

	cd t
	prove --timer --jobs 15 ./t[0-9]*.sh

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:28 +02:00
Johannes Schindelin
ced90cc7fa Merge pull request #971 from jeffhostetler/jeffhostetler/add_preload_fscache
add: use preload-index and fscache for performance
2017-04-26 16:58:27 +02:00
Johannes Schindelin
954c134040 mingw: make readlink() independent of core.symlinks
Regardless whether we think we are able to create symbolic links, we
should always read them.

This fixes https://github.com/git-for-windows/git/issues/958

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:27 +02:00
Johannes Schindelin
7a3b55acc2 Merge pull request #955 from jeffhostetler/jeffhostetler/preload_index_perf
preload-index: avoid lstat for skip-worktree items
2017-04-26 16:58:27 +02:00
Johannes Schindelin
c32f799c73 Merge pull request #938 from virtuald/patch-1
git-cvsexportcommit.perl: Force crlf translation
2017-04-26 16:58:26 +02:00
Johannes Schindelin
433f3dfd3a Merge branch 'reset-stdin'
This topic branch adds the (experimental) --stdin/-z options to `git
reset`. Those patches are still under review in the upstream Git project,
but are already merged in their experimental form into Git for Windows'
`master` branch, in preparation for a MinGit-only release.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:26 +02:00
Johannes Schindelin
b92825bc65 Merge branch 'mingw-strftime'
This topic branch works around an out-of-memory bug when the user
specified a format via --date=format:<format> that strftime() does
not like.

Reported by Stefan Naewe.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:25 +02:00
Johannes Schindelin
77ec3d19c9 Unbreak interactive GPG prompt upon signing
With the recent update in efee955 (gpg-interface: check gpg signature
creation status, 2016-06-17), we ask GPG to send all status updates to
stderr, and then catch the stderr in an strbuf.

But GPG might fail, and send error messages to stderr. And we simply
do not show them to the user.

Even worse: this swallows any interactive prompt for a passphrase. And
detaches stderr from the tty so that the passphrase cannot be read.

So while the first problem could be fixed (by printing the captured
stderr upon error), the second problem cannot be easily fixed, and
presents a major regression.

So let's just revert commit efee9553a4.

This fixes https://github.com/git-for-windows/git/issues/871

Cc: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:25 +02:00
Johannes Schindelin
a895a25071 Merge pull request #866 from landstander668/add_platform
Add reporting of build platform
2017-04-26 16:58:24 +02:00
Johannes Schindelin
ca715bedd0 Merge branch 'unhidden-git'
It has been reported that core.hideDotFiles=false stopped working...
This topic branch fixes it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:24 +02:00
Johannes Schindelin
21453f7909 Merge branch 'status-no-lock-index'
This branch allows third-party tools to call `git status
--no-lock-index` to avoid lock contention with the interactive Git usage
of the actual human user.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:23 +02:00
Johannes Schindelin
a9f6746e02 Merge pull request #797 from glhez/master
`git bundle create <bundle>` leaks handle the revlist is empty.
2017-04-26 16:58:23 +02:00
Johannes Schindelin
8f83514356 Merge 'release-gc-repack' into HEAD 2017-04-26 16:58:22 +02:00
Johannes Schindelin
4b1a49d825 Merge branch 'spawn-with-spaces'
This change lets us spawn .bat scripts whose paths contain spaces.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:58:21 +02:00
Jeff Hostetler
7417d84e46 read-cache: speed up has_dir_name (part 2)
Teach has_dir_name() to see if the path of the new item
is greater than the last path in the index array before
attempting to search for it.

has_dir_name() is looking for file/directory collisions
in the index and has to consider each sub-directory
prefix in turn.  This can cause multiple binary searches
for each path.

During operations like checkout, merge_working_tree()
populates the new index in sorted order, so we expect
to be able to append in many cases.

This commit is part 2 of 2.  This commit handles the
additional possible short-cuts as we look at each
sub-directory prefix.

The net-net gains for add_index_entry_with_check() and
both had_dir_name() commits are best seen for very
large repos.

Here are results for an INFLATED version of linux.git
with 1M files.

$ GIT_PERF_REPO=/mnt/test/linux_inflated.git/ ./run upstream/base HEAD ./p0006-read-tree-checkout.sh
Test                                                            upstream/base      HEAD
0006.2: read-tree br_base br_ballast (1043893)                  3.79(3.63+0.15)    2.68(2.52+0.15) -29.3%
0006.3: switch between br_base br_ballast (1043893)             7.55(6.58+0.44)    6.03(4.60+0.43) -20.1%
0006.4: switch between br_ballast br_ballast_plus_1 (1043893)   10.84(9.26+0.59)   8.44(7.06+0.65) -22.1%
0006.5: switch between aliases (1043893)                        10.93(9.39+0.58)   10.24(7.04+0.63) -6.3%

Here are results for a synthetic repo with 4.2M files.

$ GIT_PERF_REPO=~/work/gfw/t/perf/repos/gen-many-files-10.4.3.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test                                                            HEAD~3               HEAD
0006.2: read-tree br_base br_ballast (4194305)                  29.96(19.26+10.50)   23.76(13.42+10.12) -20.7%
0006.3: switch between br_base br_ballast (4194305)             56.95(36.08+16.83)   45.54(25.94+15.68) -20.0%
0006.4: switch between br_ballast br_ballast_plus_1 (4194305)   90.94(51.50+31.52)   78.22(39.39+30.70) -14.0%
0006.5: switch between aliases (4194305)                        93.72(51.63+34.09)   77.94(39.00+30.88) -16.8%

Results for medium repos (like linux.git) are mixed and have
more variance (probably do to disk IO unrelated to this test.

$ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test                                                          HEAD~3             HEAD
0006.2: read-tree br_base br_ballast (57994)                  0.25(0.21+0.03)    0.20(0.17+0.02) -20.0%
0006.3: switch between br_base br_ballast (57994)             10.67(6.06+2.92)   10.51(5.94+2.91) -1.5%
0006.4: switch between br_ballast br_ballast_plus_1 (57994)   0.59(0.47+0.16)    0.52(0.40+0.13) -11.9%
0006.5: switch between aliases (57994)                        0.59(0.44+0.17)    0.51(0.38+0.14) -13.6%

$ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test                                                          HEAD~3             HEAD
0006.2: read-tree br_base br_ballast (57994)                  0.24(0.21+0.02)    0.21(0.18+0.02) -12.5%
0006.3: switch between br_base br_ballast (57994)             10.42(5.98+2.91)   10.66(5.86+3.09) +2.3%
0006.4: switch between br_ballast br_ballast_plus_1 (57994)   0.59(0.49+0.13)    0.53(0.37+0.16) -10.2%
0006.5: switch between aliases (57994)                        0.59(0.43+0.17)    0.50(0.37+0.14) -15.3%

Results for smaller repos (like git.git) are not significant.
$ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test                                                         HEAD~3            HEAD
0006.2: read-tree br_base br_ballast (3043)                  0.01(0.00+0.00)   0.01(0.00+0.00) +0.0%
0006.3: switch between br_base br_ballast (3043)             0.31(0.17+0.11)   0.29(0.19+0.08) -6.5%
0006.4: switch between br_ballast br_ballast_plus_1 (3043)   0.03(0.02+0.00)   0.03(0.02+0.00) +0.0%
0006.5: switch between aliases (3043)                        0.03(0.02+0.00)   0.03(0.02+0.00) +0.0%

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-26 16:58:20 +02:00