If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.
On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.
Signed-off-by: Karsten Blees <blees@dcon.de>
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.
Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.
Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.
The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.
Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.
Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Signed-off-by: Karsten Blees <blees@dcon.de>
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.
Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.
Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).
Signed-off-by: Karsten Blees <blees@dcon.de>
With msysGit the .git directory is supposed to be hidden, unless it is
a bare git repository. Test this.
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
At least for cross-platform projects, it makes sense to hide the
files starting with a dot, as this is the behavior on Unix/MacOSX.
However, at least Eclipse has problems interpreting the hidden flag
correctly, so the default is to hide only the .git/ directory.
The config setting core.hideDotFiles therefore supports not only
'true' and 'false', but also 'dotGitOnly'.
[jes: clarified the commit message, made git init respect the setting
by marking the .git/ directory only after reading the config, and added
documentation, and rebased on top of current junio/next]
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `git_terminal_prompt()` function expects the terminal window to be
attached to a Win32 Console. However, this is not the case with terminal
windows other than `cmd.exe`'s, e.g. with MSys2's own `mintty`.
Non-cmd terminals such as `mintty` still have to have a Win32 Console
to be proper console programs, but have to hide the Win32 Console to
be able to provide more flexibility (such as being resizeable not only
vertically but also horizontally). By writing to that Win32 Console,
`git_terminal_prompt()` manages only to send the prompt to nowhere and
to wait for input from a Console to which the user has no access.
This commit introduces a function specifically to support `mintty` -- or
other terminals that are compatible with MSys2's `/dev/tty` emulation. We
use the `TERM` environment variable as an indicator for that: if the value
starts with "xterm" (such as `mintty`'s "xterm_256color"), we prefer to
let `xterm_prompt()` handle the user interaction.
To handle the case when standard input/output are redirected – as is the
case when pushing via HTTPS: `git-remote-https`' standard input and
output are pipes from/to the main Git executable – we make use of the
`MSYS_TTY_HANDLES` environment variable that was introduced to
fix another bug in MSys2-based Git: this environment variable contains
the Win32 `HANDLE`s of the standard input, output and error as originally
passed from MSys2 to the Git executable, enclosed within space
characters, skipping handles that do not refer to the terminal window
(e.g. when they were redirected). We will only use those handles when
that environment variable lists all three handles because then we can be
100% certain that we are running inside a terminal window, and that we
know exactly which Win32 handles to use to communicate with it.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: nalla <nalla@hamal.uberspace.de>
The environment is modified in most surprising circumstances, and not
all of them are under Git's control. For example, calling
curl_global_init() on Windows will ensure that the CHARSET variable is
set, adding one if necessary.
While the previous commit worked around crashes triggered by such
outside changes of the environment by relaxing the requirement that the
environment be terminated by a NULL pointer, the other assumption made
by `mingw_getenv()` and `mingw_putenv()` is that the environment is
sorted, for efficient lookup via binary search.
Let's make real sure that our environment is intact before querying or
modifying it, and reinitialize our idea of the environment if necessary.
With this commit, before working on the environment we look briefly for
indicators that the environment was modified outside of our control, and
to ensure that it is terminated with a NULL pointer and sorted again in
that case.
Note: the indicators are maybe not sufficient. For example, when a
variable is removed, it will not be noticed. It might also be a problem
if outside changes to the environment result in a modified `environ`
pointer: it is unclear whether such a modification could result in a
problem when `mingw_putenv()` needs to `realloc()` the environment
buffer.
For the moment, however, the current fix works well enough, so let's
only face the potential problems when (and if!) they occur.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Outside of our Windows-specific code, the end of the environment can be
marked also by a pointer to a NUL character, not only by a NULL pointer
as our code assumed so far.
That led to a buffer overrun in `make_environment_block()` when running
`git-remote-https` in `mintty` (because `curl_global_init()` added the
`CHARSET` environment variable *outside* of `mingw_putenv()`, ending the
environment in a pointer to an empty string).
Side note for future debugging on Windows: when running programs in
`mintty`, the standard input/output/error is not connected to a Win32
Console, but instead is pipe()d. That means that even stderr may not be
written completely before a crash, but has to be fflush()ed explicitly.
For example, when debugging crashes, the developer should insert an
`fflush(stderr);` at the end of the `error()` function defined in
usage.c.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This was originally 'pull request #330 from ethomson/poll_inftim' in
msysgit/git.
poll: honor the timeout on Win32
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Ensure that when passing a pipe, the gnulib poll replacement will not
return 0 before the timeout has passed.
Not obeying the timeout (and merely returning 0) causes pathological
behavior when preparing a packfile for a repository and taking a
long time to do so. If poll were to return 0 immediately, this would
cause keep-alives to get sent as quickly as possible until the packfile
was created. Such deviance from the standard would cause megabytes (or
more) of keep-alive packets to be sent.
GetTickCount is used as it is efficient, stable and monotonically
increasing. (Neither GetSystemTime nor QueryPerformanceCounter have
all three of these properties.)
This fixes the MSYS1-based build which otherwise would have the variables
but not the build targets.
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Embed the manifest in the Git wrapper, too
This is needed for builtins such as 'patch-id' to avoid triggering
Windows' User Access Control.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This reduces the disk footprint of a full Git for Windows setup
dramatically because on Windows, one cannot assume that hard links are
supported.
The net savings are calculated easily: the 32-bit `git.exe` file weighs
in with 7662 kB while the `git-wrapper.exe` file (modified to serve as a
drop-in replacement for builtins) weighs a scant 21 kB. At this point,
there are 109 builtins which results in a total of 813 MB disk space
being freed up by this commit.
Yes, that is really more than half a gigabyte.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git started out as a bunch of separate commands, in the true Unix spirit.
Over time, more and more functionality was shared between the different
Git commands, though, so it made sense to introduce the notion of
"builtins": programs that are actually integrated into the main Git
executable.
These builtins can be called in two ways: either by specifying a
subcommand as the first command-line argument, or -- for backwards
compatibility -- by calling the Git executable hardlinked to a filename
of the form "git-<subcommand>". Example: the "log" command can be called
via "git log <parameters>" or via "git-log <parameters>". The latter
form is actually deprecated and only supported for scripts; calling
"git-log" interactively will not even work by default because the
libexec/git-core/ directory is not in the PATH.
All of this is well and groovy as long as hard links are supported.
Sadly, this is not the case in general on Windows. So it actually hurts
quite a bit when you have to fall back to copying all of git.exe's
currently 7.5MB 109 times, just for backwards compatibility.
The simple solution would be to install really trivial shell script
wrappers in place of the builtins:
for builtin in $BUILTINS
do
rm git-$builtin.exe
printf '#!/bin/sh\nexec git %s "$@"\n' $builtin > git-builtin
chmod a+x git-builtin
done
This method would work -- even on Windows because Git for Windows ships a
full-fledged Bash. However, the Windows Bash comes at a price: it needs to
spin up a full-fledged POSIX emulation layer everytime it starts.
Therefore, the shell script solution would incur a significant performance
penalty.
The best solution the Git for Windows team could come up with is to extend
the Git wrapper -- that is needed to call Git from cmd.exe anyway, and
that weighs in with a scant 19KB -- to also serve as a drop-in replacement
for the builtins so that the following workaround is satisfactory:
for builtin in $BUILTINS
do
cp git-wrapper.exe git-$builtin.exe
done
This commit allows for this, by extending the module file parsing to
turn builtin command names like `git-log.exe ...` into calls to the main
Git executable: `git.exe log ...`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This prepares the wrapper for modifications to serve as a drop-in
replacement for the builtins.
This commit's diff is best viewed with the `-w` flag.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It is unlikely that we have an empty environment, ever, but *if* we do,
when `environ_size - 1` is passed to `bsearchenv()` it is misinterpreted
as a real large integer.
To make the code truly defensive, refuse to do anything at all if the
size is negative (which should not happen, of course).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This change simply makes the Git wrapper's source code conform to Git's
coding guide lines.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Test clean-up.
* jc/diff-test-updates:
test_ln_s_add: refresh stat info of fake symbolic links
t4008: modernise style
t/diff-lib: check exact object names in compare_diff_raw
tests: do not borrow from COPYING and README from the real source
t4010: correct expected object names
t9300: correct expected object names
t4008: correct stale comments
A corrupt input to "git diff -M" can cause us to segfault.
* jk/diffcore-rename-duplicate:
diffcore-rename: avoid processing duplicate destinations
diffcore-rename: split locate_rename_dst into two functions
The borrowed code in kwset API did not follow our usual convention
to use "unsigned char" to store values that range from 0-255.
* bw/kwset-use-unsigned:
kwset: use unsigned char to store values with high-bit set
Description given by "grep -h" for its --exclude-standard option
was phrased poorly.
* nd/grep-exclude-standard-help-fix:
grep: correct help string for --exclude-standard
"git remote add" mentioned "--tags" and "--no-tags" and was not
clear that fetch from the remote in the future will use the default
behaviour when neither is given to override it.
* mg/doc-remote-tags-or-not:
git-remote.txt: describe behavior without --tags and --no-tags
"git diff --shortstat --dirstat=changes" showed a dirstat based on
lines that was never asked by the end user in addition to the
dirstat that the user asked for.
* mk/diff-shortstat-dirstat-fix:
diff --shortstat --dirstat: remove duplicate output
The interaction between "git submodule update" and the
submodule.*.update configuration was not clearly documented.
* ms/submodule-update-config-doc:
submodule: improve documentation of update subcommand
"git apply" was not very careful about reading from, removing,
updating and creating paths outside the working tree (under
--index/--cached) or the current directory (when used as a
replacement for GNU patch).
* jc/apply-beyond-symlink:
apply: do not touch a file beyond a symbolic link
apply: do not read from beyond a symbolic link
apply: do not read from the filesystem under --index
apply: reject input that touches outside the working area
"git daemon" looked up the hostname even when "%CH" and "%IP"
interpolations are not requested, which was unnecessary.
* rs/daemon-interpolate:
daemon: use callback to build interpolated path
daemon: look up client-supplied hostname lazily
The "interpolated-path" option of "git daemon" inserted any string
client declared on the "host=" capability request without checking.
Sanitize and limit %H and %CH to a saner and a valid DNS name.
* jk/daemon-interpolate:
daemon: sanitize incoming virtual hostname
t5570: test git-daemon's --interpolated-path option
git_connect: let user override virtual-host we send to daemon
On Windows, Git is faced by the challenge that it has to set up certain
environment variables before running Git under special circumstances
such as when Git is called directly from cmd.exe (i.e. outside any
Bash environment).
This source code was taken from msysGit's commit 74a198d:
https://github.com/msysgit/msysgit/blob/74a198d/src/git-wrapper/git-wrapper.c
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
If we use 'eval exec $opt $cmdp $args' to execute git command,
tcl engine will convert the output of the git comand with the rule
system default code page to unicode.
But cp936 -> unicode conversion implicitly done by exec is not reversible.
So we have to use git_read instead.
Bug report and an original reproducer by Cloud Chou:
https://github.com/msysgit/git/issues/302
Karsten Blees writes this code patch.
Cloud Chou find the reason of the bug.
Thanks-to: dscho
Thanks-to: patthoyts
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Cloud Chou <515312382@qq.com>
Signed-off-by: Cloud Chou <515312382@qq.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
MSys2 has a slightly different notion of what constitutes a tty than
the Microsoft C runtime. The former knows whether stdin/stdout/stderr
was redirected or not, while the latter looks for a Win32 Console.
In particular when we want to know whether to spawn a pager or not, we
would rather want to know what MSys2 thinks.
We are about to introduce a change to the msys2-runtime that sets an
environment variable MSYS_TTY_HANDLES to a list of Win32 handles that
correspond to stdin/stdout/stderr, respectively, *but skips* handles that
MSys2 does not think are terminals.
This commit handles that input to augment the isatty() function to return
1 also when MSYS_TTY_HANDLES contains the corresponding handle.
The only time when Git needs to know whether a Console is attached or not
is when winansi.c is asked to Do Its Thing, therefore we refrain from
overriding isatty there.
Note: this was an issue with MSys1-based Git for Windows, too, hidden by
the fact that Git for Windows used `cmd.exe` as a terminal -- which is
backed by a real Win32 Console. Had MSys1 used, say, rxvt as its default
terminal, the symptom would have been that "git log" does not spawn a
pager by default but instead outputs the entire history (without color
coding, too). In MSys2, the default terminal is mintty, therefore we
finally could not avoid to address the issue.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>