It is unlikely that we have an empty environment, ever, but *if* we do,
when `environ_size - 1` is passed to `bsearchenv()` it is misinterpreted
as a real large integer.
To make the code truly defensive, refuse to do anything at all if the
size is negative (which should not happen, of course).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As suggested privately to Brendan Forster by some unnamed person
(suggestion for the future: use the public mailing list, or even the
public GitHub issue tracker, that is a much better place to offer such
suggestions), we should make use of gcc's stack smashing protector that
helps detect stack buffer overruns early.
Rather than using -fstack-protector, we use -fstack-protector-strong
because it strikes a better balance between how much code is affected
and the performance impact.
In a local test (time git log --grep=is -p), best of 5 timings went from
23.009s to 22.997s (i.e. the performance impact was *well* lost in the
noise).
This fixes https://github.com/git-for-windows/git/issues/501
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows >= Vista, not having an application manifest with a
requestedExecutionLevel can cause several kinds of confusing behavior.
The first and more obvious behavior is "Installer Detection", where
Windows sometimes decides (by looking at things like the file name and
even sequences of bytes within the executable) that an executable is an
installer and should run elevated (causing the well-known popup dialog
to appear). In Git's context, subcommands such as "git patch-id" or "git
update-index" fall prey to this behavior.
The second and more confusing behavior is "File Virtualization". It
means that when files are written without having write permission, it
does not fail (as expected), but they are instead redirected to
somewhere else. When the files are read, the original contents are
returned, though, not the ones that were just written somewhere else.
Even more confusing, not all write accesses are redirected; Trying to
write to write-protected .exe files, for example, will fail instead of
redirecting.
In addition to being unwanted behavior, File Virtualization causes
dramatic slowdowns in Git (see for instance
http://code.google.com/p/msysgit/issues/detail?id=320).
There are two ways to prevent those two behaviors: Either you embed an
application manifest within all your executables, or you add an external
manifest (a file with the same name followed by .manifest) to all your
executables. Since Git's builtins are hardlinked (or copied), it is
simpler and more robust to embed a manifest.
A recent enough MSVC compiler should already embed a working internal
manifest, but for MinGW you have to do so by hand.
Very lightly tested on Wine, where like on Windows XP it should not make
any difference.
References:
- New UAC Technologies for Windows Vista
http://msdn.microsoft.com/en-us/library/bb756960.aspx
- Create and Embed an Application Manifest (UAC)
http://msdn.microsoft.com/en-us/library/bb756929.aspx
[js: simplified the embedding dramatically by reusing Git for Windows'
existing Windows resource file, removed the optional (and dubious)
processorArchitecture attribute of the manifest's assemblyIdentity
section.]
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
There is a really useful debugging technique developed by Sverre
Rabbelier that inserts "bash &&" somewhere in the test scripts, letting
the developer interact at given points with the current state.
Another debugging technique, used a lot by this here coder, is to run
certain executables via gdb by guarding a "gdb -args" call in
bin-wrappers/git.
Both techniques were disabled by 781f76b1(test-lib: redirect stdin of
tests).
Let's reinstate the ability to run an interactive shell by making the
redirection optional: setting the TEST_NO_REDIRECT environment variable
will skip the redirection.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The unsetenv code has no idea to update our environ_size, therefore
causing segmentation faults when environment variables are removed
without compat/mingw.c's knowing (MinGW's optimized lookup would try
to strcmp() against NULL in such a case).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
MSys2's strace facility is very useful for debugging... With this patch,
the bash will be executed through strace if the environment variable
GIT_STRACE_COMMANDS is set, which comes in real handy when investigating
issues in the test suite.
Also support passing a path to a log file via GIT_STRACE_COMMANDS to
force Git to call strace.exe with the `-o <path>` argument, i.e. to log
into a file rather than print the log directly.
That comes in handy when the output would otherwise misinterpreted by a
calling process as part of Git's output.
Note: the values "1", "yes" or "true" are *not* specifying paths, but
tell Git to let strace.exe log directly to the console.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This way the libraries get properly installed into the "site_perl"
directory and we just have to move them out of the "mingw" directory.
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
With this topic branch, the PERL5LIB variable is unset to avoid external
settings from interfering with Git's own Perl interpreter.
This branch also cleans up some of our Windows-only config setting code
(and this will need to be rearranged in the next merging rebase so that
the cleanup comes first, and fscache and longPaths support build on
top).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.
On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.
Signed-off-by: Karsten Blees <blees@dcon.de>
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).
Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.
Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).
Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).
Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).
Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.
Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.
While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.
Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199
Note that the test cannot rely on the presence of short names, as they
are not enabled by default except on the system drive.
[jes: adjusted test number to avoid conflicts, reinstated && chain,
adjusted test to work without short names]
Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.
Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.
Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.
The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.
Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).
Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.
Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).
Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).
Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).
Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.
Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.
While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.
Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199
[jes: adjusted test number to avoid conflicts]
Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
[jes: adusted test number to avoid conflicts, fixed non-portable use of
the 'export' statement]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.
Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Signed-off-by: Karsten Blees <blees@dcon.de>
Git for Windows ships with its own Perl interpreter, and insists on
using it, so it will most likely wreak havoc if PERL5LIB is set before
launching Git.
Let's just unset that environment variables when spawning processes.
To make this feature extensible (and overrideable), there is a new
config setting `core.unsetenvvars` that allows specifying a
comma-separated list of names to unset before spawning processes.
Reported by Gabriel Fuhrmann.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.
Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.
Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).
Signed-off-by: Karsten Blees <blees@dcon.de>
In the Git for Windows project, we have ample precendent for config
settings that apply to Windows, and to Windows only.
Let's formalize this concept by introducing a platform_core_config()
function that can be #define'd in a platform-specific manner.
This will allow us to contain platform-specific code better, as the
corresponding variables no longer need to be exported so that they can
be defined in environment.c and be set in config.c
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is the convention elsewhere (and prepares for the case where we may
need to pass callback data).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A recently introduced regression caused a segfault at clone time on
case-insensitive filesystems when filenames differing only in case are
present. This bug has already been fixed (repository: pre-initialize
hash algo pointer, 2018-01-18), but it's not the first time similar
problems have arisen. Therefore, introduce a test to catch this case and
protect against future regressions.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are various git subcommands (among them, clone) which don't set up
the repository (that is, they lack RUN_SETUP or RUN_SETUP_GENTLY) but
end up needing to have information about the hash algorithm in use.
Because the hash algorithm is part of struct repository and it's only
initialized in repository setup, we can end up dereferencing a NULL
pointer in some cases if we call one of these subcommands and look up
the empty blob or empty tree values.
A "git clone" of a project that has two paths that differ only in
case suffers from this if it is run on a case insensitive platform.
When the command attempts to check out one of these two paths after
checking out the other one, the checkout codepath needs to see if
the version that is already on the filesystem (which should not
happen if the FS were case sensitive) is dirty, and it needs to
exercise the hashing code at that point.
In the future, we can add a command line option for this or read it
from the configuration, but until we're ready to expose that
functionality to the user, simply initialize the repository
structure to use the current hash algorithm, SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Take a hint from commit ea68b0ce9f (hash-object: don't use mmap() for
small files, 2010-02-21) and use read() instead of mmap() for small
packed-refs files.
This also fixes https://github.com/git-for-windows/git/issues/1410
(where xmmap() returns NULL for zero length[1], for which munmap() later
fails).
Alternatively, we could simply check for NULL before munmap(), or
introduce xmunmap() that could be used together with xmmap(). However,
always setting snapshot->buf to a valid pointer, by relying on
xmalloc(0)'s fallback to 1-byte allocation, makes using snapshots
easier.
[1] Logic introduced in commit 9130ac1e19 (Better error messages for
corrupt databases, 2007-01-11)
This was cherry-picked from upstream's `pu` branch so that the fix is
included in Git for Windows v2.16.0.
Signed-off-by: Kim Gybels <kgybels@infogroep.be>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In git reset --hard, unpack-trees() is called with oneway_merge().
oneway_merge calls lstat for each files in a repository.
It is bottleneck of git reset --hard, especially in large repository.
This patch improves time by using fscache.
In chromium repository, time of git reset --hard is changed like below.
I took 3 times stats in the repository.
master:
TotalSeconds: 21.0337971
TotalSeconds: 20.0046612
TotalSeconds: 20.6501752
Avg: 20.5628778333333
this patch:
TotalSeconds: 4.8552376
TotalSeconds: 4.8722343
TotalSeconds: 4.9268245
Avg: 4.88476546666667
Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
Especially in huge code bases with fast-moving `master`, it can be
prohibitively expensive to calculate whether an upstream branch of
a local branch is ahead, behind or diverged.
This topic branch introduces a set of flags to avoid that computation
when we're not even interested in it to begin with.
This merge commit takes the feature early, therefore it is marked
experimental.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When I do git fetch, git call file stats under .git/objects for each
refs. This takes time when there are many refs.
By enabling fscache, git takes file stats by directory traversing and that
improved the speed of fetch-pack for repository having large number of
refs.
In my windows workstation, this improves the time of `git fetch` for
chromium repository like below. I took stats 3 times.
* With this patch
TotalSeconds: 9.9825165
TotalSeconds: 9.1862075
TotalSeconds: 10.1956256
Avg: 9.78811653333333
* Without this patch
TotalSeconds: 15.8406702
TotalSeconds: 15.6248053
TotalSeconds: 15.2085938
Avg: 15.5580231
Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
Teach long (normal) status format to respect the --no-ahead-behind
parameter and skip the possibly expensive ahead/behind computation
between the branch and the upstream.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach "git status --short --branch" to respect "--no-ahead-behind"
parameter to skip computing ahead/behind counts for the branch and
its upstream and just report '[different]'.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach "git status" and "git commit" to accept "--no-ahead-behind"
and "--ahead-behind" arguments to request quick or full ahead/behind
reporting.
When "--no-ahead-behind" is given, the existing porcelain V2 line
"branch.ab +x -y" is replaced with a new "branch.ab +? -?" line.
This indicates that the branch and its upstream are or are not equal
without the expense of computing the full ahead/behind values.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Once upon a time, git-clone would refuse to write into a
directory that it did not itself create. The cleanup
routines for a failed clone could therefore just remove the
git and worktree dirs completely.
In 55892d2398 (Allow cloning to an existing empty directory,
2009-01-11), we learned to write into an existing directory.
Which means that doing:
mkdir foo
git clone will-fail foo
ends up deleting foo. This isn't a huge catastrophe, since
by definition foo must be empty. But it's somewhat
confusing; we should leave the filesystem as we found it.
Because we know that the only directory we'll write into is
an empty one, we can handle this case by just passing the
KEEP_TOPLEVEL flag to our recursive delete (if we could
write into populated directories, we'd have to keep track of
what we wrote and what we did not).
Note that we need to handle the work-tree and git-dir
separately, though, as only one might exist (and the new
tests in t5600 cover all cases).
Reported-by: Stephan Janssen <sjanssen@you-get.com>
Extend stat_tracking_info() to return +1 when branches are not equal and to
take a new "enum ahead_behind_flags" argument to allow skipping the (possibly
expensive) ahead/behind computation.
This will be used in the next commit to allow "git status" to avoid full
ahead/behind calculations for performance reasons.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Two parts of git-clone's setup logic check whether a
directory exists, and they both call stat directly the same
scratch "struct stat" buffer. Let's pull that into a helper,
which has a few advantages:
- it makes the purpose of the stat call more obvious
- it makes it clear that we don't care about the
information in "buf" remaining valid
- if we later decide to make the check more robust (e.g.,
complaining about non-directories), we can do it in one
place
Note that we could just use file_exists() for this, which
has identical code. But we specifically care about
directories, so this future-proofs us against that function
later getting more picky about actual files.
This is an old script which could use some updating before
we add to it:
- use the standard line-breaking:
test_expect_success 'title' '
body
'
- run all code inside test_expect blocks to catch
unexpected failures in setup steps
- use "test_commit -C" instead of manually entering
sub-repo
- use test_when_finished for cleanup steps
- test_path_is_* as appropriate