Commit Graph

92306 Commits

Author SHA1 Message Date
Johannes Schindelin
5840c4c8ff Merge pull request #1426 from atetubou/fetch_pack
fetch-pack.c: enable fscache for stats under .git/objects
2018-11-23 10:00:06 +01:00
Johannes Schindelin
215cb74cf9 Merge pull request #1344 from jeffhostetler/perf_add_excludes_with_fscache
dir.c: make add_excludes aware of fscache during status
2018-11-23 10:00:06 +01:00
Johannes Schindelin
61ee25badf Merge pull request #971 from jeffhostetler/jeffhostetler/add_preload_fscache
add: use preload-index and fscache for performance
2018-11-23 10:00:05 +01:00
Johannes Schindelin
e69fee0719 Merge branch 'core-longpaths-everywhere'
Git for Windows supports the core.longPaths config setting to allow
writing/reading long paths via the \\?\ trick for a long time now.

However, for that support to work, it is absolutely necessary that
git_default_config() is given a chance to parse the config. Otherwise
Git will be non the wiser.

So let's make sure that as many commands that previously failed to
parse the core.* settings now do that, implicitly enabling long path
support in a lot more places.

Note: this is not a perfect solution, and it cannot be, as there is
a chicken-and-egg problem in reading the config itself...

This fixes https://github.com/git-for-windows/git/issues/1218

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 10:00:05 +01:00
Johannes Schindelin
3a8e144cc9 Merge pull request #994 from jeffhostetler/jeffhostetler/fscache_nfd
fscache: add not-found directory cache to fscache
2018-11-23 10:00:05 +01:00
Johannes Schindelin
5123a6b03d Merge branch 'spawn-with-spaces'
This change lets us spawn .bat scripts whose paths contain spaces.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 10:00:05 +01:00
Johannes Schindelin
7bce8162ee Merge branch 'clean-long-paths'
This addresses https://github.com/git-for-windows/git/issues/521

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 10:00:04 +01:00
Johannes Schindelin
5d4006f31c Merge pull request #305 from dscho/msysgit_issues_182
Allow `add -p` and `add -i` with a large number of files
2018-11-23 10:00:04 +01:00
Johannes Schindelin
6880384030 Merge branch 'program-data-config'
This branch introduces support for reading the "Windows-wide" Git
configuration from `%PROGRAMDATA%\Git\config`. As these settings are
intended to be shared between *all* Git-related software, that config
file takes an even lower precedence than `$(prefix)/etc/gitconfig`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 10:00:04 +01:00
Johannes Schindelin
d8895f5dde Merge pull request #1897 from piscisaureus/symlink-attr
Specify symlink type in .gitattributes
2018-11-23 10:00:03 +01:00
Johannes Schindelin
cf4024624f Merge branch 'kblees/kb/symlinks' 2018-11-23 10:00:03 +01:00
Johannes Schindelin
a7e1673d41 Merge branch 'msys2' 2018-11-23 10:00:03 +01:00
Johannes Schindelin
2c66d73509 Merge branch 'long-paths' 2018-11-23 10:00:03 +01:00
Johannes Schindelin
ee051e5d7e Merge branch 'fscache' 2018-11-23 10:00:02 +01:00
Jeff Hostetler
aab93e7395 Merge branch 'visual-studio'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 10:00:02 +01:00
Jeff Hostetler
712aea5541 Merge branch 'msvc'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 10:00:02 +01:00
Johannes Schindelin
f6302fdf96 Merge branch 'gitk-and-git-gui-patches'
These are Git for Windows' Git GUI and gitk patches. We will have to
decide at some point what to do about them, but that's a little lower
priority (as Git GUI seems to be unmaintained for the time being, and
the gitk maintainer keeps a very low profile on the Git mailing list,
too).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 10:00:01 +01:00
Johannes Schindelin
c605dfb7f8 Merge branch 'ready-for-upstream'
This is the branch thicket of patches in Git for Windows that are
considered ready for upstream. To keep them in a ready-to-submit shape,
they are kept as close to the beginning of the branch thicket as
possible.
2018-11-23 10:00:01 +01:00
Takuto Ikuta
82ea6bda3e fetch-pack.c: enable fscache for stats under .git/objects
When I do git fetch, git call file stats under .git/objects for each
refs. This takes time when there are many refs.

By enabling fscache, git takes file stats by directory traversing and that
improved the speed of fetch-pack for repository having large number of
refs.

In my windows workstation, this improves the time of `git fetch` for
chromium repository like below. I took stats 3 times.

* With this patch
TotalSeconds: 9.9825165
TotalSeconds: 9.1862075
TotalSeconds: 10.1956256
Avg: 9.78811653333333

* Without this patch
TotalSeconds: 15.8406702
TotalSeconds: 15.6248053
TotalSeconds: 15.2085938
Avg: 15.5580231

Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
2018-11-23 09:59:48 +01:00
Jeff Hostetler
db4db2192c dir.c: regression fix for add_excludes with fscache
Fix regression described in:
https://github.com/git-for-windows/git/issues/1392

which was introduced in:
b2353379bb

Problem Symptoms
================
When the user has a .gitignore file that is a symlink, the fscache
optimization introduced above caused the stat-data from the symlink,
rather that of the target file, to be returned.  Later when the ignore
file was read, the buffer length did not match the stat.st_size field
and we called die("cannot use <path> as an exclude file")

Optimization Rationale
======================
The above optimization calls lstat() before open() primarily to ask
fscache if the file exists.  It gets the current stat-data as a side
effect essentially for free (since we already have it in memory).
If the file does not exist, it does not need to call open().  And
since very few directories have .gitignore files, we can greatly
reduce time spent in the filesystem.

Discussion of Fix
=================
The above optimization calls lstat() rather than stat() because the
fscache only intercepts lstat() calls.  Calls to stat() stay directed
to the mingw_stat() completly bypassing fscache.  Furthermore, calls
to mingw_stat() always call {open, fstat, close} so that symlinks are
properly dereferenced, which adds *additional* open/close calls on top
of what the original code in dir.c is doing.

Since the problem only manifests for symlinks, we add code to overwrite
the stat-data when the path is a symlink.  This preserves the effect of
the performance gains provided by the fscache in the normal case.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2018-11-23 09:59:48 +01:00
Jeff Hostetler
73260d0333 fscache: make fscache_enabled() public
Make fscache_enabled() function public rather than static.
Remove unneeded fscache_is_enabled() function.
Change is_fscache_enabled() macro to call fscache_enabled().

is_fscache_enabled() now takes a pathname so that the answer
is more precise and mean "is fscache enabled for this pathname",
since fscache only stores repo-relative paths and not absolute
paths, we can avoid attempting lookups for absolute paths.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2018-11-23 09:59:48 +01:00
Jeff Hostetler
bc01f63c8b dir.c: make add_excludes aware of fscache during status
Teach read_directory_recursive() and add_excludes() to
be aware of optional fscache and avoid trying to open()
and fstat() non-existant ".gitignore" files in every
directory in the worktree.

The current code in add_excludes() calls open() and then
fstat() for a ".gitignore" file in each directory present
in the worktree.  Change that when fscache is enabled to
call lstat() first and if present, call open().

This seems backwards because both lstat needs to do more
work than fstat.  But when fscache is enabled, fscache will
already know if the .gitignore file exists and can completely
avoid the IO calls.  This works because of the lstat diversion
to mingw_lstat when fscache is enabled.

This reduced status times on a 350K file enlistment of the
Windows repo on a NVMe SSD by 0.25 seconds.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2018-11-23 09:59:48 +01:00
Jeff Hostetler
5199d7562a add: use preload-index and fscache for performance
Teach "add" to use preload-index and fscache features
to improve performance on very large repositories.

During an "add", a call is made to run_diff_files()
which calls check_remove() for each index-entry.  This
calls lstat().  On Windows, the fscache code intercepts
the lstat() calls and builds a private cache using the
FindFirst/FindNext routines, which are much faster.

Somewhat independent of this, is the preload-index code
which distributes some of the start-up costs across
multiple threads.

We need to keep the call to read_cache() before parsing the
pathspecs (and hence cannot use the pathspecs to limit any preload)
because parse_pathspec() is using the index to determine whether a
pathspec is, in fact, in a submodule. If we would not read the index
first, parse_pathspec() would not error out on a path that is inside
a submodule, and t7400-submodule-basic.sh would fail with

	not ok 47 - do not add files from a submodule

We still want the nice preload performance boost, though, so we simply
call read_cache_preload(&pathspecs) after parsing the pathspecs.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:47 +01:00
Johannes Schindelin
f08c1dee7e mingw: ensure that core.longPaths is handled *always*
A ton of Git commands simply do not read (or at least parse) the core.*
settings. This is not good, as Git for Windows relies on the
core.longPaths setting to be read quite early on.

So let's just make sure that all commands read the config and give
platform_core_config() a chance.

This patch teaches tons of Git commands to respect the config setting
`core.longPaths = true`, including `pack-refs`, thereby fixing
https://github.com/git-for-windows/git/issues/1218

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:47 +01:00
Johannes Schindelin
1f2fdf1b76 fscache: add a test for the dir-not-found optimization
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:47 +01:00
Jeff Hostetler
6f74bf8385 fscache: remember not-found directories
Teach FSCACHE to remember "not found" directories.

This is a performance optimization.

FSCACHE is a performance optimization available for Windows.  It
intercepts Posix-style lstat() calls into an in-memory directory
using FindFirst/FindNext.  It improves performance on Windows by
catching the first lstat() call in a directory, using FindFirst/
FindNext to read the list of files (and attribute data) for the
entire directory into the cache, and short-cut subsequent lstat()
calls in the same directory.  This gives a major performance
boost on Windows.

However, it does not remember "not found" directories.  When STATUS
runs and there are missing directories, the lstat() interception
fails to find the parent directory and simply return ENOENT for the
file -- it does not remember that the FindFirst on the directory
failed. Thus subsequent lstat() calls in the same directory, each
re-attempt the FindFirst.  This completely defeats any performance
gains.

This can be seen by doing a sparse-checkout on a large repo and
then doing a read-tree to reset the skip-worktree bits and then
running status.

This change reduced status times for my very large repo by 60%.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:47 +01:00
Jeff Hostetler
ed87b2e3fb fscache: add key for GIT_TRACE_FSCACHE
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:47 +01:00
Johannes Schindelin
780c1fc25b mingw: support spawning programs containing spaces in their names
The CreateProcessW() function does not really support spaces in its
first argument, lpApplicationName. But it supports passing NULL as
lpApplicationName, which makes it figure out the application from the
(possibly quoted) first argument of lpCommandLine.

Let's use that trick (if we are certain that the first argument matches
the executable's path) to support launching programs whose path contains
spaces.

This fixes https://github.com/git-for-windows/git/issue/692

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:46 +01:00
Bert Belder
ce71faa2e6 Win32: symlink: add test for symlink attribute
Signed-off-by: Bert Belder <bertbelder@gmail.com>
2018-11-23 09:59:45 +01:00
Johannes Schindelin
1cca6d7282 mingw: try to create symlinks without elevated permissions
With Windows 10 Build 14972 in Developer Mode, a new flag is supported
by CreateSymbolicLink() to create symbolic links even when running
outside of an elevated session (which was previously required).

This new flag is called SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE and
has the numeric value 0x02.

Previous Windows 10 versions will not understand that flag and return an
ERROR_INVALID_PARAMETER, therefore we have to be careful to try passing
that flag only when the build number indicates that it is supported.

For more information about the new flag, see this blog post:
https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/

This patch is loosely based on the patch submitted by Samuel D. Leslie
as https://github.com/git-for-windows/git/pull/1184.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:45 +01:00
Bert Belder
0edf72836c Win32: symlink: specify symlink type in .gitattributes
On Windows, symbolic links have a type: a "file symlink" must point at
a file, and a "directory symlink" must point at a directory. If the
type of symlink does not match its target, it doesn't work.

Git does not record the type of symlink in the index or in a tree. On
checkout it'll guess the type, which only works if the target exists
at the time the symlink is created. This may often not be the case,
for example when the link points at a directory inside a submodule.

By specifying `symlink=file` or `symlink=dir` the user can specify what
type of symlink Git should create, so Git doesn't have to rely on
unreliable heuristics.

Signed-off-by: Bert Belder <bertbelder@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:45 +01:00
Karsten Blees
2e05a94635 Win32: symlink: add support for symlinks to directories
Symlinks on Windows have a flag that indicates whether the target is a file
or a directory. Symlinks of wrong type simply don't work. This even affects
core Win32 APIs (e.g. DeleteFile() refuses to delete directory symlinks).

However, CreateFile() with FILE_FLAG_BACKUP_SEMANTICS doesn't seem to care.
Check the target type by first creating a tentative file symlink, opening
it, and checking the type of the resulting handle. If it is a directory,
recreate the symlink with the directory flag set.

It is possible to create symlinks before the target exists (or in case of
symlinks to symlinks: before the target type is known). If this happens,
create a tentative file symlink and postpone the directory decision: keep
a list of phantom symlinks to be processed whenever a new directory is
created in mingw_mkdir().

Limitations: This algorithm may fail if a link target changes from file to
directory or vice versa, or if the target directory is created in another
process.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:45 +01:00
Bert Belder
4a14079dc9 Win32: symlink: move phantom symlink creation to a separate function
Signed-off-by: Bert Belder <bertbelder@gmail.com>
2018-11-23 09:59:45 +01:00
Karsten Blees
7ea25e3a42 Win32: implement basic symlink() functionality (file symlinks only)
Implement symlink() that always creates file symlinks. Fails with ENOSYS
if symlinks are disabled or unsupported.

Note: CreateSymbolicLinkW() was introduced with symlink support in Windows
Vista. For compatibility with Windows XP, we need to load it dynamically
and fail gracefully if it isnt's available.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:44 +01:00
Karsten Blees
b5703fd81b Win32: implement readlink()
Implement readlink() by reading NTFS reparse points. Works for symlinks
and directory junctions. If symlinks are disabled, fail with ENOSYS.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:44 +01:00
Karsten Blees
53b198530c Win32: mingw_chdir: change to symlink-resolved directory
If symlinks are enabled, resolve all symlinks when changing directories,
as required by POSIX.

Note: Git's real_path() function bases its link resolution algorithm on
this property of chdir(). Unfortunately, the current directory on Windows
is limited to only MAX_PATH (260) characters. Therefore using symlinks and
long paths in combination may be problematic.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:44 +01:00
Karsten Blees
f0ff5f6c52 Win32: mingw_rename: support renaming symlinks
MSVCRT's _wrename() cannot rename symlinks over existing files: it returns
success without doing anything. Newer MSVCR*.dll versions probably do not
have this problem: according to CRT sources, they just call MoveFileEx()
with the MOVEFILE_COPY_ALLOWED flag.

Get rid of _wrename() and call MoveFileEx() with proper error handling.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:44 +01:00
Karsten Blees
c712b8ea2a Win32: mingw_unlink: support symlinks to directories
_wunlink() / DeleteFileW() refuses to delete symlinks to directories. If
_wunlink() fails with ERROR_ACCESS_DENIED, try _wrmdir() as well.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:44 +01:00
Karsten Blees
dbdad95303 Win32: add symlink-specific error codes
Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:44 +01:00
Karsten Blees
14e887582d Win32: change default of 'core.symlinks' to false
Symlinks on Windows don't work the same way as on Unix systems. E.g. there
are different types of symlinks for directories and files, creating
symlinks requires administrative privileges etc.

By default, disable symlink support on Windows. I.e. users explicitly have
to enable it with 'git config [--system|--global] core.symlinks true'.

The test suite ignores system / global config files. Allow testing *with*
symlink support by checking if native symlinks are enabled in MSys2 (via
'MSYS=winsymlinks:nativestrict').

Reminder: This would need to be changed if / when we find a way to run the
test suite in a non-MSys-based shell (e.g. dash).

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:44 +01:00
Karsten Blees
707764b223 Win32: factor out retry logic
The retry pattern is duplicated in three places. It also seems to be too
hard to use: mingw_unlink() and mingw_rmdir() duplicate the code to retry,
and both of them do so incompletely. They also do not restore errno if the
user answers 'no'.

Introduce a retry_ask_yes_no() helper function that handles retry with
small delay, asking the user, and restoring errno.

mingw_unlink: include _wchmod in the retry loop (which may fail if the
file is locked exclusively).

mingw_rmdir: include special error handling in the retry loop.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:44 +01:00
Karsten Blees
19ec86c5c1 Win32: lstat(): return adequate stat.st_size for symlinks
Git typically doesn't trust the stat.st_size member of symlinks (e.g. see
strbuf_readlink()). However, some functions take shortcuts if st_size is 0
(e.g. diff_populate_filespec()).

In mingw_lstat() and fscache_lstat(), make sure to return an adequate size.

The extra overhead of opening and reading the reparse point to calculate
the exact size is not necessary, as git doesn't rely on the value anyway.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:44 +01:00
Karsten Blees
f8bce04bfe Win32: teach fscache and dirent about symlinks
Move S_IFLNK detection to file_attr_to_st_mode() and reuse it in fscache.

Implement DT_LNK detection in dirent.c and the fscache readdir version.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:44 +01:00
Johannes Schindelin
99580bd445 Merge branch 'maybe-drop'
These patches are probably no longer necessary. Make sure before
dropping them, though.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:43 +01:00
Karsten Blees
97ce70b56b Win32: let mingw_lstat() error early upon problems with reparse points
When obtaining lstat information for reparse points, we need to call
FindFirstFile() in addition to GetFileInformationEx() to obtain the type
of the reparse point (symlink, mount point etc.). However, currently there
is no error handling whatsoever if FindFirstFile() fails.

Call FindFirstFile() before modifying the stat *buf output parameter and
error out if the call fails.

Note: The FindFirstFile() return value includes all the data that we get
from GetFileAttributesEx(), so we could replace GetFileAttributesEx() with
FindFirstFile(). We don't do that because GetFileAttributesEx() is about
twice as fast for single files. I.e. we only pay the extra cost of calling
FindFirstFile() in the rare case that we encounter a reparse point.

Note: The indentation of the remaining reparse point code will be fixed in
the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:43 +01:00
Karsten Blees
4ca87f2282 Win32: remove separate do_lstat() function
With the new mingw_stat() implementation, do_lstat() is only called from
mingw_lstat() (with follow == 0). Remove the extra function and the old
mingw_stat()-specific (follow == 1) logic.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:43 +01:00
Karsten Blees
17c17319b5 Win32: implement stat() with symlink support
With respect to symlinks, the current stat() implementation is almost the
same as lstat(): except for the file type (st_mode & S_IFMT), it returns
information about the link rather than the target.

Implement stat by opening the file with as little permissions as possible
and calling GetFileInformationByHandle on it. This way, all link resoltion
is handled by the Windows file system layer.

If symlinks are disabled, use lstat() as before, but fail with ELOOP if a
symlink would have to be resolved.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-11-23 09:59:43 +01:00
Karsten Blees
86ed1cb292 Win32: don't call GetFileAttributes twice in mingw_lstat()
GetFileAttributes cannot handle paths with trailing dir separator. The
current [l]stat implementation calls GetFileAttributes twice if the path
has trailing slashes (first with the original path passed to [l]stat, and
and a second time with a path copy with trailing '/' removed).

With Unicode conversion, we get the length of the path for free and also
have a (wide char) buffer that can be modified.

Remove trailing directory separators before calling the Win32 API.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:43 +01:00
Karsten Blees
cc3671e7ce lockfile.c: use is_dir_sep() instead of hardcoded '/' checks
Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:43 +01:00
Karsten Blees
8df8c36319 strbuf_readlink: support link targets that exceed PATH_MAX
strbuf_readlink() refuses to read link targets that exceed PATH_MAX (even
if a sufficient size was specified by the caller).

As some platforms support longer paths, remove this restriction (similar
to strbuf_getcwd()).

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-11-23 09:59:43 +01:00