Commit Graph

93622 Commits

Author SHA1 Message Date
Johannes Schindelin
a72b117355 Merge branch 'fsync-object-files-always'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:11:17 +01:00
Johannes Schindelin
f371bb20f1 Merge pull request #1354 from dscho/phase-out-show-ignored-directory-gracefully
Phase out `--show-ignored-directory` gracefully
2018-12-09 11:11:15 +01:00
Johannes Schindelin
9bbf3ff904 Merge branch 'status-no-lock-index'
This branch allows third-party tools to call `git status
--no-lock-index` to avoid lock contention with the interactive Git usage
of the actual human user.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:11:13 +01:00
Johannes Schindelin
6b692508c0 Merge 'docker-volumes-are-no-symlinks'
This was pull request #1645 from ZCube/master

Support windows container.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:11:11 +01:00
Johannes Schindelin
4a5f17f1d6 Merge pull request #1827 from benpeart/fscache_refresh_index
Enable the filesystem cache (fscache) in refresh_index().
2018-12-09 11:11:09 +01:00
Johannes Schindelin
9926f51bf9 Merge pull request #1468 from atetubou/fscache_checkout_flush
checkout.c: enable fscache for checkout again

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:11:08 +01:00
Johannes Schindelin
ba3dd8cd2d Merge pull request #1426 from atetubou/fetch_pack
fetch-pack.c: enable fscache for stats under .git/objects
2018-12-09 11:11:06 +01:00
Johannes Schindelin
cd673f0a1e Merge pull request #1344 from jeffhostetler/perf_add_excludes_with_fscache
dir.c: make add_excludes aware of fscache during status
2018-12-09 11:11:05 +01:00
Johannes Schindelin
968bdfdd91 Merge pull request #971 from jeffhostetler/jeffhostetler/add_preload_fscache
add: use preload-index and fscache for performance
2018-12-09 11:11:04 +01:00
Johannes Schindelin
4b53397879 Merge branch 'core-longpaths-everywhere'
Git for Windows supports the core.longPaths config setting to allow
writing/reading long paths via the \\?\ trick for a long time now.

However, for that support to work, it is absolutely necessary that
git_default_config() is given a chance to parse the config. Otherwise
Git will be non the wiser.

So let's make sure that as many commands that previously failed to
parse the core.* settings now do that, implicitly enabling long path
support in a lot more places.

Note: this is not a perfect solution, and it cannot be, as there is
a chicken-and-egg problem in reading the config itself...

This fixes https://github.com/git-for-windows/git/issues/1218

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:11:02 +01:00
Johannes Schindelin
4e5cb14ce8 Merge pull request #994 from jeffhostetler/jeffhostetler/fscache_nfd
fscache: add not-found directory cache to fscache
2018-12-09 11:10:58 +01:00
Johannes Schindelin
31d8440c1b Merge branch 'spawn-with-spaces'
This change lets us spawn .bat scripts whose paths contain spaces.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:10:57 +01:00
Johannes Schindelin
a2975afd23 Merge branch 'clean-long-paths'
This addresses https://github.com/git-for-windows/git/issues/521

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:10:55 +01:00
Johannes Schindelin
06929909cf Merge pull request #305 from dscho/msysgit_issues_182
Allow `add -p` and `add -i` with a large number of files
2018-12-09 11:10:54 +01:00
Johannes Schindelin
592b4a9cd4 Merge branch 'program-data-config'
This branch introduces support for reading the "Windows-wide" Git
configuration from `%PROGRAMDATA%\Git\config`. As these settings are
intended to be shared between *all* Git-related software, that config
file takes an even lower precedence than `$(prefix)/etc/gitconfig`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:10:52 +01:00
Johannes Schindelin
36adb85bf3 Merge pull request #1897 from piscisaureus/symlink-attr
Specify symlink type in .gitattributes
2018-12-09 11:10:51 +01:00
Johannes Schindelin
405b6a8b3e Merge branch 'kblees/kb/symlinks' 2018-12-09 11:10:49 +01:00
Johannes Schindelin
8102053c3d Merge branch 'msys2' 2018-12-09 11:10:47 +01:00
Johannes Schindelin
dc2d604bac Merge branch 'long-paths' 2018-12-09 11:10:45 +01:00
Johannes Schindelin
a4acd7cf39 Merge branch 'fscache' 2018-12-09 11:10:43 +01:00
Jeff Hostetler
f0a14e71f1 Merge branch 'visual-studio'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:10:41 +01:00
Jeff Hostetler
1085e81e36 Merge branch 'msvc'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:10:39 +01:00
Johannes Schindelin
e9113df7cd Merge branch 'gitk-and-git-gui-patches'
These are Git for Windows' Git GUI and gitk patches. We will have to
decide at some point what to do about them, but that's a little lower
priority (as Git GUI seems to be unmaintained for the time being, and
the gitk maintainer keeps a very low profile on the Git mailing list,
too).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:10:36 +01:00
Johannes Schindelin
6db792c780 Merge branch 'ready-for-upstream'
This is the branch thicket of patches in Git for Windows that are
considered ready for upstream. To keep them in a ready-to-submit shape,
they are kept as close to the beginning of the branch thicket as
possible.
2018-12-09 11:10:34 +01:00
Johannes Schindelin
fcb2f9d6c0 mingw: change core.fsyncObjectFiles = 1 by default
From the documentation of said setting:

	This boolean will enable fsync() when writing object files.

	This is a total waste of time and effort on a filesystem that
	orders data writes properly, but can be useful for filesystems
	that do not use journalling (traditional UNIX filesystems) or
	that only journal metadata and not file contents (OS X’s HFS+,
	or Linux ext3 with "data=writeback").

The most common file system on Windows (NTFS) does not guarantee that
order, therefore a sudden loss of power (or any other event causing an
unclean shutdown) would cause corrupt files (i.e. files filled with
NULs). Therefore we need to change the default.

Note that the documentation makes it sound as if this causes really bad
performance. In reality, writing loose objects is something that is done
only rarely, and only a handful of files at a time.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:09:12 +01:00
Johannes Schindelin
cf9bef92de status: verify that --show-ignored-directory prints a warning
The option is deprecated now, and we better make sure that keeps saying
so until we finally remove it.

Suggested by Kevin Willford.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:09:09 +01:00
Johannes Schindelin
66053062fe status: reinstate --show-ignored-directory as a deprecated option
It was a bad idea to just remove that option from Git for Windows
v2.15.0, as early users of that (still experimental) option would have
been puzzled what they are supposed to do now.

So let's reintroduce the flag, but make sure to show the user good
advice how to fix this going forward.

We'll remove this option in a more orderly fashion either in v2.16.0 or
in v2.17.0.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:09:08 +01:00
Johannes Schindelin
9cf38f3788 status: carry the --no-lock-index option for backwards-compatibility
When a third-party tool periodically runs `git status` in order to keep
track of the state of the working tree, it is a bad idea to lock the
index: it might interfere with interactive commands executed by the
user, e.g. when the user wants to commit files.

Git for Windows introduced the `--no-lock-index` option a long time ago
to fix that (it made it into Git for Windows v2.9.2(3)) by simply
avoiding to write that file.

The downside is that the periodic `git status` calls will be a little
bit more wasteful because they may have to refresh the index repeatedly,
only to throw away the updates when it exits. This cannot really be
helped, though, as tools wanting to get a periodic update of the status
have no way to predict when the user may want to lock the index herself.

Sadly, a competing approach was submitted (by somebody who apparently
has less work on their plate than this maintainer) that made it into
v2.15.0 but is *different*: instead of a `git status`-only option, it is
an option that comes *before* the Git command and is called differently,
too.

Let's give previous users a chance to upgrade to newer Git for Windows
versions by handling the `--no-lock-index` option, still, though with a
big fat warning.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:09:07 +01:00
Johannes Schindelin
4cde96bd64 mingw: Windows Docker volumes are *not* symbolic links
... even if they may look like them.

As looking up the target of the "symbolic link" (just to see whether it
starts with `/ContainerMappedDirectories/`) is pretty expensive, we
do it when we can be *really* sure that there is a possibility that this
might be the case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: JiSeop Moon <zcube@zcube.kr>
2018-12-09 11:09:03 +01:00
JiSeop Moon
7a473049f1 mingw: move the file_attr_to_st_mode() function definition
In preparation for making this function a bit more complicated (to allow
for special-casing the `ContainerMappedDirectories` in Windows
containers, which look like a symbolic link, but are not), let's move it
out of the header.

Signed-off-by: JiSeop Moon <zcube@zcube.kr>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:09:03 +01:00
JiSeop Moon
c526207b40 mingw: when running in a Windows container, try to rename() harder
It is a known issue that a rename() can fail with an "Access denied"
error at times, when copying followed by deleting the original file
works. Let's just fall back to that behavior.

Signed-off-by: JiSeop Moon <zcube@zcube.kr>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:09:02 +01:00
JiSeop Moon
e811cb52bf mingw: introduce code to detect whether we're inside a Windows container
This will come in handy in the next commit.

Signed-off-by: JiSeop Moon <zcube@zcube.kr>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:09:01 +01:00
Johannes Schindelin
b944e770e4 Merge branch 'program-data-config'
This branch introduces support for reading the "Windows-wide" Git
configuration from `%PROGRAMDATA%\Git\config`. As these settings are
intended to be shared between *all* Git-related software, that config
file takes an even lower precedence than `$(prefix)/etc/gitconfig`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:09:01 +01:00
Ben Peart
a7d5d11ce0 Enable the filesystem cache (fscache) in refresh_index().
On file systems that support it, this can dramatically speed up operations
like add, commit, describe, rebase, reset, rm that would otherwise have to
lstat() every file to "re-match" the stat information in the index to that
of the file system.

On a synthetic repo with 1M files, "git reset" dropped from 52.02 seconds to
14.42 seconds for a savings of 72%.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
2018-12-09 11:08:58 +01:00
Takuto Ikuta
b287ad8a90 checkout.c: enable fscache for checkout again
This is retry of #1419.

I added flush_fscache macro to flush cached stats after disk writing
with tests for regression reported in #1438 and #1442.

git checkout checks each file path in sorted order, so cache flushing does not
make performance worse unless we have large number of modified files in
a directory containing many files.

Using chromium repository, I tested `git checkout .` performance when I
delete 10 files in different directories.
With this patch:
TotalSeconds: 4.307272
TotalSeconds: 4.4863595
TotalSeconds: 4.2975562
Avg: 4.36372923333333

Without this patch:
TotalSeconds: 20.9705431
TotalSeconds: 22.4867685
TotalSeconds: 18.8968292
Avg: 20.7847136

I confirmed this patch passed all tests in t/ with core_fscache=1.

Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
2018-12-09 11:08:57 +01:00
Takuto Ikuta
f89b6201af fetch-pack.c: enable fscache for stats under .git/objects
When I do git fetch, git call file stats under .git/objects for each
refs. This takes time when there are many refs.

By enabling fscache, git takes file stats by directory traversing and that
improved the speed of fetch-pack for repository having large number of
refs.

In my windows workstation, this improves the time of `git fetch` for
chromium repository like below. I took stats 3 times.

* With this patch
TotalSeconds: 9.9825165
TotalSeconds: 9.1862075
TotalSeconds: 10.1956256
Avg: 9.78811653333333

* Without this patch
TotalSeconds: 15.8406702
TotalSeconds: 15.6248053
TotalSeconds: 15.2085938
Avg: 15.5580231

Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
2018-12-09 11:08:56 +01:00
Jeff Hostetler
cedc1e0d13 dir.c: regression fix for add_excludes with fscache
Fix regression described in:
https://github.com/git-for-windows/git/issues/1392

which was introduced in:
b2353379bb

Problem Symptoms
================
When the user has a .gitignore file that is a symlink, the fscache
optimization introduced above caused the stat-data from the symlink,
rather that of the target file, to be returned.  Later when the ignore
file was read, the buffer length did not match the stat.st_size field
and we called die("cannot use <path> as an exclude file")

Optimization Rationale
======================
The above optimization calls lstat() before open() primarily to ask
fscache if the file exists.  It gets the current stat-data as a side
effect essentially for free (since we already have it in memory).
If the file does not exist, it does not need to call open().  And
since very few directories have .gitignore files, we can greatly
reduce time spent in the filesystem.

Discussion of Fix
=================
The above optimization calls lstat() rather than stat() because the
fscache only intercepts lstat() calls.  Calls to stat() stay directed
to the mingw_stat() completly bypassing fscache.  Furthermore, calls
to mingw_stat() always call {open, fstat, close} so that symlinks are
properly dereferenced, which adds *additional* open/close calls on top
of what the original code in dir.c is doing.

Since the problem only manifests for symlinks, we add code to overwrite
the stat-data when the path is a symlink.  This preserves the effect of
the performance gains provided by the fscache in the normal case.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2018-12-09 11:08:55 +01:00
Jeff Hostetler
c12ef2fafb fscache: make fscache_enabled() public
Make fscache_enabled() function public rather than static.
Remove unneeded fscache_is_enabled() function.
Change is_fscache_enabled() macro to call fscache_enabled().

is_fscache_enabled() now takes a pathname so that the answer
is more precise and mean "is fscache enabled for this pathname",
since fscache only stores repo-relative paths and not absolute
paths, we can avoid attempting lookups for absolute paths.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2018-12-09 11:08:54 +01:00
Jeff Hostetler
eaf395627c add: use preload-index and fscache for performance
Teach "add" to use preload-index and fscache features
to improve performance on very large repositories.

During an "add", a call is made to run_diff_files()
which calls check_remove() for each index-entry.  This
calls lstat().  On Windows, the fscache code intercepts
the lstat() calls and builds a private cache using the
FindFirst/FindNext routines, which are much faster.

Somewhat independent of this, is the preload-index code
which distributes some of the start-up costs across
multiple threads.

We need to keep the call to read_cache() before parsing the
pathspecs (and hence cannot use the pathspecs to limit any preload)
because parse_pathspec() is using the index to determine whether a
pathspec is, in fact, in a submodule. If we would not read the index
first, parse_pathspec() would not error out on a path that is inside
a submodule, and t7400-submodule-basic.sh would fail with

	not ok 47 - do not add files from a submodule

We still want the nice preload performance boost, though, so we simply
call read_cache_preload(&pathspecs) after parsing the pathspecs.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:53 +01:00
Jeff Hostetler
d14e5aa7dc dir.c: make add_excludes aware of fscache during status
Teach read_directory_recursive() and add_excludes() to
be aware of optional fscache and avoid trying to open()
and fstat() non-existant ".gitignore" files in every
directory in the worktree.

The current code in add_excludes() calls open() and then
fstat() for a ".gitignore" file in each directory present
in the worktree.  Change that when fscache is enabled to
call lstat() first and if present, call open().

This seems backwards because both lstat needs to do more
work than fstat.  But when fscache is enabled, fscache will
already know if the .gitignore file exists and can completely
avoid the IO calls.  This works because of the lstat diversion
to mingw_lstat when fscache is enabled.

This reduced status times on a 350K file enlistment of the
Windows repo on a NVMe SSD by 0.25 seconds.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2018-12-09 11:08:53 +01:00
Johannes Schindelin
9c5559bf65 mingw: ensure that core.longPaths is handled *always*
A ton of Git commands simply do not read (or at least parse) the core.*
settings. This is not good, as Git for Windows relies on the
core.longPaths setting to be read quite early on.

So let's just make sure that all commands read the config and give
platform_core_config() a chance.

This patch teaches tons of Git commands to respect the config setting
`core.longPaths = true`, including `pack-refs`, thereby fixing
https://github.com/git-for-windows/git/issues/1218

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:50 +01:00
Johannes Schindelin
8190caabcb fscache: add a test for the dir-not-found optimization
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:49 +01:00
Jeff Hostetler
530b8ebf0a fscache: remember not-found directories
Teach FSCACHE to remember "not found" directories.

This is a performance optimization.

FSCACHE is a performance optimization available for Windows.  It
intercepts Posix-style lstat() calls into an in-memory directory
using FindFirst/FindNext.  It improves performance on Windows by
catching the first lstat() call in a directory, using FindFirst/
FindNext to read the list of files (and attribute data) for the
entire directory into the cache, and short-cut subsequent lstat()
calls in the same directory.  This gives a major performance
boost on Windows.

However, it does not remember "not found" directories.  When STATUS
runs and there are missing directories, the lstat() interception
fails to find the parent directory and simply return ENOENT for the
file -- it does not remember that the FindFirst on the directory
failed. Thus subsequent lstat() calls in the same directory, each
re-attempt the FindFirst.  This completely defeats any performance
gains.

This can be seen by doing a sparse-checkout on a large repo and
then doing a read-tree to reset the skip-worktree bits and then
running status.

This change reduced status times for my very large repo by 60%.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:48 +01:00
Jeff Hostetler
ec052601fc fscache: add key for GIT_TRACE_FSCACHE
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:48 +01:00
Johannes Schindelin
62915ce200 mingw: support spawning programs containing spaces in their names
The CreateProcessW() function does not really support spaces in its
first argument, lpApplicationName. But it supports passing NULL as
lpApplicationName, which makes it figure out the application from the
(possibly quoted) first argument of lpCommandLine.

Let's use that trick (if we are certain that the first argument matches
the executable's path) to support launching programs whose path contains
spaces.

This fixes https://github.com/git-for-windows/git/issue/692

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:47 +01:00
Johannes Schindelin
9a7ddc0ee4 t7300: git clean -dfx must show an error with long paths
In particular on Windows, where the default maximum path length is quite
small, but there are ways to circumvent that limit in many cases, it is
very important that users be given an indication why their command
failed because of too long paths when it did.

This test case makes sure that a warning is issued that would have
helped the user who reported Git for Windows' issue 521:

	https://github.com/git-for-windows/git/issues/521

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:45 +01:00
Johannes Schindelin
5560165183 t3701: verify that we can add *lots* of files interactively
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:44 +01:00
Johannes Schindelin
f8175d2fdf remove_dirs: do not swallow error when stat() failed
Without an error message when stat() failed, e.g. `git clean` would
abort without an error message, leaving the user quite puzzled.

This fixes https://github.com/git-for-windows/git/issues/521

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:44 +01:00
Kelly Heller
9108cef54f Allow add -p and add -i with a large number of files
This fixes https://github.com/msysgit/git/issues/182.

Inspired by Pull Request 218 using code from @PhilipDavis.

[jes: simplified code quite a bit]

Signed-off-by: Kelly Heller <kkheller@cedrus.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:43 +01:00
Johannes Schindelin
0e3d00de20 Windows: add support for a Windows-wide configuration
Between the libgit2 and the Git for Windows project, there has been a
discussion how we could share Git configuration to avoid duplication (or
worse: skew).

Earlier, libgit2 was nice enough to just re-use Git for Windows'

	C:\Program Files (x86)\Git\etc\gitconfig

but with the upcoming Git for Windows 2.x, there would be more paths to
search, as we will have 64-bit and 32-bit versions, and the
corresponding config files will be in %PROGRAMFILES%\Git\mingw64\etc and
...\mingw32\etc, respectively.

Worse: there are portable Git for Windows versions out there which live
in totally unrelated directories, still.

Therefore we came to a consensus to use `%PROGRAMDATA%\Git\config` as the
location for shared Git settings that are of wider interest than just Git
for Windows.

Of course, the configuration in `%PROGRAMDATA%\Git\config` has the
widest reach, therefore it must take the lowest precedence, i.e. Git for
Windows can still override settings in its `etc/gitconfig` file.

Helped-by: Andreas Heiduk <asheiduk@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-12-09 11:08:42 +01:00