Commit Graph

72765 Commits

Author SHA1 Message Date
Johannes Schindelin
b4e95cd3c1 fscache: add a test for the dir-not-found optimization
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:10:02 +02:00
Jeff Hostetler
b2e4b7d267 fscache: remember not-found directories
Teach FSCACHE to remember "not found" directories.

This is a performance optimization.

FSCACHE is a performance optimization available for Windows.  It
intercepts Posix-style lstat() calls into an in-memory directory
using FindFirst/FindNext.  It improves performance on Windows by
catching the first lstat() call in a directory, using FindFirst/
FindNext to read the list of files (and attribute data) for the
entire directory into the cache, and short-cut subsequent lstat()
calls in the same directory.  This gives a major performance
boost on Windows.

However, it does not remember "not found" directories.  When STATUS
runs and there are missing directories, the lstat() interception
fails to find the parent directory and simply return ENOENT for the
file -- it does not remember that the FindFirst on the directory
failed. Thus subsequent lstat() calls in the same directory, each
re-attempt the FindFirst.  This completely defeats any performance
gains.

This can be seen by doing a sparse-checkout on a large repo and
then doing a read-tree to reset the skip-worktree bits and then
running status.

This change reduced status times for my very large repo by 60%.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:10:02 +02:00
Jeff Hostetler
adb535ed54 fscache: add key for GIT_TRACE_FSCACHE
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:10:02 +02:00
Johannes Schindelin
04e94ec52e Merge branch 'long-paths'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:10:00 +02:00
Karsten Blees
5f0a9618c6 Win32: fix 'lstat("dir/")' with long paths
Use a suffciently large buffer to strip the trailing slash.

Signed-off-by: Karsten Blees <blees@dcon.de>
2017-06-05 13:04:05 +02:00
Karsten Blees
df8cd86950 Win32: support long paths
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).

Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.

Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).

Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).

Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).

Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.

Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.

While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.

Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199

Note that the test cannot rely on the presence of short names, as they
are not enabled by default except on the system drive.

[jes: adjusted test number to avoid conflicts, reinstated && chain,
adjusted test to work without short names]

Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:04:05 +02:00
Johannes Schindelin
d5d161aa8a Win32: support long paths
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).

Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.

Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).

Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).

Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).

Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.

Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.

While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.

Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199

[jes: adjusted test number to avoid conflicts]

Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:04:05 +02:00
Doug Kelly
4134adbb3f Add a test demonstrating a problem with long submodule paths
[jes: adusted test number to avoid conflicts, fixed non-portable use of
the 'export' statement]

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:04:05 +02:00
Karsten Blees
443ab9b519 fscache: load directories only once
If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.

On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.

Signed-off-by: Karsten Blees <blees@dcon.de>
2017-06-05 13:04:05 +02:00
Karsten Blees
207808e8a2 Win32: add a cache below mingw's lstat and dirent implementations
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.

Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.

Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.

The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.

Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.

Signed-off-by: Karsten Blees <blees@dcon.de>
2017-06-05 13:04:05 +02:00
Karsten Blees
9cfc6df37a add infrastructure for read-only file system level caches
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.

This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.

Enable read-only sections for 'git status' and preload_index.

Signed-off-by: Karsten Blees <blees@dcon.de>
2017-06-05 13:04:05 +02:00
Karsten Blees
f0e7ac1788 Win32: make the lstat implementation pluggable
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.

Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Signed-off-by: Karsten Blees <blees@dcon.de>
2017-06-05 13:04:05 +02:00
Karsten Blees
5699697f1a Win32: Make the dirent implementation pluggable
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.

Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.

Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).

Signed-off-by: Karsten Blees <blees@dcon.de>
2017-06-05 13:04:04 +02:00
Karsten Blees
d1184ac2c8 Win32: dirent.c: Move opendir down
Move opendir down in preparation for the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
2017-06-05 13:04:04 +02:00
Karsten Blees
9d41ddb4d3 Win32: make FILETIME conversion functions public
Signed-off-by: Karsten Blees <blees@dcon.de>
2017-06-05 13:04:04 +02:00
Johannes Schindelin
00aab60d50 mingw: unset PERL5LIB by default
Git for Windows ships with its own Perl interpreter, and insists on
using it, so it will most likely wreak havoc if PERL5LIB is set before
launching Git.

Let's just unset that environment variables when spawning processes.

To make this feature extensible (and overrideable), there is a new
config setting `core.unsetenvvars` that allows specifying a
comma-separated list of names to unset before spawning processes.

Reported by Gabriel Fuhrmann.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:04:04 +02:00
Johannes Schindelin
6d1df237f1 Move Windows-specific config settings into compat/mingw.c
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:04:04 +02:00
Johannes Schindelin
b7ddd98aa0 Allow for platform-specific core.* config settings
In the Git for Windows project, we have ample precendent for config
settings that apply to Windows, and to Windows only.

Let's formalize this concept by introducing a platform_core_config()
function that can be #define'd in a platform-specific manner.

This will allow us to contain platform-specific code better, as the
corresponding variables no longer need to be exported so that they can
be defined in environment.c and be set in config.c

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-06-05 13:04:04 +02:00
Johannes Schindelin
75b7c19e96 Start the merging-rebase to v2.13.1
This commit starts the rebase of e0df4975e3 to 0c6d50a75c
2017-06-05 13:03:59 +02:00
Junio C Hamano
2c04f63405 Git 2.13.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
v2.13.1
2017-06-05 09:05:38 +09:00
Junio C Hamano
478d3c35b0 Merge branch 'ah/doc-rev-parse-short-default' into maint
Doc update.

* ah/doc-rev-parse-short-default:
  doc: rewrite description for rev-parse --short
2017-06-05 09:03:23 +09:00
Junio C Hamano
f166aab8bf Merge branch 'ah/doc-filter-branch-export-env' into maint
Docfix.

* ah/doc-filter-branch-export-env:
  doc: filter-branch does not require re-export of vars
2017-06-05 09:03:22 +09:00
Junio C Hamano
e06b421cf0 Merge branch 'sd/t3200-typofix' into maint
Test fix.

* sd/t3200-typofix:
  branch test: fix invalid config key access
2017-06-05 09:03:22 +09:00
Junio C Hamano
2ed824bce5 Merge branch 'sb/t5531-update-desc' into maint
The description strings for a few tests have been updated.

* sb/t5531-update-desc:
  t5531: fix test description
2017-06-05 09:03:21 +09:00
Junio C Hamano
bba1c2b722 Merge branch 'ah/doc-pretty-format-fix' into maint
Documentation fix.

* ah/doc-pretty-format-fix:
  Documentation: fix formatting typo in pretty-formats.txt
2017-06-05 09:03:20 +09:00
Junio C Hamano
04dad2493b Merge branch 'ah/doc-interpret-trailers-ifexists' into maint
Documentation fix.

* ah/doc-interpret-trailers-ifexists:
  Documentation: fix reference to ifExists for interpret-trailers
2017-06-05 09:03:19 +09:00
Junio C Hamano
0d86bbdc11 Merge branch 'ab/ref-filter-no-contains' into maint
Doc update to a recent topic.

* ab/ref-filter-no-contains:
  tag: duplicate mention of --contains should mention --no-contains
2017-06-05 09:03:18 +09:00
Junio C Hamano
b8a4652d9b Merge branch 'sg/core-filemode-doc-typofix' into maint
* sg/core-filemode-doc-typofix:
  docs/config.txt: fix indefinite article in core.fileMode description
2017-06-05 09:03:17 +09:00
Junio C Hamano
86799c1c61 Merge branch 'tb/dedup-crlf-tests' into maint
* tb/dedup-crlf-tests:
  t0027: tests are not expensive; remove t0025
2017-06-05 09:03:16 +09:00
Junio C Hamano
9a73c958d8 Merge branch 'jn/credential-doc-on-clear' into maint
Doc update.

* jn/credential-doc-on-clear:
  credential doc: make multiple-helper behavior more prominent
2017-06-05 09:03:16 +09:00
Junio C Hamano
7dab7c5b59 Merge branch 'jk/url-insteadof-config' into maint
The interaction of "url.*.insteadOf" and custom URL scheme's
whitelisting is now documented better.

* jk/url-insteadof-config:
  docs/config: mention protocol implications of url.insteadOf
2017-06-05 09:03:15 +09:00
Junio C Hamano
f72e075e30 Merge branch 'jk/unbreak-am-h' into maint
"git am -h" triggered a BUG().

* jk/unbreak-am-h:
  am: handle "-h" argument earlier
2017-06-05 09:03:15 +09:00
Junio C Hamano
d0506fc419 Merge branch 'ab/sha1dc-maint' into maint
The "collision detecting" SHA-1 implementation shipped with 2.13
was quite broken on some big-endian platforms and/or platforms that
do not like unaligned fetches.  Update to the upstream code which
has already fixed these issues.

* ab/sha1dc-maint:
  sha1dc: update from upstream
2017-06-05 09:03:15 +09:00
Junio C Hamano
e6f80aee29 Merge branch 'js/bs-is-a-dir-sep-on-windows' into maint
"foo\bar\baz" in "git fetch foo\bar\baz", even though there is no
slashes in it, cannot be a nickname for a remote on Windows, as
that is likely to be a pathname on a local filesystem.

* js/bs-is-a-dir-sep-on-windows:
  Windows: do not treat a path with backslashes as a remote's nick name
  mingw.h: permit arguments with side effects for is_dir_sep
2017-06-05 09:03:15 +09:00
Junio C Hamano
a07148db31 Merge branch 'jk/alternate-ref-optim' into maint
A test allowed both "git push" and "git receive-pack" on the other
end write their traces into the same file.  This is OK on platforms
that allows atomically appending to a file opened with O_APPEND,
but on other platforms led to a mangled output, causing
intermittent test failures.  This has been fixed by disabling
traces from "receive-pack" in the test.

* jk/alternate-ref-optim:
  t5400: avoid concurrent writes into a trace file
2017-06-05 09:03:14 +09:00
Junio C Hamano
00c0e40f02 Merge branch 'bm/interpret-trailers-cut-line-is-eom' into maint
"git interpret-trailers", when used as GIT_EDITOR for "git commit
-v", looked for and appended to a trailer block at the very end,
i.e. at the end of the "diff" output.  The command has been
corrected to pay attention to the cut-mark line "commit -v" adds to
the buffer---the real trailer block should appear just before it.

* bm/interpret-trailers-cut-line-is-eom:
  interpret-trailers: honor the cut line
2017-06-05 09:03:13 +09:00
Junio C Hamano
fd8567c990 Merge branch 'kn/ref-filter-branch-list' into maint
"git for-each-ref --format=..." with %(HEAD) in the format used to
resolve the HEAD symref as many times as it had processed refs,
which was wasteful, and "git branch" shared the same problem.

* kn/ref-filter-branch-list:
  ref-filter: resolve HEAD when parsing %(HEAD) atom
2017-06-05 09:03:13 +09:00
Junio C Hamano
a207ad7081 Merge branch 'rs/checkout-am-fix-unborn' into maint
A few codepaths in "checkout" and "am" working on an unborn branch
tried to access an uninitialized piece of memory.

* rs/checkout-am-fix-unborn:
  am: check return value of resolve_refdup before using hash
  checkout: check return value of resolve_refdup before using hash
2017-06-05 09:03:12 +09:00
Junio C Hamano
c8c3321633 Merge branch 'jn/clone-add-empty-config-from-command-line' into maint
"git clone --config var=val" is a way to populate the
per-repository configuration file of the new repository, but it did
not work well when val is an empty string.  This has been fixed.

* jn/clone-add-empty-config-from-command-line:
  clone: handle empty config values in -c
2017-06-05 09:03:11 +09:00
Junio C Hamano
b19174e2f3 Merge branch 'ab/c-translators-comment-style' into maint
Update the C style recommendation for notes for translators, as
recent versions of gettext tools can work with our style of
multi-line comments.

* ab/c-translators-comment-style:
  C style: use standard style for "TRANSLATORS" comments
2017-06-05 09:03:10 +09:00
Junio C Hamano
916a338754 Merge branch 'ls/travis-doc-asciidoctor' into maint
Travis CI gained a task to format the documentation with both
AsciiDoc and AsciiDoctor.

* ls/travis-doc-asciidoctor:
  travis-ci: check AsciiDoc/AsciiDoctor stderr output
  travis-ci: unset compiler for jobs that do not need one
  travis-ci: parallelize documentation build
  travis-ci: build documentation with AsciiDoc and Asciidoctor
2017-06-05 09:03:10 +09:00
Junio C Hamano
e215bd91bb Prepare for 2.13.1; more topics to follow
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-04 10:29:06 +09:00
Junio C Hamano
5ecbaaf101 Merge branch 'tg/stash-push-fixup' into maint
The shell completion script (in contrib/) learned "git stash" has
a new "push" subcommand.

* tg/stash-push-fixup:
  completion: add git stash push
2017-06-04 10:21:08 +09:00
Junio C Hamano
b522c33b45 Merge branch 'km/log-showsignature-doc' into maint
Doc update.

* km/log-showsignature-doc:
  config.txt: add an entry for log.showSignature
2017-06-04 10:21:07 +09:00
Junio C Hamano
e2ae5ec1f3 Merge branch 'jt/use-trailer-api-in-commands' into maint
"git cherry-pick" and other uses of the sequencer machinery
mishandled a trailer block whose last line is an incomplete line.
This has been fixed so that an additional sign-off etc. are added
after completing the existing incomplete line.

* jt/use-trailer-api-in-commands:
  sequencer: add newline before adding footers
2017-06-04 10:21:06 +09:00
Junio C Hamano
058d655f8f Merge branch 'jt/push-options-doc' into maint
The receive-pack program now makes sure that the push certificate
records the same set of push options used for pushing.

* jt/push-options-doc:
  receive-pack: verify push options in cert
  docs: correct receive.advertisePushOptions default
2017-06-04 10:21:05 +09:00
Junio C Hamano
34bbe2edd4 Merge branch 'js/plug-leaks' into maint
Fix memory leaks pointed out by Coverity (and people).

* js/plug-leaks: (26 commits)
  checkout: fix memory leak
  submodule_uses_worktrees(): plug memory leak
  show_worktree(): plug memory leak
  name-rev: avoid leaking memory in the `deref` case
  remote: plug memory leak in match_explicit()
  add_reflog_for_walk: avoid memory leak
  shallow: avoid memory leak
  line-log: avoid memory leak
  receive-pack: plug memory leak in update()
  fast-export: avoid leaking memory in handle_tag()
  mktree: plug memory leaks reported by Coverity
  pack-redundant: plug memory leak
  setup_discovered_git_dir(): plug memory leak
  setup_bare_git_dir(): help static analysis
  split_commit_in_progress(): simplify & fix memory leak
  checkout: fix memory leak
  cat-file: fix memory leak
  mailinfo & mailsplit: check for EOF while parsing
  status: close file descriptor after reading git-rebase-todo
  difftool: address a couple of resource/memory leaks
  ...
2017-06-04 10:21:04 +09:00
Junio C Hamano
7ba4fa5c08 Merge branch 'js/eol-on-ourselves' into maint
Make sure our tests would pass when the sources are checked out
with "platform native" line ending convention by default on
Windows.  Some "text" files out tests use and the test scripts
themselves that are meant to be run with /bin/sh, ought to be
checked out with eol=LF even on Windows.

* js/eol-on-ourselves:
  t4051: mark supporting files as requiring LF-only line endings
  Fix the remaining tests that failed with core.autocrlf=true
  t3901: move supporting files into t/t3901/
  completion: mark bash script as LF-only
  git-new-workdir: mark script as LF-only
  Fix build with core.autocrlf=true
2017-06-04 10:21:04 +09:00
Junio C Hamano
970fb22dd6 Merge branch 'jk/update-links-in-docs' into maint
A few http:// links that are redirected to https:// in the
documentation have been updated to https:// links.

* jk/update-links-in-docs:
  doc: use https links to Wikipedia to avoid http redirects
2017-06-04 10:21:04 +09:00
Junio C Hamano
8d958b97c6 Merge branch 'jk/ignore-broken-tags-when-ignoring-missing-links' into maint
Tag objects, which are not reachable from any ref, that point at
missing objects were mishandled by "git gc" and friends (they
should silently be ignored instead)

* jk/ignore-broken-tags-when-ignoring-missing-links:
  revision.c: ignore broken tags with ignore_missing_links
2017-06-04 10:21:03 +09:00