Commit Graph

44813 Commits

Author SHA1 Message Date
Karsten Blees
fce8d2cb21 Win32: support long paths
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).

Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.

Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).

Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).

Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).

Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.

Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.

While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.

Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
2014-04-10 13:53:47 -05:00
Doug Kelly
44585c0e74 Add a test demonstrating a problem with long submodule paths
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-04-10 13:53:47 -05:00
Johannes Schindelin
08835fecef Merge remote-tracking branch 'kblees/kb/fscache-v4-tentative-1.8.5' into thicket-1.8.5.2 2014-04-10 13:53:45 -05:00
Johannes Schindelin
3ded66a9bf Merge 'poll-busy-wait' into HEAD 2014-04-10 13:53:45 -05:00
Johannes Schindelin
677413397c Merge 'normalize-win-paths' into HEAD 2014-04-10 13:53:45 -05:00
Johannes Schindelin
7ee44c5dfd Merge 'msvc-link-crt' into HEAD 2014-04-10 13:53:44 -05:00
Johannes Schindelin
5f6e8a9316 Merge 'install-wincred' into HEAD 2014-04-10 13:53:44 -05:00
Johannes Schindelin
5544bd1339 Merge 'fix-is-exe' into HEAD 2014-04-10 13:53:44 -05:00
Johannes Schindelin
108c844f57 Merge 'fix-externals' into HEAD 2014-04-10 13:53:44 -05:00
Johannes Schindelin
4034092fde Merge 'stash-reflog' into HEAD 2014-04-10 13:53:44 -05:00
Johannes Schindelin
9c90f99395 Merge 'http-msys-paths' into HEAD 2014-04-10 13:53:44 -05:00
Johannes Schindelin
746f6f41b7 Merge 'remote-hg-prerequisites' into HEAD
These fixes were necessary for Sverre Rabbelier's remote-hg to work,
but for some magic reason they are not necessary for the current
remote-hg. Makes you wonder how that one gets away with it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-04-10 13:53:43 -05:00
Johannes Schindelin
616c7ba9be Merge 'win-tests-fixes' into HEAD 2014-04-10 13:53:43 -05:00
Johannes Schindelin
85efa6f012 Merge 'grep-fixes' into HEAD 2014-04-10 13:53:43 -05:00
Johannes Schindelin
98cc090b7d Merge 'pull-rebase-interactive' into HEAD 2014-04-10 13:53:43 -05:00
Johannes Schindelin
94899e014a Merge 'send-email' into HEAD 2014-04-10 13:53:43 -05:00
Johannes Schindelin
7051c657c4 Merge 'jberezanski/wincred-sso-r2' into HEAD 2014-04-10 13:53:42 -05:00
Johannes Schindelin
d5fe20671a Merge 'gitweb-syntax' into HEAD 2014-04-10 13:53:42 -05:00
Johannes Schindelin
7e977b24e8 Merge 'gitk' into HEAD 2014-04-10 13:53:42 -05:00
Johannes Schindelin
2581b787a1 Merge 'git-gui' into HEAD 2014-04-10 13:53:42 -05:00
Johannes Schindelin
d10e8eccd6 Merge 'deny-current-branch' into HEAD 2014-04-10 13:53:42 -05:00
Johannes Schindelin
c808197015 Merge 'criss-cross-merge' into HEAD 2014-04-10 13:53:41 -05:00
Johannes Schindelin
dbda31249f Merge 'am-submodules' into HEAD 2014-04-10 13:53:41 -05:00
Johannes Schindelin
063e2dbf79 Merge 'unc' into HEAD 2014-04-10 13:53:41 -05:00
Johannes Schindelin
f4a12d6d73 Merge 'home' into HEAD 2014-04-10 13:53:41 -05:00
Johannes Schindelin
37f6f9eafc Merge 'hide-dotgit' into HEAD 2014-04-10 13:53:41 -05:00
Johannes Schindelin
58846d4909 Merge 'unicode' into HEAD 2014-04-10 13:53:41 -05:00
Karsten Blees
0460992244 Win32: add a cache below mingw's lstat and dirent implementations
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.

Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.

Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.

The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.

Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-04-10 13:53:40 -05:00
Karsten Blees
aab5bc18ec add infrastructure for read-only file system level caches
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.

This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.

Enable read-only sections for 'git status' and preload_index.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-04-10 13:53:40 -05:00
Johannes Schindelin
bc15c2c58e Merge 'unc' into HEAD 2014-04-10 13:53:40 -05:00
Karsten Blees
c8cd1c267c Win32: make the lstat implementation pluggable
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.

Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-04-10 13:53:40 -05:00
Karsten Blees
b4dd47ac1b Win32: Make the dirent implementation pluggable
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.

Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.

Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-04-10 13:53:39 -05:00
Karsten Blees
47820e8158 Win32: dirent.c: Move opendir down
Move opendir down in preparation for the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-04-10 13:53:39 -05:00
Karsten Blees
231e00f8d1 Win32: make FILETIME conversion functions public
Signed-off-by: Karsten Blees <blees@dcon.de>
2014-04-10 13:53:39 -05:00
Johannes Schindelin
5959ee1e04 Merge branch 'kb/hashmap-v5-minimal' into kb/fscache-v4-t1.8.5 2014-04-10 13:53:39 -05:00
theoleblond
94ac5c36e9 Sleep 1 millisecond in poll() to avoid busy wait
I played around with this quite a bit. After trying some more complex
schemes, I found that what worked best is to just sleep 1 millisecond
between iterations. Though it's a very short time, it still completely
eliminates the busy wait condition, without hurting perf.

There code uses SleepEx(1, TRUE) to sleep. See this page for a good
discussion of why that is better than calling SwitchToThread, which
is what was used previously:
http://stackoverflow.com/questions/1383943/switchtothread-vs-sleep1

Note that calling SleepEx(0, TRUE) does *not* solve the busy wait.

The most striking case was when testing on a UNC share with a large repo,
on a single CPU machine. Without the fix, it took 4 minutes 15 seconds,
and with the fix it took just 1:08! I think it's because git-upload-pack's
busy wait was eating the CPU away from the git process that's doing the
real work. With multi-proc, the timing is not much different, but tons of
CPU time is still wasted, which can be a killer on a server that needs to
do bunch of other things.

I also tested the very fast local case, and didn't see any measurable
difference. On a big repo with 4500 files, the upload-pack took about 2
seconds with and without the fix.
2014-04-10 13:53:38 -05:00
Karsten Blees
588b6503e6 add a hashtable implementation that supports O(1) removal
The existing hashtable implementation (in hash.[ch]) uses open addressing
(i.e. resolve hash collisions by distributing entries across the table).
Thus, removal is difficult to implement with less than O(n) complexity.
Resolving collisions of entries with identical hashes (e.g. via chaining)
is left to the client code.

Add a hashtable implementation that supports O(1) removal and is slightly
easier to use due to builtin entry chaining.

Supports all basic operations init, free, get, add, remove and iteration.

Also includes ready-to-use hash functions based on the public domain FNV-1
algorithm (http://www.isthe.com/chongo/tech/comp/fnv).

The per-entry data structure (hashmap_entry) is piggybacked in front of
the client's data structure to save memory. See test-hashmap.c for usage
examples.

The hashtable is resized by a factor of four when 80% full. With these
settings, average memory consumption is about 2/3 of hash.[ch], and
insertion is about twice as fast due to less frequent resizing.

Lookups are also slightly faster, because entries are strictly confined to
their bucket (i.e. no data of other buckets needs to be traversed).

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-10 13:53:38 -05:00
Jens Lehmann
056439b7cd submodule: don't access the .gitmodules cache entry after removing it
Commit 5fee995244 introduced the stage_updated_gitmodules() function to
add submodule configuration updates to the index. It assumed that even
after calling remove_cache_entry_at() the same cache entry would still be
valid. This was true in the old days, as cache entries could never be
freed, but that is not so sure in the present as there is ongoing work to
free removed cache entries, which makes this code segfault.

Fix that by calling add_file_to_cache() instead of open coding it. Also
remove the "could not find .gitmodules in index" warning, as that won't
happen in regular use cases (and by then just silently adding it to the
index we do the right thing).

Thanks-to: Karsten Blees <karsten.blees@gmail.com>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-10 13:53:38 -05:00
Heiko Voigt
f79024590a Windows: Always normalize paths to Windows-style
It appears that `pwd` returns the POSIX-style or the DOS-style path
depending which style the previous `cd` used. To normalize, enforce `pwd
-W` in scripts.

From the original e-mail exchange:

On Thu, Mar 22, 2012 at 11:13:37AM +0100, Sebastian Schuberth wrote:
> On Wed, Mar 21, 2012 at 22:21, Johannes Sixt <j6t@kdbg.org> wrote:
>
> > I build git and run its tests outside the msysgit environment. Does that
> > explain the difference? (And I use CMD.)
>
> It does not make a difference for me. I started cmd.exe at
> c:\msysgit\git\t, added c:\msysgit\bin temporarily to PATH, and ran
> "sh t5526-fetch-submodules.sh -i -v", and the test still fails.

Yes it probably does. Johannes said that he runs the tests outside of
the msysgit folder. That way there is only one path the submodule script
gets reported and not two like '/c/msysgit/git' and '/git'.

That would explain to me why it is passing.

I am afraid that the only solution is to patch msys itself to report the
long absolute path when passing window style paths to cd. Currently when
I do

	cd c:/msysgit/git

I will end up in '/git' instead of the long path.

I found that there is a -W option to pwd in msys bash which makes it
always return the real windows path. A normalization in that direction
is unique and thus might be more robust. Have a look at the attached
patch. With this at least t5526 passes. I was not able to run the whole
testsuite properly at the moment. I can have a look at that tomorrow.

What do you think?

Cheers Heiko

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-04-10 13:53:37 -05:00
Karsten Blees
0e54eea33e MSVC: link dynamically to the CRT
Dynamic linking is generally preferred over static linking, and MSVCRT.dll
has been integral part of Windows for a long time.

This also fixes linker warnings for _malloc and _free in zlib.lib, which
seems to be compiled for MSVCRT.dll already.

The DLL version also exports some of the CRT initialization functions,
which are hidden in the static libcmt.lib (e.g. __wgetmainargs, required by
subsequent Unicode patches).

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-04-10 13:53:36 -05:00
Pat Thoyts
ae8281d888 wincred: add install target and avoid overwriting configured variables.
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
2014-04-10 13:53:36 -05:00
Heiko Voigt
f3f9ee9ea0 help: correct behavior for is_executable on Windows
The previous implementation said that the filesystem information on
Windows is not reliable to determine whether a file is executable.
To find gather this information it was peeking into the first two bytes
of a file to see whether it looks executable.
Apart from the fact that on Windows executables are usually defined as
such by their extension it lead to slow opening of help file in some
situations.

When you have virus scanner running calling open on an executable file
is a potentially expensive operation. See the following measurements (in
seconds) for example.

With virus scanner running (coldcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.412873
before open (git-annotate.exe): 0.000175
after open (git-annotate.exe): 0.397925
before open (git-apply.exe): 0.000243
after open (git-apply.exe): 0.399996
before open (git-archive.exe): 0.000147
after open (git-archive.exe): 0.397783
before open (git-bisect--helper.exe): 0.000160
after open (git-bisect--helper.exe): 0.397700
before open (git-blame.exe): 0.000160
after open (git-blame.exe): 0.399136
...

With virus scanner running (hotcache):

$ ./a.exe /libexec/git-core/
before open (git-add.exe): 0.000000
after open (git-add.exe): 0.000325
before open (git-annotate.exe): 0.000229
after open (git-annotate.exe): 0.000177
before open (git-apply.exe): 0.000167
after open (git-apply.exe): 0.000150
before open (git-archive.exe): 0.000154
after open (git-archive.exe): 0.000156
before open (git-bisect--helper.exe): 0.000132
after open (git-bisect--helper.exe): 0.000180
before open (git-blame.exe): 0.000718
after open (git-blame.exe): 0.000724
...

This test did just list the given directory and open() each file in it.

With this patch I get:

$ time git help git
Launching default browser to display HTML ...

real    0m8.723s
user    0m0.000s
sys     0m0.000s

and without

$ time git help git
Launching default browser to display HTML ...

real    1m37.734s
user    0m0.000s
sys     0m0.031s

both tests with cold cache and giving the machine some time to settle
down after restart.

Signed-off-by: Heiko Voigt <heiko.voigt@mahr.de>
2014-04-10 13:53:35 -05:00
Adam Roben
75036e7e9c Make non-.exe externals work again
7ebac8cb94 made launching of .exe
externals work when installed in Unicode paths. But it broke launching
of non-.exe externals, no matter where they were installed. We now
correctly maintain the UTF-8 and UTF-16 paths in tandem in lookup_prog.

This fixes t5526, among others.

Signed-off-by: Adam Roben <adam@roben.org>
2014-04-10 13:53:35 -05:00
Adam Roben
216ebe76f2 Fix launching of externals from Unicode paths
If Git were installed in a path containing non-ASCII characters,
commands such as git-am and git-submodule, which are implemented as
externals, would fail to launch with the following error:

> fatal: 'am' appears to be a git command, but we were not
> able to execute it. Maybe git-am is broken?

This was due to lookup_prog not being Unicode-aware. It was somehow
missed in 2ee5a1a14a.

Note that the only problem in this function was calling
GetFileAttributes instead of GetFileAttributesW. The calls to access()
were fine because access() is a macro which resolves to mingw_access,
which already handles Unicode correctly. But I changed lookup_prog to
use _waccess directly so that we only convert the path to UTF-16 once.

Signed-off-by: Adam Roben <adam@roben.org>
2014-04-10 13:53:35 -05:00
Evgeny Pashkin
37a22b4c02 Fixed wrong path delimiter in exe finding
On Windows XP3 in git bash
git clone git@github.com:octocat/Spoon-Knife.git
cd Spoon-Knife
git gui
menu Remote\Fetch from\origin
error: cannot spawn git: No such file or directory
error: could not run rev-list

if u run
git fetch --all
it worked normal in git bash or gitgui tools

In second version CreateProcess get 'C:\Git\libexec\git-core/git.exe' in
first version - C:/Git/libexec/git-core/git.exe and not executes (unix
slashes)

after fixing C:\Git\libexec\git-core\git.exe or
C:/Git/libexec/git-core\git.exe it works normal

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-04-10 13:53:35 -05:00
Johannes Schindelin
688cf6e0aa git stash: make sure that .git/logs/refs/ exists
If the user has not activated reflogs, or if nothing has been recorded
yet (as is the case directly after cloning), said directory may not
exist yet.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-04-10 13:53:34 -05:00
Johannes Schindelin
c64a221e53 Handle http.* config variables pointing to files gracefully on Windows
On Windows, we would like to be able to have a default http.sslCAinfo
that points to an MSys path (i.e. relative to the installation root of
Git).  As Git is a MinGW program, it has to handle the conversion
of the MSys path into a MinGW32 path itself.

Since system_path() considers paths starting with '/' as absolute, we
have to convince it to make a Windows path by stripping the leading
slash.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-04-10 13:53:34 -05:00
Johannes Schindelin
a77f04ddf6 Always auto-gc after calling a fast-import transport
After importing anything with fast-import, we should always let the
garbage collector do its job, since the objects are written to disk
inefficiently.

This brings down an initial import of http://selenic.com/hg from about
230 megabytes to about 14.

In the future, we may want to make this configurable on a per-remote
basis, or maybe teach fast-import about it in the first place.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-04-10 13:53:33 -05:00
Sverre Rabbelier
9e54e5234a remote-helper: check helper status after import/export
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
2014-04-10 13:53:33 -05:00
Sverre Rabbelier
e1b8d093eb transport-helper: add trailing --
[PT: ensure we add an additional element to the argv array]
2014-04-10 13:53:33 -05:00