Commit Graph

429 Commits

Author SHA1 Message Date
Karsten Blees
c92ed50ddb Win32: support long paths
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).

Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.

Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).

Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).

Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).

Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.

Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.

While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.

Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 12:42:42 -06:00
Johannes Schindelin
64d6324076 Merge remote-tracking branch 'kblees/kb/fscache-v4-tentative-1.8.5' into thicket-1.8.5.2 2014-02-17 12:42:40 -06:00
Johannes Schindelin
3acbf28e15 Merge 'poll-busy-wait' into HEAD 2014-02-17 12:42:39 -06:00
Johannes Schindelin
6f49259561 Merge 'fix-externals' into HEAD 2014-02-17 12:42:39 -06:00
Karsten Blees
d02af3a681 Win32: add a cache below mingw's lstat and dirent implementations
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.

Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.

Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.

The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.

Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 12:42:34 -06:00
Johannes Schindelin
ee2e7acdaf Merge 'unc' into HEAD 2014-02-17 12:42:25 -06:00
Karsten Blees
404c0fd180 Win32: make the lstat implementation pluggable
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.

Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 12:39:25 -06:00
Karsten Blees
eecc236696 Win32: Make the dirent implementation pluggable
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.

Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.

Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 12:39:25 -06:00
Karsten Blees
38007080a9 Win32: dirent.c: Move opendir down
Move opendir down in preparation for the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 12:36:36 -06:00
Karsten Blees
f2c95bfa14 Win32: make FILETIME conversion functions public
Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 12:36:20 -06:00
theoleblond
b3f3e98ec5 Sleep 1 millisecond in poll() to avoid busy wait
I played around with this quite a bit. After trying some more complex
schemes, I found that what worked best is to just sleep 1 millisecond
between iterations. Though it's a very short time, it still completely
eliminates the busy wait condition, without hurting perf.

There code uses SleepEx(1, TRUE) to sleep. See this page for a good
discussion of why that is better than calling SwitchToThread, which
is what was used previously:
http://stackoverflow.com/questions/1383943/switchtothread-vs-sleep1

Note that calling SleepEx(0, TRUE) does *not* solve the busy wait.

The most striking case was when testing on a UNC share with a large repo,
on a single CPU machine. Without the fix, it took 4 minutes 15 seconds,
and with the fix it took just 1:08! I think it's because git-upload-pack's
busy wait was eating the CPU away from the git process that's doing the
real work. With multi-proc, the timing is not much different, but tons of
CPU time is still wasted, which can be a killer on a server that needs to
do bunch of other things.

I also tested the very fast local case, and didn't see any measurable
difference. On a big repo with 4500 files, the upload-pack took about 2
seconds with and without the fix.
2014-02-17 12:08:16 -06:00
Adam Roben
4b5b45019e Make non-.exe externals work again
7ebac8cb94 made launching of .exe
externals work when installed in Unicode paths. But it broke launching
of non-.exe externals, no matter where they were installed. We now
correctly maintain the UTF-8 and UTF-16 paths in tandem in lookup_prog.

This fixes t5526, among others.

Signed-off-by: Adam Roben <adam@roben.org>
2014-02-17 12:08:14 -06:00
Adam Roben
5d88db7b01 Fix launching of externals from Unicode paths
If Git were installed in a path containing non-ASCII characters,
commands such as git-am and git-submodule, which are implemented as
externals, would fail to launch with the following error:

> fatal: 'am' appears to be a git command, but we were not
> able to execute it. Maybe git-am is broken?

This was due to lookup_prog not being Unicode-aware. It was somehow
missed in 2ee5a1a14a.

Note that the only problem in this function was calling
GetFileAttributes instead of GetFileAttributesW. The calls to access()
were fine because access() is a macro which resolves to mingw_access,
which already handles Unicode correctly. But I changed lookup_prog to
use _waccess directly so that we only convert the path to UTF-16 once.

Signed-off-by: Adam Roben <adam@roben.org>
2014-02-17 12:08:13 -06:00
Evgeny Pashkin
5cc7c27b4f Fixed wrong path delimiter in exe finding
On Windows XP3 in git bash
git clone git@github.com:octocat/Spoon-Knife.git
cd Spoon-Knife
git gui
menu Remote\Fetch from\origin
error: cannot spawn git: No such file or directory
error: could not run rev-list

if u run
git fetch --all
it worked normal in git bash or gitgui tools

In second version CreateProcess get 'C:\Git\libexec\git-core/git.exe' in
first version - C:/Git/libexec/git-core/git.exe and not executes (unix
slashes)

after fixing C:\Git\libexec\git-core\git.exe or
C:/Git/libexec/git-core\git.exe it works normal

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-02-17 12:08:13 -06:00
Eric Sunshine
a12e3d4238 Make mingw_offset_1st_component() behave consistently for all paths.
mingw_offset_1st_component() returns "foo" for inputs "/foo" and
"c:/foo", but inconsistently returns "/foo" for UNC input
"/machine/share/foo".  Fix it to return "foo" for all cases.

Reference: http://groups.google.com/group/msysgit/browse_thread/thread/c0af578549b5dda0

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-02-17 11:57:35 -06:00
Cezary Zawadka
3665d26648 Allow using UNC path for git repository
[efl: moved MinGW-specific part to compat/]

[jes: fixed compilation on non-Windows]

Signed-off-by: Cezary Zawadka <czawadka@gmail.com>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-02-17 11:57:35 -06:00
Johannes Schindelin
8b9c4d2186 Add a Windows-specific fallback to getenv("HOME");
This fixes msysGit issue 482 properly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-02-17 11:57:34 -06:00
Johannes Schindelin
44e9841a66 When initializing .git/, record the current setting of core.hideDotFiles
This is on Windows only, of course.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-02-17 11:57:34 -06:00
Erik Faye-Lund
dcec85f2fd core.hidedotfiles: hide '.git' dir by default
At least for cross-platform projects, it makes sense to hide the
files starting with a dot, as this is the behavior on Unix/MacOSX.

However, at least Eclipse has problems interpreting the hidden flag
correctly, so the default is to hide only the .git/ directory.

The config setting core.hideDotFiles therefore supports not only
'true' and 'false', but also 'dotGitOnly'.

[jes: clarified the commit message, made git init respect the setting
by marking the .git/ directory only after reading the config, and added
documentation, and rebased on top of current junio/next]

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-02-17 11:57:34 -06:00
Karsten Blees
59f4b9258b Win32: fix segfault in WriteConsoleW when debugging in gdb
On Windows XP (not Win7), WriteConsoleW and WriteFile seem to raise and
catch SIGSEGV if the lpNumberOfCharsWritten parameter is NULL. This is not
a problem when executed standalone, but gdb stops execution here (unless
disabled via "handle SIGSEGV nostop").

Fix it by passing a dummy variable.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:33 -06:00
Karsten Blees
fcc0628adb Win32: fix broken pipe detection
As of "Win32: Thread-safe windows console output", git-log no longer
terminates when the pager process dies. This is due to disabling buffering
for the replaced stdout / stderr streams. Git-log will periodically fflush
stdout (see write_or_die.c/mayble_flush_or_die()), but with no buffering,
this is a NOP that always succeeds (so we never detect the EPIPE error).

Exchange the original console handles with our console thread pipe handles
by accessing the internal MSVCRT data structures directly (which are
exposed via __pioinfo for some reason).

Implement this with minimal assumptions about the actual data structure to
make it work with different (hopefully even future) MSVCRT versions.

While messing with internal data structures is ugly, this patch solves the
problem at the source instead of adding more workarounds. We no longer need
the special winansi_isatty override, and the limitations documented in
"Win32: Thread-safe windows console output" are gone (i.e. fdopen(1/2)
returns unbuffered streams now, and isatty() for duped console file
descriptors works as expected).

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:33 -06:00
Karsten Blees
2966e8de94 Win32: fix detection of empty directories in is_dir_empty
On Windows XP (not Win7), directories cannot be deleted while a find handle
is open, causing "Deletion of directory '...' failed. Should I try again?"
prompts.

Prior to 19d1e75d "Win32: Unicode file name support (except dirent)",
these failures were silently ignored due to strbuf_free in is_dir_empty
resetting GetLastError to ERROR_SUCCESS.

Close the find handle in is_dir_empty so that git doesn't block deletion
of the directory even after all other applications have released it.

Reported-by: John Chen <john0312@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:32 -06:00
Karsten Blees
335b4f29fa Win32: patch Windows environment on startup
Fix Windows specific environment settings on startup rather than checking
for special values on every getenv call.

As a side effect, this makes the patched environment (i.e. with properly
initialized TMPDIR and TERM) available to child processes.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:32 -06:00
Karsten Blees
3544f7e270 Win32: keep the environment sorted
The Windows environment is sorted, keep it that way for O(log n)
environment access.

Change compareenv to compare only the keys, so that it can be used to
find an entry irrespective of the value.

Change lookupenv to binary seach for an entry. Return one's complement of
the insert position if not found (libc's bsearch returns NULL).

Replace MSVCRT's getenv with a minimal do_getenv based on the binary search
function.

Change do_putenv to insert new entries at the correct position. Simplify
the function by swapping if conditions and using memmove instead of for
loops.

Move qsort from make_environment_block to mingw_startup. We still need to
sort on startup to make sure that the environment is sorted according to
our compareenv function (while Win32 / CreateProcess requires the
environment block to be sorted case-insensitively, CreateProcess currently
doesn't enforce this, and some applications such as bash just don't care).

Note that environment functions are _not_ thread-safe and are not required
to be so by POSIX, the application is responsible for synchronizing access
to the environment. MSVCRT's getenv and our new getenv implementation are
better than that in that they are thread-safe with respect to other getenv
calls as long as the environment is not modified. Git's indiscriminate use
of getenv in background threads currently requires this property.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:32 -06:00
Karsten Blees
64488c9db7 Win32: use low-level memory allocation during initialization
As of d41489a6 "Add more large blob test cases", git's high-level memory
allocation functions (xmalloc, xmemdupz etc.) access the environment to
simulate limited memory in tests (see 'getenv("GIT_ALLOC_LIMIT")' in
memory_limit_check()). These functions should not be used before the
environment is fully initialized (particularly not to initialize the
environment itself).

The current solution ('environ = NULL; ALLOC_GROW(environ...)') only works
because MSVCRT's getenv() reinitializes environ when it is NULL (i.e. it
leaves us with two sets of unusabe (non-UTF-8) and unfreeable (CRT-
allocated) environments).

Add our own set of malloc-or-die functions to be used in startup code.

Also check the result of __wgetmainargs, which may fail if there's not
enough memory for wide-char arguments and environment.

This patch is in preparation of the sorted environment feature, which
completely replaces MSVCRT's getenv() implementation.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:32 -06:00
Karsten Blees
5399393a5c Win32: reduce environment array reallocations
Move environment array reallocation from do_putenv to the respective
callers. Keep track of the environment size in a global variable. Use
ALLOC_GROW in mingw_putenv to reduce reallocations. Allocate a
sufficiently sized environment array in make_environment_block to prevent
reallocations.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:32 -06:00
Karsten Blees
fb770a8f0b Win32: don't copy the environment twice when spawning child processes
When spawning child processes via start_command(), the environment and all
environment entries are copied twice. First by make_augmented_environ /
copy_environ to merge with child_process.env. Then a second time by
make_environment_block to create a sorted environment block string as
required by CreateProcess.

Move the merge logic to make_environment_block so that we only need to copy
the environment once. This changes semantics of the env parameter: it now
expects a delta (such as child_process.env) rather than a full environment.
This is not a problem as the parameter is only used by start_command()
(all other callers previously passed char **environ, and now pass NULL).

The merge logic no longer xstrdup()s the environment strings, so do_putenv
must not free them. Add a parameter to distinguish this from normal putenv.

Remove the now unused make_augmented_environ / free_environ API.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:32 -06:00
Karsten Blees
a750a1cd06 Win32: factor out environment block creation
Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:31 -06:00
Karsten Blees
9a29fd23f2 Win32: unify environment function names
Environment helper functions use random naming ('env' prefix or suffix or
both, with or without '_'). Change to POSIX naming scheme ('env' suffix,
no '_').

Env_setenv has more in common with putenv than setenv. Change to do_putenv.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:31 -06:00
Karsten Blees
7f6952bbc6 Win32: move environment functions
Move environment helper functions up so that they can be reused by
mingw_getenv and mingw_spawnve_fd in subsequent patches.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:31 -06:00
Karsten Blees
78333c8ce2 Win32: simplify internal mingw_spawn* APIs
The only public spawn function that needs to tweak the environment is
mingw_spawnvpe (called from start_command). Nevertheless, all internal
spawn* functions take an env parameter and needlessly pass the global
char **environ around. Remove the env parameter where it's not needed.

This removes the internal mingw_execve abstraction, which is no longer
needed.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:31 -06:00
Johannes Schindelin
312900b1cd Let mingw_execve() return an int
This is in the great tradition of POSIX. Original fix by Olivier Refalo.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-02-17 11:57:31 -06:00
Karsten Blees
311a3b80a1 Win32: unify environment case-sensitivity
The environment on Windows is case-insensitive. Some environment functions
(such as unsetenv and make_augmented_environ) have always used case-
sensitive comparisons instead, while others (getenv, putenv, sorting in
spawn*) were case-insensitive.

Prevent potential inconsistencies by using case-insensitive comparison in
lookup_env (used by putenv, unsetenv and make_augmented_environ).

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:30 -06:00
Karsten Blees
757aa88789 Win32: fix environment memory leaks
All functions that modify the environment have memory leaks.

Disable gitunsetenv in the Makefile and use env_setenv (via mingw_putenv)
instead (this frees removed environment entries).

Move xstrdup from env_setenv to make_augmented_environ, so that
mingw_putenv no longer copies the environment entries (according to POSIX
[1], "the string [...] shall become part of the environment"). This also
fixes the memory leak in gitsetenv, which expects a POSIX compliant putenv.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/putenv.html

Note: This patch depends on taking control of char **environ and having
our own mingw_putenv (both introduced in "Win32: Unicode environment
(incoming)").

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:30 -06:00
Karsten Blees
e24d790804 Win32: Unicode environment (incoming)
Convert environment from UTF-16 to UTF-8 on startup.

No changes to getenv() are necessary, as the MSVCRT version is implemented
on top of char **environ.

However, putenv / _wputenv from MSVCRT no longer work, for two reasons:
1. they try to keep environ, _wenviron and the Win32 process environment
in sync, using the default system encoding instead of UTF-8 to convert
between charsets
2. msysgit and MSVCRT use different allocators, memory allocated in git
cannot be freed by the CRT and vice versa

Implement mingw_putenv using the env_setenv helper function from the
environment merge code.

Note that in case of memory allocation failure, putenv now dies with error
message (due to xrealloc) instead of failing with ENOMEM. As git assumes
setenv / putenv to always succeed, this prevents it from continuing with
incorrect settings.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:30 -06:00
Karsten Blees
b1a20cff10 Win32: Unicode environment (outgoing)
Convert environment from UTF-8 to UTF-16 when creating other processes.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:30 -06:00
Karsten Blees
15a105a647 Win32: sync Unicode console output and file system
Use the same Unicode conversion functions for file names and console
conversions so that the file system and console output are in sync when
checking out legacy encoded repositories (i.e. with invalid UTF-8 file
names).

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:30 -06:00
Karsten Blees
e368171bc7 Win32: Unicode arguments (incoming)
Convert command line arguments from UTF-16 to UTF-8 on startup.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:29 -06:00
Karsten Blees
c3b0718181 Win32: Unicode arguments (outgoing)
Convert command line arguments from UTF-8 to UTF-16 when creating other
processes.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:57:29 -06:00
Karsten Blees
b992ba857b Win32: Unicode file name support (dirent)
Changes opendir/readdir to use Windows Unicode APIs and convert between
UTF-8/UTF-16.

Removes parameter checks that are already covered by xutftowcs_path. This
changes detection of ENAMETOOLONG from MAX_PATH - 2 to MAX_PATH (matching
is_dir_empty in mingw.c). If name + "/*" or the resulting absolute path is
too long, FindFirstFile fails and errno is set through err_win_to_posix.

Increases the size of dirent.d_name to accommodate the full
WIN32_FIND_DATA.cFileName converted to UTF-8 (UTF-16 to UTF-8 conversion
may grow by factor three in the worst case).

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:45:59 -06:00
Karsten Blees
366a438009 Win32: Unicode file name support (except dirent)
Replaces Windows "ANSI" APIs dealing with file- or path names with their
Unicode equivalent, adding UTF-8/UTF-16LE conversion as necessary.

The dirent API (opendir/readdir/closedir) is updated in a separate commit.

Adds trivial wrappers for access, chmod and chdir.

Adds wrapper for mktemp (needed for both mkstemp and mkdtemp).

The simplest way to convert a repository with legacy-encoded (e.g. Cp1252)
file names to UTF-8 ist to checkout with an old msysgit version and
"git add --all & git commit" with the new version.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:45:58 -06:00
Karsten Blees
a70f64639d Win32: add Unicode conversion functions
Add Unicode conversion functions to convert between Windows native UTF-16LE
encoding to UTF-8 and back.

To support repositories with legacy-encoded file names, the UTF-8 to UTF-16
conversion function tries to create valid, unique file names even for
invalid UTF-8 byte sequences, so that these repositories can be checked out
without error.

The current implementation leaves invalid UTF-8 bytes in range 0xa0 - 0xff
as is (producing printable Unicode chars \u00a0 - \u00ff, equivalent to
ISO-8859-1), and converts 0x80 - 0x9f to hex-code (\u0080 - \u009f are
control chars).

The Windows MultiByteToWideChar API was not used as it either drops invalid
UTF-8 sequences (on Win2k/XP; producing non-unique or even empty file
names) or converts them to the replacement char \ufffd (Vista/7; causing
ERROR_INVALID_NAME in subsequent calls to file system APIs).

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:45:58 -06:00
Karsten Blees
955591ab03 Win32: Thread-safe windows console output
Winansi.c has many static variables that are accessed and modified from
the [v][f]printf / fputs functions overridden in the file. This may cause
multi threaded git commands that print to the console to produce corrupted
output or even crash.

Additionally, winansi.c doesn't override all functions that can be used to
print to the console (e.g. fwrite, write, fputc are missing), so that ANSI
escapes don't work properly for some git commands (e.g. git-grep).

Instead of doing ANSI emulation in just a few wrapped functions on top of
the IO API, let's plug into the IO system and take advantage of the thread
safety inherent to the IO system.

Redirect stdout and stderr to a pipe if they point to the console. A
background thread reads from the pipe, handles ANSI escape sequences and
UTF-8 to UTF-16 conversion, then writes to the console.

The pipe-based stdout and stderr replacements must be set to unbuffered, as
MSVCRT doesn't support line buffering and fully buffered streams are
inappropriate for console output.

Due to the byte-oriented pipe, ANSI escape sequences and multi-byte UTF-8
sequences can no longer be expected to arrive in one piece. Replace the
string-based ansi_emulate() with a simple stateful parser (this also fixes
colored diff hunk headers, which were broken as of commit 2efcc977).

Override isatty to return true for the pipes redirecting to the console.

Exec/spawn obtain the original console handle to pass to the next process
via winansi_get_osfhandle().

All other overrides are gone, the default stdio implementations work as
expected with the piped stdout/stderr descriptors.

Global variables are either initialized on startup (single threaded) or
exclusively modified by the background thread. Threads communicate through
the pipe, no further synchronization is necessary.

The background thread is terminated by disonnecting the pipe after flushing
the stdio and pipe buffers. This doesn't work for anonymous pipes (created
via CreatePipe), as DisconnectNamedPipe only works on the read end, which
discards remaining data. Thus we have to setup the pipe manually, with the
write end beeing the server (opened with CreateNamedPipe) and the read end
the client (opened with CreateFile).

Limitations: doesn't track reopened or duped file descriptors, i.e.:
- fdopen(1/2) returns fully buffered streams
- dup(1/2), dup2(1/2) returns normal pipe descriptors (i.e. isatty() =
  false, winansi_get_osfhandle won't return the original console handle)

Currently, only the git-format-patch command uses xfdopen(xdup(1)) (see
"realstdout" in builtin/log.c), but works well with these limitations.

Many thanks to Atsushi Nakagawa <atnak@chejz.com> for suggesting and
reviewing the thread-exit-mechanism.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:45:58 -06:00
Karsten Blees
05dcd4994b MSVC: fix winansi.c compile errors
Some constants (such as LF_FACESIZE) are undefined with -DNOGDI (set in the
Makefile), and CONSOLE_FONT_INFOEX is available in MSVC, but not in MinGW.

Cast FARPROC to PGETCURRENTCONSOLEFONTEX to suppress MSVC compiler warning.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-02-17 11:45:57 -06:00
Karsten Blees
984ec4d981 Enable color output in Windows cmd.exe
Git requires the TERM environment variable to be set for all color*
settings. Simulate the TERM variable if it is not set (default on Windows).

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2014-02-17 11:45:57 -06:00
Karsten Blees
9443c42ef1 Revert "mingw.c: move definition of mingw_getenv down"
This reverts commit 06bc4b796a.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:45:57 -06:00
Karsten Blees
37d6639fa8 Revert "Windows: teach getenv to do a case-sensitive search"
This reverts commit df599e9612.

As of 5e9637c6 "i18n: add infrastructure for translating Git with gettext",
eval_gettext uses MinGW envsubst.exe instead of git-sh-i18n--envsubst.exe
for variable substitution. This breaks git-submodule.sh messages and tests,
as envsubst.exe doesn't support case-sensitive environment lookup (the same
is true for almost everything on Windows, including MSys and Cygwin tools).

30a615ac "Windows/i18n: rename $path to prevent clashes with $PATH" renames
the conflicting variable in git-submodule.sh, so that it works on Windows
(i.e. with case-insensitive environment, regardless of the toolset).

Revert to the documented behaviour of case-insensitive environment on
Windows.

Signed-off-by: Karsten Blees <blees@dcon.de>
2014-02-17 11:45:57 -06:00
Karsten Blees
d0d28b8065 Unicode console: fix font warning on Vista and Win7
GetCurrentConsoleFontEx in an atexit routine doesn't work because git
closes stdout before exit (which also closes the console handle). Check
the console font when we first encounter a non-ascii character and only
schedule the warning message to be printed at exit (warnings go to stderr,
which is not closed by git).

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
2014-02-17 11:45:57 -06:00
Karsten Blees
2df1133668 MinGW: disable CRT command line globbing
MingwRT listens to _CRT_glob to decide if __getmainargs should
perform globbing, with the default being that it should.
Unfortunately, __getmainargs globbing is sub-par; for instance
patterns like "*.c" will only match c-sources in the current
directory.

Disable __getmainargs' command line wildcard expansion, so these
patterns will be left untouched, and handled by Git's superior
built-in globbing instead.

MSVC defaults to no globbing, so we don't need to do anything
in that case.

This fixes t5505 and t7810.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
2014-02-17 11:45:56 -06:00
Karsten Blees
1ff72e7607 Win32: move main macro to a function
The code in the MinGW main macro is getting more and more complex, move to
a separate initialization function for readabiliy and extensibility.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
2014-02-17 11:45:56 -06:00