Changes opendir/readdir to use Windows Unicode APIs and convert between
UTF-8/UTF-16.
Removes parameter checks that are already covered by xutftowcs_path. This
changes detection of ENAMETOOLONG from MAX_PATH - 2 to MAX_PATH (matching
is_dir_empty in mingw.c). If name + "/*" or the resulting absolute path is
too long, FindFirstFile fails and errno is set through err_win_to_posix.
Increases the size of dirent.d_name to accommodate the full
WIN32_FIND_DATA.cFileName converted to UTF-8 (UTF-16 to UTF-8 conversion
may grow by factor three in the worst case).
Signed-off-by: Karsten Blees <blees@dcon.de>
If Git were installed in a path containing non-ASCII characters,
commands such as git-am and git-submodule, which are implemented as
externals, would fail to launch with the following error:
> fatal: 'am' appears to be a git command, but we were not
> able to execute it. Maybe git-am is broken?
This was due to lookup_prog not being Unicode-aware. It was somehow
missed in 2ee5a1a14a.
Note that the only problem in this function was calling
GetFileAttributes instead of GetFileAttributesW. The calls to access()
were fine because access() is a macro which resolves to mingw_access,
which already handles Unicode correctly. But I changed lookup_prog to
use _waccess directly so that we only convert the path to UTF-16 once.
Signed-off-by: Adam Roben <adam@roben.org>
Replaces Windows "ANSI" APIs dealing with file- or path names with their
Unicode equivalent, adding UTF-8/UTF-16LE conversion as necessary.
The dirent API (opendir/readdir/closedir) is updated in a separate commit.
Adds trivial wrappers for access, chmod and chdir.
Adds wrapper for mktemp (needed for both mkstemp and mkdtemp).
The simplest way to convert a repository with legacy-encoded (e.g. Cp1252)
file names to UTF-8 ist to checkout with an old msysgit version and
"git add --all & git commit" with the new version.
Signed-off-by: Karsten Blees <blees@dcon.de>
Add Unicode conversion functions to convert between Windows native UTF-16LE
encoding to UTF-8 and back.
To support repositories with legacy-encoded file names, the UTF-8 to UTF-16
conversion function tries to create valid, unique file names even for
invalid UTF-8 byte sequences, so that these repositories can be checked out
without error.
The current implementation leaves invalid UTF-8 bytes in range 0xa0 - 0xff
as is (producing printable Unicode chars \u00a0 - \u00ff, equivalent to
ISO-8859-1), and converts 0x80 - 0x9f to hex-code (\u0080 - \u009f are
control chars).
The Windows MultiByteToWideChar API was not used as it either drops invalid
UTF-8 sequences (on Win2k/XP; producing non-unique or even empty file
names) or converts them to the replacement char \ufffd (Vista/7; causing
ERROR_INVALID_NAME in subsequent calls to file system APIs).
Signed-off-by: Karsten Blees <blees@dcon.de>
Winansi.c has many static variables that are accessed and modified from
the [v][f]printf / fputs functions overridden in the file. This may cause
multi threaded git commands that print to the console to produce corrupted
output or even crash.
Additionally, winansi.c doesn't override all functions that can be used to
print to the console (e.g. fwrite, write, fputc are missing), so that ANSI
escapes don't work properly for some git commands (e.g. git-grep).
Instead of doing ANSI emulation in just a few wrapped functions on top of
the IO API, let's plug into the IO system and take advantage of the thread
safety inherent to the IO system.
Redirect stdout and stderr to a pipe if they point to the console. A
background thread reads from the pipe, handles ANSI escape sequences and
UTF-8 to UTF-16 conversion, then writes to the console.
The pipe-based stdout and stderr replacements must be set to unbuffered, as
MSVCRT doesn't support line buffering and fully buffered streams are
inappropriate for console output.
Due to the byte-oriented pipe, ANSI escape sequences and multi-byte UTF-8
sequences can no longer be expected to arrive in one piece. Replace the
string-based ansi_emulate() with a simple stateful parser (this also fixes
colored diff hunk headers, which were broken as of commit 2efcc977).
Override isatty to return true for the pipes redirecting to the console.
Exec/spawn obtain the original console handle to pass to the next process
via winansi_get_osfhandle().
All other overrides are gone, the default stdio implementations work as
expected with the piped stdout/stderr descriptors.
Global variables are either initialized on startup (single threaded) or
exclusively modified by the background thread. Threads communicate through
the pipe, no further synchronization is necessary.
The background thread is terminated by disonnecting the pipe after flushing
the stdio and pipe buffers. This doesn't work for anonymous pipes (created
via CreatePipe), as DisconnectNamedPipe only works on the read end, which
discards remaining data. Thus we have to setup the pipe manually, with the
write end beeing the server (opened with CreateNamedPipe) and the read end
the client (opened with CreateFile).
Limitations: doesn't track reopened or duped file descriptors, i.e.:
- fdopen(1/2) returns fully buffered streams
- dup(1/2), dup2(1/2) returns normal pipe descriptors (i.e. isatty() =
false, winansi_get_osfhandle won't return the original console handle)
Currently, only the git-format-patch command uses xfdopen(xdup(1)) (see
"realstdout" in builtin/log.c), but works well with these limitations.
Many thanks to Atsushi Nakagawa <atnak@chejz.com> for suggesting and
reviewing the thread-exit-mechanism.
Signed-off-by: Karsten Blees <blees@dcon.de>
Some constants (such as LF_FACESIZE) are undefined with -DNOGDI (set in the
Makefile), and CONSOLE_FONT_INFOEX is available in MSVC, but not in MinGW.
Cast FARPROC to PGETCURRENTCONSOLEFONTEX to suppress MSVC compiler warning.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This reverts commit df599e9612.
As of 5e9637c6 "i18n: add infrastructure for translating Git with gettext",
eval_gettext uses MinGW envsubst.exe instead of git-sh-i18n--envsubst.exe
for variable substitution. This breaks git-submodule.sh messages and tests,
as envsubst.exe doesn't support case-sensitive environment lookup (the same
is true for almost everything on Windows, including MSys and Cygwin tools).
30a615ac "Windows/i18n: rename $path to prevent clashes with $PATH" renames
the conflicting variable in git-submodule.sh, so that it works on Windows
(i.e. with case-insensitive environment, regardless of the toolset).
Revert to the documented behaviour of case-insensitive environment on
Windows.
Signed-off-by: Karsten Blees <blees@dcon.de>
GetCurrentConsoleFontEx in an atexit routine doesn't work because git
closes stdout before exit (which also closes the console handle). Check
the console font when we first encounter a non-ascii character and only
schedule the warning message to be printed at exit (warnings go to stderr,
which is not closed by git).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
MingwRT listens to _CRT_glob to decide if __getmainargs should
perform globbing, with the default being that it should.
Unfortunately, __getmainargs globbing is sub-par; for instance
patterns like "*.c" will only match c-sources in the current
directory.
Disable __getmainargs' command line wildcard expansion, so these
patterns will be left untouched, and handled by Git's superior
built-in globbing instead.
MSVC defaults to no globbing, so we don't need to do anything
in that case.
This fixes t5505 and t7810.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
The code in the MinGW main macro is getting more and more complex, move to
a separate initialization function for readabiliy and extensibility.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Improve the dirent implementation by removing the relics that were once
necessary to plug into the now unused MinGW runtime, in preparation for
Unicode file name support.
Move FindFirstFile to opendir, and FindClose to closedir, with the
following implications:
- DIR.dd_name is no longer needed
- chdir(one); opendir(relative); chdir(two); readdir() works as expected
(i.e. lists one/relative instead of two/relative)
- DIR.dd_handle is a valid handle for the entire lifetime of the DIR struct
- thus, all checks for dd_handle == INVALID_HANDLE_VALUE and dd_handle == 0
have been removed
- the special case that the directory has been fully read (which was
previously explicitly tracked with dd_handle == INVALID_HANDLE_VALUE &&
dd_stat != 0) is now handled implicitly by the FindNextFile error
handling code (if a client continues to call readdir after receiving
NULL, FindNextFile will continue to fail with ERROR_NO_MORE_FILES, to
the same effect)
- extracting dirent data from WIN32_FIND_DATA is needed in two places, so
moved to its own method
- GetFileAttributes is no longer needed. The same information can be
obtained from the FindFirstFile error code, which is ERROR_DIRECTORY if
the name is NOT a directory (-> ENOTDIR), otherwise we can use
err_win_to_posix (e.g. ERROR_PATH_NOT_FOUND -> ENOENT). The
ERROR_DIRECTORY case could be fixed in err_win_to_posix, but this
probably breaks other functionality.
Removes the ERROR_NO_MORE_FILES check after FindFirstFile (this was
fortunately a NOOP (searching for '*' always finds '.' and '..'),
otherwise the subsequent code would have copied data from an uninitialized
buffer).
Changes malloc to git support function xmalloc, so opendir will die() if
out of memory, rather than failing with ENOMEM and letting git work on
incomplete directory listings (error handling in dir.c is quite sparse).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Git-compat-util.h is two dirs up, and already includes <dirent.h> (which
is the same as "dirent.h" due to -Icompat/win32 in the Makefile).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
FILENAME_MAX and MAX_PATH are both 260 on Windows, however, MAX_PATH is
used throughout the other Win32 code in Git, and also defines the length
of file name buffers in the Win32 API (e.g. WIN32_FIND_DATA.cFileName,
from which we're copying the dirent data).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Remove the union around dirent.d_type and the unused dirent.d_reclen member
(which was necessary for compatibility with the MinGW dirent runtime, which
is no longer used).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
There are no proper inodes on Windows, so remove dirent.d_ino and #define
NO_D_INO_IN_DIRENT in the Makefile (this skips e.g. an ineffective qsort in
fsck.c).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Unicode console output won't display correctly with default settings
because the default console font ("Terminal") only supports the system's
OEM charset. Unfortunately, this is a user specific setting, so it cannot
be easily fixed by e.g. some registry tricks in the setup program.
This change prints a warning on exit if console output contained non-ascii
characters and the console font is supposedly not a TrueType font (which
usually have decent Unicode support).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
GetStdHandle(STD_OUTPUT_HANDLE) doesn't work for stderr if stdout is
redirected. Use _get_osfhandle of the FILE* instead.
_isatty() is true for all character devices (including parallel and serial
ports). Check return value of GetConsoleScreenBufferInfo instead to
reliably detect console handles (also don't initialize internal state from
an uninitialized CONSOLE_SCREEN_BUFFER_INFO structure if the function
fails).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
WriteConsoleW seems to be the only way to reliably print unicode to the
console (without weird code page conversions).
Also redirects vfprintf to the winansi.c version.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git requires the TERM environment variable to be set for all color*
settings. Simulate the TERM variable if it is not set (default on Windows).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows XP3 in git bash
git clone git@github.com:octocat/Spoon-Knife.git
cd Spoon-Knife
git gui
menu Remote\Fetch from\origin
error: cannot spawn git: No such file or directory
error: could not run rev-list
if u run
git fetch --all
it worked normal in git bash or gitgui tools
In second version CreateProcess get 'C:\Git\libexec\git-core/git.exe' in
first version - C:/Git/libexec/git-core/git.exe and not executes (unix
slashes)
after fixing C:\Git\libexec\git-core\git.exe or
C:/Git/libexec/git-core\git.exe it works normal
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As reported in msysGit issue 450 the recent change to set the windows
hidden attribute on the .git directory is being applied to bare git
directories. This patch excludes bare repositories.
Tested-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
At least for cross-platform projects, it makes sense to hide the
files starting with a dot, as this is the behavior on Unix/MacOSX.
However, at least Eclipse has problems interpreting the hidden flag
correctly, so the default is to hide only the .git/ directory.
The config setting core.hideDotFiles therefore supports not only
'true' and 'false', but also 'dotGitOnly'.
[jes: clarified the commit message, made git init respect the setting
by marking the .git/ directory only after reading the config, and added
documentation, and rebased on top of current junio/next]
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This way it just got added to gnulib too the other day.
Signed-off-by: Joachim Schmitz <jojo@schmitz-digital.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If poll() is used as a milli-second sleep, like in help.c, by passing a NULL
in the 1st and a 0 in the 2nd arg, it exits with EFAULT.
As per Paolo Bonzini, the original author, this is a bug and to be fixed
Like in this commit, which is not to exit if the 2nd arg is 0. It got fixed
In gnulib in the same manner the other day.
Signed-off-by: Joachim Schmitz <jojo@schmitz-digital.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order for non-win32 platforms to be able to use poll.c, #ifdef the
inclusion of two header files properly
Signed-off-by: Joachim Schmitz <jojo@schmitz-digital.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
move poll.[ch] out of compat/win32/ into compat/poll/ and adjust
Makefile with the changed paths. Adding comments to Makefile about
how/when to enable it and add logic for this
Signed-off-by: Joachim Schmitz <jojo@schmitz-digital.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some mkdir(2) implementations do not want to see trailing slash in
its parameter.
* js/compat-mkdir:
compat: some mkdir() do not like a slash at the end
Introduce a compatibility helper for platforms with such a mkdir().
Signed-off-by: Joachim Schmitz <jojo@schmitz-digital.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As suggested by Linus, this function is not checking UTF-8-ness of the
string; it only is seeing if it is pure US-ASCII.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Remove extraneous parentheses and braces;
- Remove redundant NUL-termination before strcpy();
- Check result of unlink when probing for decomposed file names;
- Adjust for the coding style by adding missing whitespaces;
- Move storage class "static" at the beginning of the decl.
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The recent update to terminal I/O interface to get passwords &c
interactively didn't quite work on Solaris.
* bw/maint-1.7.9-solaris-getpass:
Enable HAVE_DEV_TTY for Solaris
terminal: seek when switching between reading and writing
When a stdio stream is opened in update mode (e.g., "w+"),
the C standard forbids switching between reading or writing
without an intervening positioning function. Many
implementations are lenient about this, but Solaris libc
will flush the recently-read contents to the output buffer.
In this instance, that meant writing the non-echoed password
that the user just typed to the terminal.
Fix it by inserting a no-op fseek between the read and
write.
The opposite direction (writing followed by reading) is also
disallowed, but our intervening fflush is an acceptable
positioning function for that alternative.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teaches git to normalize pathnames read from readdir(3) and all
arguments from the command line into precomposed UTF-8 (assuming
that they come as decomposed UTF-8) to work around issues on Mac OS.
I think there still are other places that need conversion
(e.g. paths that are read from stdin for some commands), but this
should be a good first step in the right direction.
* tb/sanitize-decomposed-utf-8-pathname:
git on Mac OS and precomposed unicode
Mac OS X mangles file names containing unicode on file systems HFS+,
VFAT or SAMBA. When a file using unicode code points outside ASCII
is created on a HFS+ drive, the file name is converted into
decomposed unicode and written to disk. No conversion is done if
the file name is already decomposed unicode.
Calling open("\xc3\x84", ...) with a precomposed "Ä" yields the same
result as open("\x41\xcc\x88",...) with a decomposed "Ä".
As a consequence, readdir() returns the file names in decomposed
unicode, even if the user expects precomposed unicode. Unlike on
HFS+, Mac OS X stores files on a VFAT drive (e.g. an USB drive) in
precomposed unicode, but readdir() still returns file names in
decomposed unicode. When a git repository is stored on a network
share using SAMBA, file names are send over the wire and written to
disk on the remote system in precomposed unicode, but Mac OS X
readdir() returns decomposed unicode to be compatible with its
behaviour on HFS+ and VFAT.
The unicode decomposition causes many problems:
- The names "git add" and other commands get from the end user may
often be precomposed form (the decomposed form is not easily input
from the keyboard), but when the commands read from the filesystem
to see what it is going to update the index with already is on the
filesystem, readdir() will give decomposed form, which is different.
- Similarly "git log", "git mv" and all other commands that need to
compare pathnames found on the command line (often but not always
precomposed form; a command line input resulting from globbing may
be in decomposed) with pathnames found in the tree objects (should
be precomposed form to be compatible with other systems and for
consistency in general).
- The same for names stored in the index, which should be
precomposed, that may need to be compared with the names read from
readdir().
NFS mounted from Linux is fully transparent and does not suffer from
the above.
As Mac OS X treats precomposed and decomposed file names as equal,
we can
- wrap readdir() on Mac OS X to return the precomposed form, and
- normalize decomposed form given from the command line also to the
precomposed form,
to ensure that all pathnames used in Git are always in the
precomposed form. This behaviour can be requested by setting
"core.precomposedunicode" configuration variable to true.
The code in compat/precomposed_utf8.c implements basically 4 new
functions: precomposed_utf8_opendir(), precomposed_utf8_readdir(),
precomposed_utf8_closedir() and precompose_argv(). The first three
are to wrap opendir(3), readdir(3), and closedir(3) functions.
The argv[] conversion allows to use the TAB filename completion done
by the shell on command line. It tolerates other tools which use
readdir() to feed decomposed file names into git.
When creating a new git repository with "git init" or "git clone",
"core.precomposedunicode" will be set "false".
The user needs to activate this feature manually. She typically
sets core.precomposedunicode to "true" on HFS and VFAT, or file
systems mounted via SAMBA.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enables threading in index-pack to resolve base data in parallel.
By Nguyễn Thái Ngọc Duy (3) and Ramsay Jones (1)
* nd/threaded-index-pack:
index-pack: disable threading if NO_PREAD is defined
index-pack: support multithreaded delta resolving
index-pack: restructure pack processing into three main functions
compat/win32/pthread.h: Add an pthread_key_delete() implementation
The error handling routines add a newline. Remove
the duplicate ones in error messages.
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When PATH contains an unreadable directory, alias expansion code did not
kick in, and failed with an error that said "git-subcmd" was not found.
By Jeff King (1) and Ramsay Jones (1)
* jk/run-command-eacces:
run-command: treat inaccessible directories as ENOENT
compat/mingw.[ch]: Change return type of exec functions to int
The current t9300-fast-import.sh test number 62 ("L: nested tree
copy does not corrupt deltas") was introduced in commit 9a0edb79
("fast-import: add a test for tree delta base corruption",
15-08-2011). A fix for the demonstrated problem was introduced
by commit 8fb3ad76 ("fast-import: prevent producing bad delta",
15-08-2011). However, this fix didn't work on MinGW and so this
test has always failed on MinGW.
Part of the solution in commit 8fb3ad76 was to add an NO_DELTA
preprocessor constant which was defined as follows:
+/*
+ * We abuse the setuid bit on directories to mean "do not delta".
+ */
+#define NO_DELTA S_ISUID
+
Unfortunately, the S_ISUID constant on MinGW is defined as zero.
In order to fix the problem, we simply alter the definition of
S_ISUID in the mingw header file to a more appropriate value.
Also, we take the opportunity to similarly define S_ISGID and
S_ISVTX.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>