Git on native Windows exclusively uses UTF-8 for console output (both with
mintty and native console windows). Gettext uses setlocale() to determine
the output encoding for translated text, however, MSVCRT's setlocale()
doesn't support UTF-8. As a result, translated text is encoded in system
encoding (GetAPC()), and non-ASCII chars are mangled in console output.
Use gettext's bind_textdomain_codeset() to force the encoding to UTF-8 on
native Windows.
In this developers' setup, HAVE_LIBCHARSET_H is apparently defined, but
we *really* want to override the locale_charset() here.
Signed-off-by: Karsten Blees <blees@dcon.de>
A '+' is not a valid part of a filename with Windows file systems (it is
reserved because the '+' operator meant file concatenation back in the
DOS days).
Let's just not use it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When Git for Windows was still based on MSys1, we had no gettext, ergo
no msgfmt, either. Therefore, we introduced a small and simple Tcl
script to perform the same task.
However, with MSys2, we no longer need that because we have a proper
msgfmt executable. Plus, the po2msg.sh script somehow manages to hang
when run in parallel.
Two reasons to use real msgfmt.exe instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This happens to shut up t7602 on Windows which would otherwise take
the different line endings for a sign that the merge failed.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Along the lines of 05d0e3b and f33946d, use cat instead of echo to avoid
line ending mismatches in the test result of "am empty-file does not
infloop" which make the test fail.
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
HANDLE is defined internally as a void *, but in many cases it is
actually guaranteed to be a 32-bit integer. In these cases, GCC should
not warn about a cast of a pointer to an integer of a different type
because we know exactly what we are doing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the result of a (a, 0) expression is not used, GCC now finds it
necessary to complain with a warning:
right-hand operand of comma expression has no effect
Let's just pretend to use the 0 value and have a peaceful and quiet life
again.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The environment is modified in most surprising circumstances, and not
all of them are under Git's control. For example, calling
curl_global_init() on Windows will ensure that the CHARSET variable is
set, adding one if necessary.
While the previous commit worked around crashes triggered by such
outside changes of the environment by relaxing the requirement that the
environment be terminated by a NULL pointer, the other assumption made
by `mingw_getenv()` and `mingw_putenv()` is that the environment is
sorted, for efficient lookup via binary search.
Let's make real sure that our environment is intact before querying or
modifying it, and reinitialize our idea of the environment if necessary.
With this commit, before working on the environment we look briefly for
indicators that the environment was modified outside of our control, and
to ensure that it is terminated with a NULL pointer and sorted again in
that case.
Note: the indicators are maybe not sufficient. For example, when a
variable is removed, it will not be noticed. It might also be a problem
if outside changes to the environment result in a modified `environ`
pointer: it is unclear whether such a modification could result in a
problem when `mingw_putenv()` needs to `realloc()` the environment
buffer.
For the moment, however, the current fix works well enough, so let's
only face the potential problems when (and if!) they occur.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Outside of our Windows-specific code, the end of the environment can be
marked also by a pointer to a NUL character, not only by a NULL pointer
as our code assumed so far.
That led to a buffer overrun in `make_environment_block()` when running
`git-remote-https` in `mintty` (because `curl_global_init()` added the
`CHARSET` environment variable *outside* of `mingw_putenv()`, ending the
environment in a pointer to an empty string).
Side note for future debugging on Windows: when running programs in
`mintty`, the standard input/output/error is not connected to a Win32
Console, but instead is pipe()d. That means that even stderr may not be
written completely before a crash, but has to be fflush()ed explicitly.
For example, when debugging crashes, the developer should insert an
`fflush(stderr);` at the end of the `error()` function defined in
usage.c.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It is unlikely that we have an empty environment, ever, but *if* we do,
when `environ_size - 1` is passed to `bsearchenv()` it is misinterpreted
as a real large integer.
To make the code truly defensive, refuse to do anything at all if the
size is negative (which should not happen, of course).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As suggested privately to Brendan Forster by some unnamed person
(suggestion for the future: use the public mailing list, or even the
public GitHub issue tracker, that is a much better place to offer such
suggestions), we should make use of gcc's stack smashing protector that
helps detect stack buffer overruns early.
Rather than using -fstack-protector, we use -fstack-protector-strong
because it strikes a better balance between how much code is affected
and the performance impact.
In a local test (time git log --grep=is -p), best of 5 timings went from
23.009s to 22.997s (i.e. the performance impact was *well* lost in the
noise).
This fixes https://github.com/git-for-windows/git/issues/501
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
MSys2 emulates pseudo terminals via named pipes, and isatty() returns 0
for such file descriptors. Therefore, some interactive functionality (such
as launching a pager, asking if a failed unlink should be repeated etc.)
doesn't work when run in a terminal emulator that uses MSys ptys (such as
mintty).
However, MSys uses special names for its pty pipes ('msys-*-pty*'), which
allows us to distinguish them from normal piped input / output.
On startup, check if stdin / stdout / stderr are connected to such pipes
using the NtQueryObject API from NTDll.dll. If the names match, adjust the
flags in MSVCRT's ioinfo structure accordingly.
Signed-off-by: Karsten Blees <blees@dcon.de>
On Windows >= Vista, not having an application manifest with a
requestedExecutionLevel can cause several kinds of confusing behavior.
The first and more obvious behavior is "Installer Detection", where
Windows sometimes decides (by looking at things like the file name and
even sequences of bytes within the executable) that an executable is an
installer and should run elevated (causing the well-known popup dialog
to appear). In Git's context, subcommands such as "git patch-id" or "git
update-index" fall prey to this behavior.
The second and more confusing behavior is "File Virtualization". It
means that when files are written without having write permission, it
does not fail (as expected), but they are instead redirected to
somewhere else. When the files are read, the original contents are
returned, though, not the ones that were just written somewhere else.
Even more confusing, not all write accesses are redirected; Trying to
write to write-protected .exe files, for example, will fail instead of
redirecting.
In addition to being unwanted behavior, File Virtualization causes
dramatic slowdowns in Git (see for instance
http://code.google.com/p/msysgit/issues/detail?id=320).
There are two ways to prevent those two behaviors: Either you embed an
application manifest within all your executables, or you add an external
manifest (a file with the same name followed by .manifest) to all your
executables. Since Git's builtins are hardlinked (or copied), it is
simpler and more robust to embed a manifest.
A recent enough MSVC compiler should already embed a working internal
manifest, but for MinGW you have to do so by hand.
Very lightly tested on Wine, where like on Windows XP it should not make
any difference.
References:
- New UAC Technologies for Windows Vista
http://msdn.microsoft.com/en-us/library/bb756960.aspx
- Create and Embed an Application Manifest (UAC)
http://msdn.microsoft.com/en-us/library/bb756929.aspx
[js: simplified the embedding dramatically by reusing Git for Windows'
existing Windows resource file, removed the optional (and dubious)
processorArchitecture attribute of the manifest's assemblyIdentity
section.]
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the rename function tries to move a directory it fails if the target
directory exists. It should check if it can delete the (possibly empty)
target directory and then try again to move the directory.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: 마누엘 <nalla@users.noreply.github.com>
With MSys2, the sigset_t type is defined in sys/types.h, therefore we
need to #include said file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This test assumed that there are no two equivalent directory separators.
However, on Windows, the back slash and the forward slash *are*
equivalent. Let's paper over this issue by converting the backward
slashes to forward ones in the test that fails with MSys2 otherwise.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When shell scripts access a $TMP variable containing backslashes, they
will be mistaken for escape characters. Let's not let that happen by
converting them to forward slashes.
This fixes t7800 with MSys2.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
There is a really useful debugging technique developed by Sverre
Rabbelier that inserts "bash &&" somewhere in the test scripts, letting
the developer interact at given points with the current state.
Another debugging technique, used a lot by this here coder, is to run
certain executables via gdb by guarding a "gdb -args" call in
bin-wrappers/git.
Both techniques were disabled by 781f76b1(test-lib: redirect stdin of
tests).
Let's reinstate the ability to run an interactive shell by making the
redirection optional: setting the TEST_NO_REDIRECT environment variable
will skip the redirection.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This test is susceptible to MSys2's posix-to-windows path mangling; Let's
just use POSIX paths throughout and let the tests pass.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
With MSys2, there is actually an implementation of mkfifo available. The
only problem is that it is only emulating named pipes through the MSys2
runtime; The Win32 API has no idea about named pipes, hence the Git
executable cannot access those pipes either.
The symptom is that Git fails with a '<name>: No such file or directory'
because MSys2 emulates named pipes through special-crafted '.lnk' files.
The solution is to tell the test suite explicitly that we cannot use
named pipes when we want to test a MinGW Git.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
MSys2 actually allows to create files or directories whose names contain
tabs, newlines or colors, even if plain Win32 API cannot access them.
As we are using an MSys2 bash to run the tests, such files or
directories are created successfully, but Git has no chance to work with
them because it is a regular Windows program, hence limited by the Win32
API.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
With MSys2, the "TZ" environment variable gets filtered out when calling
non-MSys2 executables. The reason is that Windows' time zone handling is
substantially different from the POSIX one.
However, we just taught Git for Windows' fork of the MSys2 runtime to
pass on the timezone in a different environment variable, MSYS2_TZ for
the sole purpose of Git being able to reinterpret it correctly.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The unsetenv code has no idea to update our environ_size, therefore
causing segmentation faults when environment variables are removed
without compat/mingw.c's knowing (MinGW's optimized lookup would try
to strcmp() against NULL in such a case).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It does not quite work because it produces DOS line endings which the
shell does not like at all.
This lets t3406, t3903, t4254, t7400, t7401, t7406 and t7407 pass.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This just makes things compile, the test suite needs extra tender loving
care in addition to this change. We will address these issues in later
commits.
While at it, also allow building MSys2 Git (i.e. a Git that uses MSys2's
POSIX emulation layer).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git for Windows lags a little bit behind with the 2.x releases because
the Git for Windows developers wanted to let that big jump coincide with
a well-needed overhaul of the context within which Git for Windows is
developed.
To understand why this is such a big issue, it needs to be noted that
many parts of Git are not written in portable C, but instead relies on a
POSIX shell and Perl to be available. Even in the portable C part, there
is the ingrained notion that we can work with UTF-8 encoded strings.
To support the scripts, Git for Windows has to ship a minimal POSIX
emulation layer with Bash and Perl thrown in, and when the Git for
Windows effort started originally, in August 2007, we settled on using
MSys, a stripped down version of Cygwin. Consequently, the original name
of the project was "msysGit" (which, sadly, caused a *lot* of confusion
because few Windows users know about MSys, and even less care).
To compile the C code of Git for Windows, we used MSys, too: it sports
an additional version of the GNU C Compiler that targets the plain
Win32 API (with a few convenience functions thrown in) instead of the
POSIX emulation layer that would require the MSys runtime to run the
compiled programs. That way, Git for Windows' executable(s) are really
just Win32 programs. To discern executables requiring the POSIX
emulation layer from the ones that do not, the latter are called MinGW
(Minimal GNU for Windows) when the former are called MSys executables.
This reliance on MSys incurred challenges, too, though: some of our
changes to the MSys runtime -- necessary to support Git for Windows
better -- were not accepted upstream, the MSys runtime was not developed
further to support e.g. UTF-8 or 64-bit, and apart from not having a
package management system until much later (when mingw-get was
introduced), many packages provided by the MSys/MinGW project lag behind
the respective source code versions, in particular Bash and OpenSSL. For
a while, the Git for Windows project tried to remedy the situation by
trying to build newer versions of those packages, but the situation
quickly became untenable, especially with problems like the Heartbleed
bug requiring swift action and Git for Windows contributors being scarce
-- despite millions of downloads suggesting that there are many users.
After a brief push in the direction of mingw-get, thanks to the
long-time contributor and co-maintainer Sebastian Schuberth, it became
clear that we need to look for alternatives.
Happily, in the meantime the MSys2 project (https://msys2.github.io/)
emerged, and was chosen to be the base of the Git for Windows 2.x. MSys2
is a rewrite of the spirit of MSys: it is again a stripped down version
of Cygwin, but it is actively kept up-to-date with Cygwin's source code.
Thereby, it already supports Unicode internally, and it also offers the
64-bit support that we yearned for since the beginning of the Git for
Windows project.
MSys2 also ported the Pacman package management system from Arch Linux
and uses it heavily. This brings the same convenience to which Linux
users are used to from `yum` or `apt-get`, and to which MacOSX users are
used to from Homebrew or MacPorts, or BSD users from the Ports system,
to MSys2: a simple `pacman -Syu` will update all installed packages to
the newest versions currently available.
MSys2 is also *very* active, typically providing package updates
multiple times per week.
It still required a two-month effort to bring everything to a state
where Git's test suite passes, and a couple of patches await their
submission to the respective upstream projects. Yet without MSys2, the
modernization of Git for Windows would simply not have happened.
This commit lays the ground work to supporting MSys2-based Git builds.
Assisted-by: Waldek Maleska <weakcamel@users.github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
squash! config.mak.uname: support MSys2
Make sure that the nedmalloc patch is applied before.
With MSys2's GCC, `ReadWriteBarrier` is already defined, and FORCEINLINE
unfortunately gets defined incorrectly.
Let's work around both problems, using the MSys2-specific
__MINGW64_VERSION_MAJOR constant to guard them.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The excellent MSys2 project brings a substantially updated MinGW
environment including newer GCC versions and new headers. To support
compiling Git, let's special-case the new MinGW (tell-tale: the
_MINGW64_VERSION_MAJOR constant is defined).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
MSys2's strace facility is very useful for debugging... With this patch,
the bash will be executed through strace if the environment variable
GIT_STRACE_COMMANDS is set, which comes in real handy when investigating
issues in the test suite.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This way the libraries get properly installed into the "site_perl"
directory and we just have to move them out of the "mingw" directory.
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
This is a companion patch to bf9acba (http: treat config options
sslCAPath and sslCAInfo as paths, 2015-11-23).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).
Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.
Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).
Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).
Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).
Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.
Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.
While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.
Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199
[jes: adjusted test number to avoid conflicts, reinstated && chain]
Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).
Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.
Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).
Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).
Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).
Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.
Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.
While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.
Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199
[jes: adjusted test number to avoid conflicts]
Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jes: adusted test number to avoid conflicts, fixed non-portable use of
the 'export' statement]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.
On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.
Signed-off-by: Karsten Blees <blees@dcon.de>
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.
Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.
Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.
The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.
Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.
Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Signed-off-by: Karsten Blees <blees@dcon.de>
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.
Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.
Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).
Signed-off-by: Karsten Blees <blees@dcon.de>