By doing so the only external user of the path handling and functions
is removed, and these functions can be made static.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
We want to make them static later, and we need them in the proper order
for this. There is otherwise no code change.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
gethostbyname() is the first function that calls into the Winsock library,
and it is wrapped only to initialize the library.
socket() is wrapped for two reasons:
- Windows's socket() creates things that are like low-level file handles,
and they must be converted into file descriptors first.
- And these handles cannot be used with plain ReadFile()/WriteFile()
because they are opened for "overlapped IO". We have to use WSASocket()
to create non-overlapped IO sockets.
connect() must be wrapped because Windows's connect() expects the low-level
sockets, not file descriptors, and we must first unwrap the file descriptor
before we can pass it on to Windows's connect().
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
There were some references to the progress indicator, where this
implementation originally appeared.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
This emulation of poll() is far from general. It assumes that the
fds that are to be waited for are connected to pipes. The pipes are
polled in a loop until data becomes available in at least one of them.
If only a single fd is waited for, the implementation actually does
not wait at all, but assumes that a subsequent read() will block.
To avoid burning CPU time, the process yields to other processes with
Sleep(0) before the next round of the poll loop. Note that any sleep
timeout greater than zero reduces the efficiency by an order of magnitude.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
We want to get rid of spawn-pipe.*, but these functions will be needed.
On the way, the function signature was changed to avoid warnings about
incompatible pointer types when the argument is the global variable
"environ".
The wrapper does two things:
- Requests to open /dev/null are redirected to open the nul pseudo file.
- If the file to be opened currently exists as a directory, Windows's open
fails with EACCES; this is changed to EISDIR.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Quite a lot of stuff has accumulated or is now obsolete. The stubs of
POSIX functions that are not implemented or that always fail are now
implemented as inline functions so that they exist in only one place.
Windows's struct stat does not have a st_blocks member. Since we already
have our own stat/lstat/fstat implementations, we can just as well use
a customized struct stat. This patch introduces just that, and also fills
in the st_blocks member. On the other hand, we don't provide members that
are never used.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
getpwuid() is kept as simple as possible so that no errors are generated.
Since the information that it returns is not very useful, users are still
required to set up user.name and user.email configuration.
All uses of getpwuid() are like getpwuid(getuid()), hence, the return value
of getuid() is irrelevant. getpwnam() is only used to resolve '~' and
'~username' paths, which is an idiom not known on Windows, hence, we
don't implement it, either.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Since these functions are MinGW-specific, they belong in this
compatibility file. They will be needed there in a follow-up change that
reimplements execvp().
The MS Windows command line is handled in a peculiar way. This patch
addresses the following:
- Quote empty arguments
- Only escape backslashes and double quotation marks inside quoted arguments
- Quote arguments if they contain asterisks or question marks to prevent expansion
The last one is not documented in the link provided in the patch. I encountered
that behavior on cmd.exe, Windows XP. MSYS not tested.
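The three rules above can be sketched as follows. This is a minimal
illustration, not the code of this patch: the name quote_arg_sketch() is
hypothetical, blanks are assumed to trigger quoting as well, and backslash
runs are doubled only where they precede a double quotation mark (including
the closing one), which is how the Microsoft C runtime is documented to
parse command lines.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a newly allocated copy of arg, wrapped in
 * double quotes when needed. The caller frees the result.
 * Returns NULL if the allocation fails.
 */
char *quote_arg_sketch(const char *arg)
{
    size_t len = strlen(arg);
    /* Empty arguments and arguments with blanks, wildcards or quotes get quoted. */
    int need_quotes = (len == 0) || strpbrk(arg, " \t*?\"") != NULL;
    /* Worst case: every character is emitted twice, plus two quotes and a NUL. */
    char *out = malloc(2 * len + 3);
    char *d = out;
    size_t pending = 0; /* length of the backslash run copied just before */
    size_t i;

    if (!out)
        return NULL;
    if (!need_quotes) {
        memcpy(out, arg, len + 1);
        return out;
    }
    *d++ = '"';
    for (; *arg; arg++) {
        if (*arg == '\\') {
            pending++;
        } else {
            if (*arg == '"') {
                /* Double the preceding backslash run, then escape the quote. */
                for (i = 0; i < pending; i++)
                    *d++ = '\\';
                *d++ = '\\';
            }
            pending = 0;
        }
        *d++ = *arg;
    }
    /* Double a trailing backslash run so it cannot escape the closing quote. */
    for (i = 0; i < pending; i++)
        *d++ = '\\';
    *d++ = '"';
    *d = '\0';
    return out;
}
```

With this sketch an empty string becomes "", *.c becomes "*.c", and a path
such as C:\Program Files\GIT is quoted without touching its backslashes.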
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Windows's vsnprintf() receives the number of characters to write, which
does not include the trailing NUL byte. But our vsnprintf() users pass
the available space, including the trailing NUL.
On Windows, vsnprintf returns -1 if the buffer is too small instead of
the number of characters needed. This wrapper computes the needed buffer
size by trying various sizes with exponential growth. A large growth
factor is used so that only a few trials are required if a really large
result needs to be stored.
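A minimal sketch of the probing strategy described above. All names are
hypothetical, and legacy_vsnprintf() is only a stand-in that mimics the
"-1 if too small" behavior on top of a C99 vsnprintf() so that the sketch
runs anywhere. The starting size of 128, the growth factor of 4, and the
upper bound are assumptions chosen for illustration.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRIAL_SIZE ((size_t)1 << 24) /* arbitrary cap chosen for this sketch */

/*
 * Stand-in for the behavior described above: count excludes the trailing
 * NUL, and -1 is returned when the output does not fit.
 * buf must have room for count + 1 bytes.
 */
static int legacy_vsnprintf(char *buf, size_t count, const char *fmt, va_list ap)
{
    int n = vsnprintf(buf, count + 1, fmt, ap);
    return (n < 0 || (size_t)n > count) ? -1 : n;
}

/* size includes the trailing NUL; returns the length the full output needs. */
int sketch_vsnprintf(char *str, size_t size, const char *fmt, va_list ap)
{
    va_list cp;
    char *tmp = NULL;
    size_t trial = size < 128 ? 128 : size;
    int ret = -1;

    if (size > 0) {
        va_copy(cp, ap); /* ap is consumed once per formatting attempt */
        ret = legacy_vsnprintf(str, size - 1, fmt, cp);
        va_end(cp);
        str[size - 1] = '\0'; /* always terminate the caller's buffer */
    }
    /* Too small: probe scratch buffers that grow by a factor of 4. */
    while (ret == -1 && trial < MAX_TRIAL_SIZE) {
        char *grown;
        trial *= 4;
        grown = realloc(tmp, trial);
        if (!grown)
            break;
        tmp = grown;
        va_copy(cp, ap);
        ret = legacy_vsnprintf(tmp, trial - 1, fmt, cp);
        va_end(cp);
    }
    free(tmp);
    return ret;
}

/* Variadic convenience wrapper around sketch_vsnprintf(). */
int sketch_snprintf(char *str, size_t size, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = sketch_vsnprintf(str, size, fmt, ap);
    va_end(ap);
    return ret;
}
```

The scratch buffer is thrown away; only the length that the complete result
would need is reported, so the caller can allocate and format again.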
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
If an external git command (not a shell script) was invoked with arguments
that contain spaces, these arguments would be split into separate
arguments. They must be quoted. This also affected installations where
$prefix contained a space, as in "C:\Program Files\GIT". Both errors can
be triggered by invoking
git hash-object "a b"
where "a b" is an existing file.
It turns out that GetFileInformationByHandle() succeeds even for pipes
and sockets. Hence, we fall back to Windows's own fstat() implementation
for everything except files. This also takes care of any error codes
(again, except for files - but we don't expect any errors here).
A file name that contains a colon will be rejected by GetFileInformation()
with ERROR_INVALID_NAME. This must be treated as ENOENT. Such a file name
ends up in do_lstat() when the rev:path notation is used (e.g. in
'git show').
GetFileInformationByHandle() fails if it is passed a WinSock handle.
Fortunately, the failure can be distinguished by the error code, and we
can in this case pretend that the fstat() was actually successful.
This is a valid thing to do: Calling fstat() on a descriptor only makes
sense if either the caller needs information on the file (in which case
we would not reach this error condition), or if it wants to distinguish
a socket from a file (which implies that the caller will have to test
st_mode, which happens to be the only field that we can fill in).
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
This gives us a significant speedup when adding, committing and stat'ing files.
Also, since Windows doesn't really handle symlinks, we let stat just use lstat.
We also need to replace fstat, since our implementation and the standard stat()
functions report slightly different timestamps, possibly due to timezones.
We simply report UTC in our implementation, and do our FILETIME to time_t
conversion based on the document at http://support.microsoft.com/kb/167296.
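A minimal sketch of such a FILETIME to time_t conversion. The function and
macro names are hypothetical, and the two 32-bit halves are passed as plain
integers (they correspond to the dwLowDateTime and dwHighDateTime members)
so that the sketch does not depend on the Windows headers. A FILETIME counts
100-nanosecond intervals since 1601-01-01 UTC, so the conversion subtracts
the distance to the Unix epoch and scales down to seconds.

```c
#include <stdint.h>
#include <time.h>

/* 100-nanosecond ticks between 1601-01-01 and 1970-01-01 (both UTC). */
#define EPOCH_DIFF_TICKS 116444736000000000LL
#define TICKS_PER_SECOND 10000000LL

/* Combine the two 32-bit halves of a FILETIME into one 64-bit tick count. */
uint64_t filetime_ticks_sketch(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* Convert a tick count to seconds since the Unix epoch, in UTC. */
time_t ticks_to_time_t_sketch(uint64_t ticks)
{
    /* Signed arithmetic, so that values before 1970 do not wrap around. */
    return (time_t)(((int64_t)ticks - EPOCH_DIFF_TICKS) / TICKS_PER_SECOND);
}
```

No timezone adjustment is applied anywhere, which matches the decision to
report UTC.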
With Moe's repo structure (100K files in 100 dirs, containing 2-4 bytes)
  mkdir bummer && cd bummer; for ((i=0;i<100;i++)); do
    mkdir $i && pushd $i;
      for ((j=0;j<1000;j++)); do echo "$j" >$j; done;
    popd;
  done
We get the following performance boost:
With normal lstat & stat        Custom lstat/fstat
------------------------        ------------------------
Command: git init               Command: git init
------------------------        ------------------------
real    0m 0.047s               real    0m 0.063s
user    0m 0.031s               user    0m 0.015s
sys     0m 0.000s               sys     0m 0.015s
------------------------        ------------------------
Command: git add .              Command: git add .
------------------------        ------------------------
real    0m19.390s               real    0m12.031s       1.6x
user    0m 0.015s               user    0m 0.031s
sys     0m 0.030s               sys     0m 0.000s
------------------------        ------------------------
Command: git commit -a..        Command: git commit -a..
------------------------        ------------------------
real    0m30.812s               real    0m16.875s       1.8x
user    0m 0.015s               user    0m 0.015s
sys     0m 0.000s               sys     0m 0.015s
------------------------        ------------------------
3x Command: git-status          3x Command: git-status
------------------------        ------------------------
real    0m11.860s               real    0m 5.266s       2.2x
user    0m 0.015s               user    0m 0.015s
sys     0m 0.015s               sys     0m 0.015s

real    0m11.703s               real    0m 5.234s
user    0m 0.015s               user    0m 0.015s
sys     0m 0.000s               sys     0m 0.000s

real    0m11.672s               real    0m 5.250s
user    0m 0.031s               user    0m 0.015s
sys     0m 0.000s               sys     0m 0.000s
------------------------        ------------------------
Command: git commit...          Command: git commit...
(single file)                   (single file)
------------------------        ------------------------
real    0m14.234s               real    0m 7.735s       1.8x
user    0m 0.015s               user    0m 0.031s
sys     0m 0.000s               sys     0m 0.000s
Signed-off-by: Marius Storm-Olsen <mstormo_git@storm-olsen.com>
Windows's rename() is based on the MoveFile() API, which fails if the
destination exists. Here we work around the problem by using MoveFileEx().
Furthermore, the posixly correct error is returned if the destination is
a directory.
The implementation is still slightly incomplete, however, because of the
missing error code translation: We assume that the failure is due to
permissions.
lstat() is sometimes invoked with a path that ends in a slash (in
particular, when dealing with subprojects). Windows's stat() does not
accept such paths and fails with ENOENT. In this case we try again
with a cleaned-up path.
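The cleanup step might look roughly like this. It is a sketch with a
hypothetical helper name, not the code of this change. It only produces the
cleaned-up copy; the retry itself would then call the stat implementation
again with that copy.

```c
#include <string.h>

/*
 * Hypothetical helper: copies path into buf without its trailing slashes.
 * Returns the length of the cleaned-up path, or -1 if there is nothing to
 * retry with (no trailing slash, nothing but slashes, or buf too small).
 */
int strip_trailing_slashes_sketch(const char *path, char *buf, size_t bufsize)
{
    size_t len = strlen(path);

    if (len == 0 || path[len - 1] != '/')
        return -1; /* the original failure was not caused by a slash */
    while (len > 0 && path[len - 1] == '/')
        len--;
    if (len == 0 || len >= bufsize)
        return -1;
    memcpy(buf, path, len);
    buf[len] = '\0';
    return (int)len;
}
```

Returning -1 for a path without a trailing slash keeps the original ENOENT
intact, because in that case a second attempt could not help.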
When the argument vector for the interpreter invocation is assembled,
the original arguments were already quoted when necessary, but the
script name was not. If the script lives in a directory whose name
contains spaces, the interpreter would not find the script.
It commonly happens that git-fetch-pack and git-upload-pack hit a deadlock
in the initial commit id exchange, such that both try to write to the
other end, but do not succeed. I have the suspicion that the reason is
that both ends fill the pipe, but don't read.
Increasing the pipe buffer helps, but is this the real cure?
Earlier we would have run all scripts under 'sh', but only changed
the name (argv[0]) to the parsed interpreter.
While we are here, also ignore command line options specified in
the interpreter line; perl's -w is the common case.
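A minimal sketch of how the interpreter name can be extracted from the
first line of a script. The function name is hypothetical, and the caller
is assumed to have already read the beginning of the file into a
NUL-terminated buffer. Everything after the first blank, such as perl's -w,
is dropped, and only the last path component is kept.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: head holds the start of the script. On success the
 * bare interpreter name is stored in out and 0 is returned; -1 is returned
 * if there is no "#!" line or the name does not fit into out.
 */
int parse_interpreter_sketch(const char *head, char *out, size_t outsize)
{
    const char *start, *name;
    size_t len, namelen, i;

    if (head[0] != '#' || head[1] != '!')
        return -1;
    start = head + 2;
    start += strspn(start, " \t");     /* allow "#! /bin/sh" */
    len = strcspn(start, " \t\r\n");   /* stop before options or end of line */

    /* Keep only the part after the last path separator. */
    name = start;
    for (i = 0; i < len; i++)
        if (start[i] == '/' || start[i] == '\\')
            name = start + i + 1;

    namelen = (size_t)(start + len - name);
    if (namelen == 0 || namelen >= outsize)
        return -1;
    memcpy(out, name, namelen);
    out[namelen] = '\0';
    return 0;
}
```

Note that with this sketch a line like "#!/usr/bin/env perl" yields "env",
since the word after the first blank is treated as an option.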
We have been lucky in the past that the missing argument was taken from
whatever random value was on the stack and it was still a somewhat
useful umask, but we should really specify 0600 there.
As it turns out, the things returned by Winsock2's socket() are handles
that can be passed to ReadFile()/WriteFile() - almost. The way this works
is by wrapping those handles into file descriptors with _open_osfhandle().
But it turns out that the sockets created by the plain socket() function
are prepared for "overlapped" I/O, which confuses ReadFile()/WriteFile().
Therefore, a reimplementation is provided that uses WSASocket() to
explicitly ask for non-overlapped sockets.
Special thanks go to H. Peter Anvin, who provided the necessary clues.
strptime() is only used in convert-objects.c, but we do not build that one
(for reasons I do not recall anymore). That tool should be unnecessary
anyway.
Windows's _pipe() by default allocates inheritable pipes. However,
when a spawn happens, we do not have a possibility to close the unused
pipe ends in the child process. This is a problem.
Consider the following situation: The child process only reads from the
pipe and the parent process uses only the writable end; the parent even
closes the writable end. As it happens, the child at this time usually
still waits for input in a read(). But since the child has inherited
an open writable end, it does not get EOF and hangs ad infinitum.
For this reason, pipe handles must not be inheritable. At first
glance, this is curious, since after all it is the purpose of pipes to be
inherited by child processes. However, in all cases where this
inheritance is needed for a file descriptor, it is dup2()'d to stdin or
stdout anyway, and, lo and behold, Windows's dup2() creates inheritable
duplicates.
Windows does not have fork(), but something called spawn() that is roughly
equivalent to a fork()/exec() pair. Factor out the Unix-style code into
a function that works more like spawn(). Now the Windows-style spawn()
can more easily be employed to achieve the same thing that the Unix-style
code does.
When an external git command is invoked, it can be a Bourne shell script.
This patch looks into the command file to see whether it is one.
In this case, the command line is rearranged to invoke the shell
with the proper arguments.
Moreover, the arguments are quoted if necessary because Windows's
spawn functions paste the arguments again into a command line that
is disassembled by the invoked process.
An earlier patch has implemented getcwd() so that it converts the
drive letter into the POSIX-like path that is used internally by
MinGW (C:\foo => /c/foo), but this style does not work outside
the MinGW shell. It is better to just convert the backslashes
to forward slashes and handle the drive letter explicitly.
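A minimal sketch of that approach, with hypothetical helper names. The path
keeps its drive letter (C:\foo becomes C:/foo), and code that needs to know
whether a path is absolute checks for the drive prefix explicitly.

```c
#include <ctype.h>

/* Convert backslashes to forward slashes in place. */
void backslashes_to_slashes_sketch(char *path)
{
    for (; *path; path++)
        if (*path == '\\')
            *path = '/';
}

/* True if path starts with a drive prefix such as "C:". */
int has_drive_prefix_sketch(const char *path)
{
    return isalpha((unsigned char)path[0]) && path[1] == ':';
}

/* True for "/foo" as well as for "C:/foo" and "C:\foo". */
int is_absolute_path_sketch(const char *path)
{
    if (path[0] == '/' || path[0] == '\\')
        return 1;
    return has_drive_prefix_sketch(path) &&
           (path[2] == '/' || path[2] == '\\');
}
```

Whether a drive-relative path such as "C:foo" should count as absolute is a
design decision that this sketch answers with "no".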