I played around with this quite a bit. After trying some more complex
schemes, I found that what worked best is to just sleep 1 millisecond
between iterations. Though it's a very short time, it still completely
eliminates the busy wait condition, without hurting perf.
There code uses SleepEx(1, TRUE) to sleep. See this page for a good
discussion of why that is better than calling SwitchToThread, which
is what was used previously:
http://stackoverflow.com/questions/1383943/switchtothread-vs-sleep1
Note that calling SleepEx(0, TRUE) does *not* solve the busy wait.
The most striking case was when testing on a UNC share with a large repo,
on a single CPU machine. Without the fix, it took 4 minutes 15 seconds,
and with the fix it took just 1:08! I think it's because git-upload-pack's
busy wait was eating the CPU away from the git process that's doing the
real work. With multi-proc, the timing is not much different, but tons of
CPU time is still wasted, which can be a killer on a server that needs to
do bunch of other things.
I also tested the very fast local case, and didn't see any measurable
difference. On a big repo with 4500 files, the upload-pack took about 2
seconds with and without the fix.
Apparently the signal handling is not quite correct in the fsckobject
handling (most likely we rely on a side effect that lets us still output
some message after receiving a signal 13 but in the BuildHive setup this
fails intermittently).
As a consequence, the push in t5504 does fail as expected, but fails to
output anything (unexpected). Since this is good enough for now, let's
handle an empty output as success, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
After importing anything with fast-import, we should always let the
garbage collector do its job, since the objects are written to disk
inefficiently.
This brings down an initial import of http://selenic.com/hg from about
230 megabytes to about 14.
In the future, we may want to make this configurable on a per-remote
basis, or maybe teach fast-import about it in the first place.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Matt Mackall introduced a revs() method to the localrepo class on Wed
Nov 2 13:37:34 2011 in the commit 'localrepo: add revs helper method'.
It is used when constructing a commit in memory.
If we store the set of revs we want to handle under the same name, it
overrides that method, resulting in an unpleasant 'TypeError: 'set'
object is not callable' whenever we want to push (as we are constructing
commits in memory, then).
So let's work around that by renaming our field to 'revs_' and hope that
upstream Mercurial does not introduce a field of that name, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In this case: David Soria Parra <dsp <at> php.net>.
With this last of three Postel patches, remote-hg can import the
Mercurial repository completely.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This change allows invalid input from Mercurial repositories where the
author is recorded as 'Name <email@blah' (missing the closing '>').
With this change, importing http://scelenic.com/hg itself no longer fails
with:
fatal: Missing > in ident string: Benoit Boissinot
<benoit.boissinot@ens-lyon.org <none@none> 1129685868 -0700
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We should handle a missing space before the email part of an author ident
gracefully. See for example the icedtea6 repository.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Facilitate writing import-export based helpers in python by
refactoring common code to a base class.
[jes: rebased to newer upstream Git]
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
This works now that fast-export has been fixed to properly handle
refs that point to a commit that was not exported during the current
fast-export run.
This will be required by fast-export, when no commits were
exported, but the refs should be set, of course.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
When calling `git fast-export master..next` we want to export
refs/heads/next, but not refs/heads/master.
Currently this is not a problem, because negative refs' commits
are never shown. In the next commit this will be changed in order
to make sure that 'master..master' does export master. I.e. even
refs whose commits are not shown are exported.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
This will be required by fast-export, when no commits were
exported, but the refs should be set, of course.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
When calling `git fast-export a..a b` when a and b refer to the same
commit, nothing would be exported, and an incorrect reset line would
be printed for b ('from :0').
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
This seems to be related to the poll-emulation... I see that these things
are guarded by an "#if(_WIN32_WINNT >= 0x0600)" in <winsock2.h>, which
means it's supported for Windows Vista and above... We still support
Windows XP, so it seems someone has set this too high :)
I'd prefer to set this from the Makefile, but this generates a warning in
compat/win32/poll.c about redefining a macro (poll.c wants it to be 0x502,
which is Windows XP with SP2, rather than 0x501 which is normal Windows
XP).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Dynamic linking is generally preferred over static linking, and MSVCRT.dll
has been integral part of Windows for a long time.
This also fixes linker warnings for _malloc and _free in zlib.lib, which
seems to be compiled for MSVCRT.dll already.
The DLL version also exports some of the CRT initialization functions,
which are hidden in the static libcmt.lib (e.g. __wgetmainargs, required by
subsequent Unicode patches).
Signed-off-by: Karsten Blees <blees@dcon.de>
As of "Win32: Thread-safe windows console output", git-log no longer
terminates when the pager process dies. This is due to disabling buffering
for the replaced stdout / stderr streams. Git-log will periodically fflush
stdout (see write_or_die.c/mayble_flush_or_die()), but with no buffering,
this is a NOP that always succeeds (so we never detect the EPIPE error).
Exchange the original console handles with our console thread pipe handles
by accessing the internal MSVCRT data structures directly (which are
exposed via __pioinfo for some reason).
Implement this with minimal assumptions about the actual data structure to
make it work with different (hopefully even future) MSVCRT versions.
While messing with internal data structures is ugly, this patch solves the
problem at the source instead of adding more workarounds. We no longer need
the special winansi_isatty override, and the limitations documented in
"Win32: Thread-safe windows console output" are gone (i.e. fdopen(1/2)
returns unbuffered streams now, and isatty() for duped console file
descriptors works as expected).
Signed-off-by: Karsten Blees <blees@dcon.de>
On Windows XP (not Win7), directories cannot be deleted while a find handle
is open, causing "Deletion of directory '...' failed. Should I try again?"
prompts.
Prior to 19d1e75d "Win32: Unicode file name support (except dirent)",
these failures were silently ignored due to strbuf_free in is_dir_empty
resetting GetLastError to ERROR_SUCCESS.
Close the find handle in is_dir_empty so that git doesn't block deletion
of the directory even after all other applications have released it.
Reported-by: John Chen <john0312@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Fix Windows specific environment settings on startup rather than checking
for special values on every getenv call.
As a side effect, this makes the patched environment (i.e. with properly
initialized TMPDIR and TERM) available to child processes.
Signed-off-by: Karsten Blees <blees@dcon.de>
The Windows environment is sorted, keep it that way for O(log n)
environment access.
Change compareenv to compare only the keys, so that it can be used to
find an entry irrespective of the value.
Change lookupenv to binary seach for an entry. Return one's complement of
the insert position if not found (libc's bsearch returns NULL).
Replace MSVCRT's getenv with a minimal do_getenv based on the binary search
function.
Change do_putenv to insert new entries at the correct position. Simplify
the function by swapping if conditions and using memmove instead of for
loops.
Move qsort from make_environment_block to mingw_startup. We still need to
sort on startup to make sure that the environment is sorted according to
our compareenv function (while Win32 / CreateProcess requires the
environment block to be sorted case-insensitively, CreateProcess currently
doesn't enforce this, and some applications such as bash just don't care).
Note that environment functions are _not_ thread-safe and are not required
to be so by POSIX, the application is responsible for synchronizing access
to the environment. MSVCRT's getenv and our new getenv implementation are
better than that in that they are thread-safe with respect to other getenv
calls as long as the environment is not modified. Git's indiscriminate use
of getenv in background threads currently requires this property.
Signed-off-by: Karsten Blees <blees@dcon.de>
As of d41489a6 "Add more large blob test cases", git's high-level memory
allocation functions (xmalloc, xmemdupz etc.) access the environment to
simulate limited memory in tests (see 'getenv("GIT_ALLOC_LIMIT")' in
memory_limit_check()). These functions should not be used before the
environment is fully initialized (particularly not to initialize the
environment itself).
The current solution ('environ = NULL; ALLOC_GROW(environ...)') only works
because MSVCRT's getenv() reinitializes environ when it is NULL (i.e. it
leaves us with two sets of unusabe (non-UTF-8) and unfreeable (CRT-
allocated) environments).
Add our own set of malloc-or-die functions to be used in startup code.
Also check the result of __wgetmainargs, which may fail if there's not
enough memory for wide-char arguments and environment.
This patch is in preparation of the sorted environment feature, which
completely replaces MSVCRT's getenv() implementation.
Signed-off-by: Karsten Blees <blees@dcon.de>
Move environment array reallocation from do_putenv to the respective
callers. Keep track of the environment size in a global variable. Use
ALLOC_GROW in mingw_putenv to reduce reallocations. Allocate a
sufficiently sized environment array in make_environment_block to prevent
reallocations.
Signed-off-by: Karsten Blees <blees@dcon.de>
When spawning child processes via start_command(), the environment and all
environment entries are copied twice. First by make_augmented_environ /
copy_environ to merge with child_process.env. Then a second time by
make_environment_block to create a sorted environment block string as
required by CreateProcess.
Move the merge logic to make_environment_block so that we only need to copy
the environment once. This changes semantics of the env parameter: it now
expects a delta (such as child_process.env) rather than a full environment.
This is not a problem as the parameter is only used by start_command()
(all other callers previously passed char **environ, and now pass NULL).
The merge logic no longer xstrdup()s the environment strings, so do_putenv
must not free them. Add a parameter to distinguish this from normal putenv.
Remove the now unused make_augmented_environ / free_environ API.
Signed-off-by: Karsten Blees <blees@dcon.de>
Environment helper functions use random naming ('env' prefix or suffix or
both, with or without '_'). Change to POSIX naming scheme ('env' suffix,
no '_').
Env_setenv has more in common with putenv than setenv. Change to do_putenv.
Signed-off-by: Karsten Blees <blees@dcon.de>
Move environment helper functions up so that they can be reused by
mingw_getenv and mingw_spawnve_fd in subsequent patches.
Signed-off-by: Karsten Blees <blees@dcon.de>
The only public spawn function that needs to tweak the environment is
mingw_spawnvpe (called from start_command). Nevertheless, all internal
spawn* functions take an env parameter and needlessly pass the global
char **environ around. Remove the env parameter where it's not needed.
This removes the internal mingw_execve abstraction, which is no longer
needed.
Signed-off-by: Karsten Blees <blees@dcon.de>
The environment on Windows is case-insensitive. Some environment functions
(such as unsetenv and make_augmented_environ) have always used case-
sensitive comparisons instead, while others (getenv, putenv, sorting in
spawn*) were case-insensitive.
Prevent potential inconsistencies by using case-insensitive comparison in
lookup_env (used by putenv, unsetenv and make_augmented_environ).
Signed-off-by: Karsten Blees <blees@dcon.de>
All functions that modify the environment have memory leaks.
Disable gitunsetenv in the Makefile and use env_setenv (via mingw_putenv)
instead (this frees removed environment entries).
Move xstrdup from env_setenv to make_augmented_environ, so that
mingw_putenv no longer copies the environment entries (according to POSIX
[1], "the string [...] shall become part of the environment"). This also
fixes the memory leak in gitsetenv, which expects a POSIX compliant putenv.
[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/putenv.html
Note: This patch depends on taking control of char **environ and having
our own mingw_putenv (both introduced in "Win32: Unicode environment
(incoming)").
Signed-off-by: Karsten Blees <blees@dcon.de>
On Windows, all native APIs are Unicode-based. It is impossible to pass
legacy encoded byte arrays to a process via command line or environment
variables. Disable the tests that try to do so.
In t3901, most tests still work if we don't mess up the repository encoding
in setup, so don't switch to ISO-8859-1 on MinGW.
Note that i18n tests that do their encoding tricks via encoded files (such
as t3900) are not affected by this.
Signed-off-by: Karsten Blees <blees@dcon.de>
Convert environment from UTF-16 to UTF-8 on startup.
No changes to getenv() are necessary, as the MSVCRT version is implemented
on top of char **environ.
However, putenv / _wputenv from MSVCRT no longer work, for two reasons:
1. they try to keep environ, _wenviron and the Win32 process environment
in sync, using the default system encoding instead of UTF-8 to convert
between charsets
2. msysgit and MSVCRT use different allocators, memory allocated in git
cannot be freed by the CRT and vice versa
Implement mingw_putenv using the env_setenv helper function from the
environment merge code.
Note that in case of memory allocation failure, putenv now dies with error
message (due to xrealloc) instead of failing with ENOMEM. As git assumes
setenv / putenv to always succeed, this prevents it from continuing with
incorrect settings.
Signed-off-by: Karsten Blees <blees@dcon.de>
Use the same Unicode conversion functions for file names and console
conversions so that the file system and console output are in sync when
checking out legacy encoded repositories (i.e. with invalid UTF-8 file
names).
Signed-off-by: Karsten Blees <blees@dcon.de>