After importing anything with fast-import, we should always let the
garbage collector do its job, since the objects are written to disk
inefficiently.
This brings down an initial import of http://selenic.com/hg from about
230 megabytes to about 14.
In the future, we may want to make this configurable on a per-remote
basis, or maybe teach fast-import about it in the first place.
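A minimal sketch of the idea, not the code from this patch: run the
import step, then unconditionally invoke `git gc` on the same
repository. The names gc_command and import_then_gc are hypothetical,
and the import step is passed in as a callable.

```python
import subprocess


def gc_command(git_dir):
    """Build the command line that garbage-collects one repository."""
    return ["git", "--git-dir", git_dir, "gc"]


def import_then_gc(git_dir, run_import, run=subprocess.check_call):
    """Run the caller-supplied import step, then repack with `git gc`.

    `run` defaults to subprocess.check_call; it is a parameter only so
    that the sequencing can be exercised without a real repository.
    """
    run_import()
    run(gc_command(git_dir))
```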
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It appears that `pwd` returns the POSIX-style or the DOS-style path
depending on which style the previous `cd` used. To normalize, enforce
`pwd -W` in scripts.
From the original e-mail exchange:
On Thu, Mar 22, 2012 at 11:13:37AM +0100, Sebastian Schuberth wrote:
> On Wed, Mar 21, 2012 at 22:21, Johannes Sixt <j6t@kdbg.org> wrote:
>
> > I build git and run its tests outside the msysgit environment. Does that
> > explain the difference? (And I use CMD.)
>
> It does not make a difference for me. I started cmd.exe at
> c:\msysgit\git\t, added c:\msysgit\bin temporarily to PATH, and ran
> "sh t5526-fetch-submodules.sh -i -v", and the test still fails.
Yes it probably does. Johannes said that he runs the tests outside of
the msysgit folder. That way there is only one path the submodule script
gets reported and not two like '/c/msysgit/git' and '/git'.
That would explain to me why it is passing.
I am afraid that the only solution is to patch msys itself to report the
long absolute path when passing window style paths to cd. Currently when
I do
cd c:/msysgit/git
I will end up in '/git' instead of the long path.
I found that there is a -W option to pwd in msys bash which makes it
always return the real windows path. A normalization in that direction
is unique and thus might be more robust. Have a look at the attached
patch. With this at least t5526 passes. I was not able to run the whole
testsuite properly at the moment. I can have a look at that tomorrow.
What do you think?
Cheers Heiko
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Matt Mackall introduced a revs() method to the localrepo class on Wed
Nov 2 13:37:34 2011 in the commit 'localrepo: add revs helper method'.
It is used when constructing a commit in memory.
If we store the set of revs we want to handle under the same name, it
overrides that method, resulting in an unpleasant 'TypeError: 'set'
object is not callable' whenever we want to push (as we are constructing
commits in memory, then).
So let's work around that by renaming our field to 'revs2' and hope that
upstream Mercurial does not introduce a field of that name, too.
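The clash can be reproduced without Mercurial. The sketch below uses a
stand-in class (FakeRepo is hypothetical, modelled only on the two
method names visible in the traceback) to show how an instance
attribute named 'revs' shadows the method, and how a differently named
field avoids that.

```python
class FakeRepo(object):
    """Stand-in for localrepo: set() relies on the revs() helper method."""

    def revs(self, expr):
        return [expr]

    def set(self, expr):
        for r in self.revs(expr):
            yield r


def shadowed():
    """Store our revs under the method's name; return the resulting error."""
    repo = FakeRepo()
    repo.revs = set([1, 2])  # instance attribute now hides the method
    try:
        list(repo.set("tip"))
    except TypeError as e:
        return str(e)
    return None


def renamed():
    """Store our revs under another name; the method keeps working."""
    repo = FakeRepo()
    repo.revs2 = set([1, 2])
    return list(repo.set("tip"))
```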
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
For now, remote-hg cannot be used for pushing. The respective tests fail
thusly:
    warning: non-alnum alias 'remote:///git/t/trash directory.t5801-remote-hg/empty'
    transaction abort!
    rollback completed
    Traceback (most recent call last):
      File "/git/git-remote-hg", line 101, in <module>
        sys.exit(HgRemoteHelper().main(sys.argv))
      File ".../lib/git_remote_helpers/helper.py", line 197, in main
        more = self.read_one_line(repo)
      File ".../lib/git_remote_helpers/helper.py", line 163, in read_one_line
        func(repo, cmdline)
      File ".../lib/git_remote_helpers/helper.py", line 121, in do_export
        localrepo.importer.do_import(localrepo.gitdir)
      File ".../lib/git_remote_helpers/hg/importer.py", line 27, in do_import
        processor.parseMany(sources, parser.ImportParser, procc)
      File ".../lib/git_remote_helpers/fastimport/processor.py", line 219, in parseMany
        processor.process(parser.parse())
      File ".../lib/git_remote_helpers/fastimport/processor.py", line 76, in process
        handler(self, cmd)
      File ".../lib/git_remote_helpers/hg/hgimport.py", line 262, in commit_handler
        self.idmap[cmd.id] = self.putcommit(modified, modes, copies, cmt)
      File ".../lib/git_remote_helpers/hg/hgimport.py", line 294, in putcommit
        self.repo.commitctx(ctx)
      File "/lib/python/mercurial/localrepo.py", line 1315, in commitctx
        phases.retractboundary(self, targetphase, [n])
      File "/lib/python/mercurial/phases.py", line 201, in retractboundary
        currentroots.intersection_update(ctx.node() for ctx in ctxs)
      File "/lib/python/mercurial/phases.py", line 201, in <genexpr>
        currentroots.intersection_update(ctx.node() for ctx in ctxs)
      File "/lib/python/mercurial/localrepo.py", line 264, in set
        for r in self.revs(expr, *args):
    TypeError: 'set' object is not callable
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In this case: David Soria Parra <dsp <at> php.net>.
With this last of three Postel patches, remote-hg can import the
Mercurial repository completely.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This change allows invalid input from Mercurial repositories where the
author is recorded as 'Name <email@blah' (missing the closing '>').
With this change, importing http://selenic.com/hg itself no longer fails
with:
    fatal: Missing > in ident string: Benoit Boissinot
    <benoit.boissinot@ens-lyon.org <none@none> 1129685868 -0700
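One way such input could be tolerated is sketched below. This is an
illustration under assumptions, not the code from this patch: the
function name close_email is hypothetical, and it only handles the
single case described above (an opening '<' with no '>' after it).

```python
def close_email(ident):
    """Append '>' when the last '<' in ident is never closed."""
    if "<" in ident and ">" not in ident[ident.rindex("<"):]:
        return ident + ">"
    return ident
```

Well-formed idents and idents without any email part pass through
unchanged.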
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We should handle a missing space before the email part of an author ident
gracefully. See for example the icedtea6 repository.
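A possible normalization is sketched below; it is not taken from this
patch. The function name space_before_email and the sample ident in
the usage note are made up for illustration.

```python
import re


def space_before_email(ident):
    """Insert a space before the first '<' that directly follows a name."""
    # The lookbehind requires a non-space character right before '<', so
    # idents that already have the space, or that start with '<', are
    # left untouched.
    return re.sub(r"(?<=\S)<", " <", ident, count=1)
```

For example, 'A U Thor<author@example.com>' becomes
'A U Thor <author@example.com>'.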
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Facilitate writing import-export based helpers in python by
refactoring common code to a base class.
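The shape of such a base class might look like the sketch below. It is
an assumption-laden illustration, not the actual helper.py: only the
names read_one_line and do_export appear elsewhere in this log, while
RemoteHelper, EchoHelper, the command table and the return values are
hypothetical.

```python
class RemoteHelper(object):
    """Shared command dispatch; subclasses supply the VCS-specific parts."""

    def __init__(self):
        # Map protocol command names to handler methods. Looking the
        # methods up on self picks up overrides from the subclass.
        self.commands = {
            "capabilities": self.do_capabilities,
            "import": self.do_import,
            "export": self.do_export,
        }

    def do_capabilities(self, repo, args):
        return ["import", "export", ""]

    def do_import(self, repo, args):
        raise NotImplementedError

    def do_export(self, repo, args):
        raise NotImplementedError

    def read_one_line(self, repo, line):
        """Dispatch one command line and return the handler's reply."""
        cmdline = line.strip().split()
        if not cmdline:
            return None  # a blank line carries no command
        func = self.commands.get(cmdline[0])
        if func is None:
            raise ValueError("unhandled command: %s" % cmdline[0])
        return func(repo, cmdline[1:])


class EchoHelper(RemoteHelper):
    """Toy subclass that only reports what it was asked to do."""

    def do_import(self, repo, args):
        return ["imported %s from %s" % (" ".join(args), repo)]

    def do_export(self, repo, args):
        return ["exported to %s" % repo]
```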
[jes: rebased to newer upstream Git]
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
This works now that fast-export has been fixed to properly handle
refs that point to a commit that was not exported during the current
fast-export run.
This will be required by fast-export, when no commits were
exported, but the refs should be set, of course.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
When calling `git fast-export master..next` we want to export
refs/heads/next, but not refs/heads/master.
Currently this is not a problem, because negative refs' commits
are never shown. In the next commit this will be changed in order
to make sure that 'master..master' does export master. I.e. even
refs whose commits are not shown are exported.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
This will be required by fast-export, when no commits were
exported, but the refs should be set, of course.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
When calling `git fast-export a..a b` when a and b refer to the same
commit, nothing would be exported, and an incorrect reset line would
be printed for b ('from :0').
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
This happens only when the corresponding commits are not exported in
the current fast-export run. This can happen either when the relevant
commit is already marked, or when the commit is explicitly marked
as UNINTERESTING with a negative ref by another argument.
This breaks fast-export based remote helpers.
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
Along the lines of 05d0e3b and f33946d, use cat instead of echo to avoid
line ending mismatches in the test result of "am empty-file does not
infloop" which make the test fail.
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
On Windows XP (not Win7), directories cannot be deleted while a find handle
is open, causing "Deletion of directory '...' failed. Should I try again?"
prompts.
Prior to 19d1e75d "Win32: Unicode file name support (except dirent)",
these failures were silently ignored due to strbuf_free in is_dir_empty
resetting GetLastError to ERROR_SUCCESS.
Close the find handle in is_dir_empty so that git doesn't block deletion
of the directory even after all other applications have released it.
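The same principle in a Python analogue (this is not the C code being
patched): the directory iterator is closed before the function
returns, so no open handle outlives the emptiness check. It assumes
Python 3.6 or later, where os.scandir() can be used as a context
manager.

```python
import os


def is_dir_empty(path):
    """Return True if path has no entries, closing the iterator first."""
    with os.scandir(path) as entries:
        # Leaving the with block closes the underlying directory handle
        # even though we stop after looking at the first entry.
        return next(entries, None) is None
```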
Reported-by: John Chen <john0312@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Fix Windows-specific environment settings on startup rather than checking
for special values on every getenv call.
As a side effect, this makes the patched environment (i.e. with properly
initialized TMPDIR and TERM) available to child processes.
Signed-off-by: Karsten Blees <blees@dcon.de>
The Windows environment is sorted; keep it that way for O(log n)
environment access.
Change compareenv to compare only the keys, so that it can be used to
find an entry irrespective of the value.
Change lookupenv to binary search for an entry. Return one's complement of
the insert position if not found (libc's bsearch returns NULL).
Replace MSVCRT's getenv with a minimal do_getenv based on the binary search
function.
Change do_putenv to insert new entries at the correct position. Simplify
the function by swapping if conditions and using memmove instead of for
loops.
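The lookup and insert scheme described above can be sketched in Python
as follows. This is an illustration, not a translation of the C code:
the function names mirror the ones in this message, but the signatures
are invented, and the key comparison is assumed to be case-insensitive
and the bare 'NAME' form is assumed to remove an entry.

```python
def env_key(entry):
    """Key part of 'NAME=value', folded for case-insensitive comparison."""
    return entry.split("=", 1)[0].upper()


def lookupenv(env, name):
    """Binary search the sorted list env for name.

    Returns the index if found, otherwise ~insert_position, which is
    always negative and can be flipped back with another ~.
    """
    key = name.upper()
    lo, hi = 0, len(env)
    while lo < hi:
        mid = (lo + hi) // 2
        k = env_key(env[mid])
        if k < key:
            lo = mid + 1
        elif k > key:
            hi = mid
        else:
            return mid
    return ~lo


def do_getenv(env, name):
    """Return the value for name, or None if it is not set."""
    pos = lookupenv(env, name)
    return env[pos].split("=", 1)[1] if pos >= 0 else None


def do_putenv(env, entry):
    """'NAME=value' sets or replaces; a bare 'NAME' removes the entry."""
    pos = lookupenv(env, env_key(entry))
    if "=" in entry:
        if pos >= 0:
            env[pos] = entry
        else:
            env.insert(~pos, entry)  # keeps the list sorted
    elif pos >= 0:
        del env[pos]
```

Returning the complemented insert position lets do_putenv reuse the
search result instead of scanning again for the right slot.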
Move qsort from make_environment_block to mingw_startup. We still need to
sort on startup to make sure that the environment is sorted according to
our compareenv function (while Win32 / CreateProcess requires the
environment block to be sorted case-insensitively, CreateProcess currently
doesn't enforce this, and some applications such as bash just don't care).
Note that environment functions are _not_ thread-safe and are not required
to be so by POSIX; the application is responsible for synchronizing access
to the environment. MSVCRT's getenv and our new getenv implementation are
better than that in that they are thread-safe with respect to other getenv
calls as long as the environment is not modified. Git's indiscriminate use
of getenv in background threads currently requires this property.
Signed-off-by: Karsten Blees <blees@dcon.de>
As of d41489a6 "Add more large blob test cases", git's high-level memory
allocation functions (xmalloc, xmemdupz etc.) access the environment to
simulate limited memory in tests (see 'getenv("GIT_ALLOC_LIMIT")' in
memory_limit_check()). These functions should not be used before the
environment is fully initialized (particularly not to initialize the
environment itself).
The current solution ('environ = NULL; ALLOC_GROW(environ...)') only works
because MSVCRT's getenv() reinitializes environ when it is NULL (i.e. it
leaves us with two sets of unusable (non-UTF-8) and unfreeable
(CRT-allocated) environments).
Add our own set of malloc-or-die functions to be used in startup code.
Also check the result of __wgetmainargs, which may fail if there's not
enough memory for wide-char arguments and environment.
This patch is in preparation for the sorted environment feature, which
completely replaces MSVCRT's getenv() implementation.
Signed-off-by: Karsten Blees <blees@dcon.de>
Move environment array reallocation from do_putenv to the respective
callers. Keep track of the environment size in a global variable. Use
ALLOC_GROW in mingw_putenv to reduce reallocations. Allocate a
sufficiently sized environment array in make_environment_block to prevent
reallocations.
Signed-off-by: Karsten Blees <blees@dcon.de>
When spawning child processes via start_command(), the environment and all
environment entries are copied twice. First by make_augmented_environ /
copy_environ to merge with child_process.env. Then a second time by
make_environment_block to create a sorted environment block string as
required by CreateProcess.
Move the merge logic to make_environment_block so that we only need to copy
the environment once. This changes semantics of the env parameter: it now
expects a delta (such as child_process.env) rather than a full environment.
This is not a problem as the parameter is only used by start_command()
(all other callers previously passed char **environ, and now pass NULL).
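A rough Python sketch of merging a delta while building the block
follows. It is not the C implementation: the signature is invented,
names are assumed to compare case-insensitively, and a delta entry
without '=' is assumed to mean "unset".

```python
def make_environment_block(environ, deltaenv=None):
    """Merge deltaenv into environ; return a sorted NUL-separated block."""
    merged = {}
    for entry in environ:
        merged[entry.split("=", 1)[0].upper()] = entry
    for entry in deltaenv or []:
        name = entry.split("=", 1)[0].upper()
        if "=" in entry:
            merged[name] = entry       # set or override
        else:
            merged.pop(name, None)     # bare name: remove if present
    entries = [merged[name] for name in sorted(merged)]
    # Each entry is NUL-terminated and one extra NUL ends the block.
    return "".join(e + "\0" for e in entries) + "\0"
```

Passing None as the delta yields the current environment unchanged,
which matches the callers that previously passed the full environ.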
The merge logic no longer xstrdup()s the environment strings, so do_putenv
must not free them. Add a parameter to distinguish this from normal putenv.
Remove the now unused make_augmented_environ / free_environ API.
Signed-off-by: Karsten Blees <blees@dcon.de>
Environment helper functions use random naming ('env' prefix or suffix or
both, with or without '_'). Change to POSIX naming scheme ('env' suffix,
no '_').
Env_setenv has more in common with putenv than setenv. Change to do_putenv.
Signed-off-by: Karsten Blees <blees@dcon.de>
Move environment helper functions up so that they can be reused by
mingw_getenv and mingw_spawnve_fd in subsequent patches.
Signed-off-by: Karsten Blees <blees@dcon.de>
The only public spawn function that needs to tweak the environment is
mingw_spawnvpe (called from start_command). Nevertheless, all internal
spawn* functions take an env parameter and needlessly pass the global
char **environ around. Remove the env parameter where it's not needed.
This removes the internal mingw_execve abstraction, which is no longer
needed.
Signed-off-by: Karsten Blees <blees@dcon.de>
The environment on Windows is case-insensitive. Some environment functions
(such as unsetenv and make_augmented_environ) have always used case-
sensitive comparisons instead, while others (getenv, putenv, sorting in
spawn*) were case-insensitive.
Prevent potential inconsistencies by using case-insensitive comparison in
lookup_env (used by putenv, unsetenv and make_augmented_environ).
Signed-off-by: Karsten Blees <blees@dcon.de>
All functions that modify the environment have memory leaks.
Disable gitunsetenv in the Makefile and use env_setenv (via mingw_putenv)
instead (this frees removed environment entries).
Move xstrdup from env_setenv to make_augmented_environ, so that
mingw_putenv no longer copies the environment entries (according to POSIX
[1], "the string [...] shall become part of the environment"). This also
fixes the memory leak in gitsetenv, which expects a POSIX compliant putenv.
[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/putenv.html
Note: This patch depends on taking control of char **environ and having
our own mingw_putenv (both introduced in "Win32: Unicode environment
(incoming)").
Signed-off-by: Karsten Blees <blees@dcon.de>
On Windows, all native APIs are Unicode-based. It is impossible to pass
legacy encoded byte arrays to a process via command line or environment
variables. Disable the tests that try to do so.
In t3901, most tests still work if we don't mess up the repository encoding
in setup, so don't switch to ISO-8859-1 on MinGW.
Note that i18n tests that do their encoding tricks via encoded files (such
as t3900) are not affected by this.
Signed-off-by: Karsten Blees <blees@dcon.de>
Convert environment from UTF-16 to UTF-8 on startup.
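The conversion itself can be illustrated in Python. This sketch
assumes the wide environment arrives as a single NUL-separated
UTF-16LE block, which may not match how the startup code actually
receives it; environ_to_utf8 is a hypothetical name.

```python
def environ_to_utf8(wide_block):
    """Split a UTF-16LE environment block into UTF-8 encoded entries."""
    text = wide_block.decode("utf-16-le")
    # Entries are separated by NUL; the empty strings produced by the
    # terminating NULs are dropped.
    return [entry.encode("utf-8") for entry in text.split("\0") if entry]
```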
No changes to getenv() are necessary, as the MSVCRT version is implemented
on top of char **environ.
However, putenv / _wputenv from MSVCRT no longer work, for two reasons:
1. they try to keep environ, _wenviron and the Win32 process environment
   in sync, using the default system encoding instead of UTF-8 to convert
   between charsets
2. msysgit and MSVCRT use different allocators, so memory allocated in git
   cannot be freed by the CRT and vice versa
Implement mingw_putenv using the env_setenv helper function from the
environment merge code.
Note that in case of memory allocation failure, putenv now dies with an
error message (due to xrealloc) instead of failing with ENOMEM. As git
assumes setenv / putenv to always succeed, this prevents it from
continuing with incorrect settings.
Signed-off-by: Karsten Blees <blees@dcon.de>
Use the same Unicode conversion functions for file names and console
conversions so that the file system and console output are in sync when
checking out legacy encoded repositories (i.e. with invalid UTF-8 file
names).
Signed-off-by: Karsten Blees <blees@dcon.de>