This gives us a significant speedup when adding, committing and stat'ing files.
Also, since Windows doesn't really handle symlinks, we let stat just uses lstat.
We also need to replace fstat, since our implementation and the standard stat()
functions report slightly different timestamps, possibly due to timezones.
We simply report UTC in our implementation, and do our FILETIME to time_t
conversion based on the document at http://support.microsoft.com/kb/167296.
With Moe's repo structure (100K files in 100 dirs, containing 2-4 bytes)
mkdir bummer && cd bummer; for ((i=0;i<100;i++)); do
mkdir $i && pushd $i;
for ((j=0;j<1000;j++)); do echo "$j" >$j; done;
popd;
done
We get the following performance boost:
With normal lstat & stat Custom lstat/fstat
------------------------ ------------------------
Command: git init Command: git init
------------------------ ------------------------
real 0m 0.047s real 0m 0.063s
user 0m 0.031s user 0m 0.015s
sys 0m 0.000s sys 0m 0.015s
------------------------ ------------------------
Command: git add . Command: git add .
------------------------ ------------------------
real 0m19.390s real 0m12.031s 1.6x
user 0m 0.015s user 0m 0.031s
sys 0m 0.030s sys 0m 0.000s
------------------------ ------------------------
Command: git commit -a.. Command: git commit -a..
------------------------ ------------------------
real 0m30.812s real 0m16.875s 1.8x
user 0m 0.015s user 0m 0.015s
sys 0m 0.000s sys 0m 0.015s
------------------------ ------------------------
3x Command: git-status 3x Command: git-status
------------------------ ------------------------
real 0m11.860s real 0m 5.266s 2.2x
user 0m 0.015s user 0m 0.015s
sys 0m 0.015s sys 0m 0.015s
real 0m11.703s real 0m 5.234s
user 0m 0.015s user 0m 0.015s
sys 0m 0.000s sys 0m 0.000s
real 0m11.672s real 0m 5.250s
user 0m 0.031s user 0m 0.015s
sys 0m 0.000s sys 0m 0.000s
------------------------ ------------------------
Command: git commit... Command: git commit...
(single file) (single file)
------------------------ ------------------------
real 0m14.234s real 0m 7.735s 1.8x
user 0m 0.015s user 0m 0.031s
sys 0m 0.000s sys 0m 0.000s
Signed-off-by: Marius Storm-Olsen <mstormo_git@storm-olsen.com>
This is a wrapper for mkstemp() that performs error checking and
calls die() when an error occur.
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Windows's rename() is based on the MoveFile() API, which fails if the
destination exists. Here we work around the problem by using MoveFileEx().
Furthermore, the posixly correct error is returned if the destination is
a directory.
The implementation is still slightly incomplete, however, because of the
missing error code translation: We assume that the failure is due to
permissions.
This defines xdup() and xfdopen() in git-compat-util.h to give
us error-catching variants of them without cluttering the code
too much.
Signed-off-by: Jim Meyering <jim@meyering.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function converts the value of h_errno (last error of name
resolver library, see netdb.h).
One of systems which supposedly do not have the function is SunOS.
POSIX does not mandate its presence.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
format-patch: add MIME-Version header when we add content-type.
Fixed link in user-manual
import-tars: Use the "Link indicator" to identify directories
git name-rev writes beyond the end of malloc() with large generations
Documentation/branch: fix small typo in -D example
When using git name-rev on my kernel tree I triggered a malloc()
corruption warning from glibc.
apw@pinky$ git log --pretty=one $N/base.. | git name-rev --stdin
*** glibc detected *** malloc(): memory corruption: 0x0bff8950 ***
Aborted
This comes from name_rev() which is building the name of the revision
in a malloc'd string, which it sprintf's into:
char *new_name = xmalloc(len + 8);
[...]
sprintf(new_name, "%.*s~%d^%d", len, tip_name,
generation, parent_number);
This allocation is only sufficient if the generation number is
less than 5 digits, in my case generation was 13432. In reality
parent_number can be up to 16 so that also can require two digits,
reducing us to 3 digits before we are at risk of blowing this
allocation.
This patch introduces a decimal_length() which approximates the
number of digits a type may hold, it produces the following:
Type Longest Value Len Est
---- ------------- --- ---
unsigned char 256 3 4
unsigned short 65536 5 6
unsigned long 4294967296 10 11
unsigned long long 18446744073709551616 20 21
char -128 4 4
short -32768 6 6
long -2147483648 11 11
long long -9223372036854775808 20 21
This is then used to size the new_name.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* maint:
gitweb: use decode_utf8 directly
posix compatibility for t4200
Document 'opendiff' value in config.txt and git-mergetool.txt
Allow PERL_PATH="/usr/bin/env perl"
Make xstrndup common
diff.c: fix "size cache" handling.
http-fetch: Disable use of curl multi support for libcurl < 7.16.
This also improves the implementation to match how strndup is
specified (by GNU): if the length given is longer than the string,
only the string's length is allocated and copied, but the string need
not be null-terminated if it is at least as long as the given length.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* maint:
Start preparing for 1.5.1.3
Sanitize @to recipients.
git-svn: Ignore usernames in URLs in find_by_url
Document --dry-run and envelope-sender for git-send-email.
Allow users to optionally specify their envelope sender.
Ensure clean addresses are always used with Net::SMTP
Validate @recipients before using it for sendmail and Net::SMTP.
Perform correct quoting of recipient names.
Change the scope of the $cc variable as it is not needed outside of send_message.
Debugging cleanup improvements
Prefix Dry- to the message status to denote dry-runs.
Document --dry-run parameter to send-email.
git-svn: Don't rely on $_ after making a function call
Fix handle leak in write_tree
Actually handle some-low memory conditions
Conflicts:
RelNotes
git-send-email.perl
Tim Ansell discovered his Debian server didn't permit git-daemon to
use as much memory as it needed to handle cloning a project with
a 128 MiB packfile. Filtering the strace provided by Tim of the
rev-list child showed this gem of a sequence:
open("./objects/pack/pack-*.pack", O_RDONLY|O_LARGEFILE <unfinished ...>
<... open resumed> ) = 5
OK, so the packfile is fd 5...
mmap2(NULL, 33554432, PROT_READ, MAP_PRIVATE, 5, 0 <unfinished ...>
<... mmap2 resumed> ) = 0xb5e2d000
and we mapped one 32 MiB window from it at position 0...
mmap2(NULL, 31020635, PROT_READ, MAP_PRIVATE, 5, 0x6000 <unfinished ...>
<... mmap2 resumed> ) = -1 ENOMEM (Cannot allocate memory)
And we asked for another window further into the file. But got
denied. In Tim's case this was due to a resource limit on the
git-daemon process, and its children.
Now where are we in the code? We're down inside use_pack(),
after we have called unuse_one_window() enough times to make sure
we stay within our allowed maximum window size. However since we
didn't unmap the prior window at 0xb5e2d000 we aren't exceeding
the current limit (which probably was just the defaults).
But we're actually down inside xmmap()...
So we release the window we do have (by calling release_pack_memory),
assuming there is some memory pressure...
munmap(0xb5e2d000, 33554432 <unfinished ...>
<... munmap resumed> ) = 0
close(5 <unfinished ...>
<... close resumed> ) = 0
And that was the last window in this packfile. So we closed it.
Way to go us. Our xmmap did not expect release_pack_memory to
close the fd its about to map...
mmap2(NULL, 31020635, PROT_READ, MAP_PRIVATE, 5, 0x6000 <unfinished ...>
<... mmap2 resumed> ) = -1 EBADF (Bad file descriptor)
And so the Linux kernel happily tells us f' off.
write(2, "fatal: ", 7 <unfinished ...>
<... write resumed> ) = 7
write(2, "Out of memory? mmap failed: Bad "..., 47 <unfinished ...>
<... write resumed> ) = 47
And we report the bad file descriptor error, and not the ENOMEM,
and die, claiming we are out of memory. But actually that mmap
should have succeeded, as we had enough memory for that window,
seeing as how we released the prior one.
Originally when I developed the sliding window mmap feature I had
this exact same bug in fast-import, and I dealt with it by handing
in the struct packed_git* we want to open the new window for, as the
caller wasn't prepared to reopen the packfile if unuse_one_window
closed it. The same is true here from xmmap, but the caller doesn't
have the struct packed_git* handy. So I'm using the file descriptor
instead to perform the same test.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* np/pack: (27 commits)
document --index-version for index-pack and pack-objects
pack-objects: remove obsolete comments
pack-objects: better check_object() performances
add get_size_from_delta()
pack-objects: make in_pack_header_size a variable of its own
pack-objects: get rid of create_final_object_list()
pack-objects: get rid of reuse_cached_pack
pack-objects: clean up list sorting
pack-objects: rework check_delta_limit usage
pack-objects: equal objects in size should delta against newer objects
pack-objects: optimize preferred base handling a bit
clean up add_object_entry()
tests for various pack index features
use test-genrandom in tests instead of /dev/urandom
simple random data generator for tests
validate reused pack data with CRC when possible
allow forcing index v2 and 64-bit offset treshold
pack-redundant.c: learn about index v2
show-index.c: learn about index v2
sha1_file.c: learn about index version 2
...
* builtin-grep.c (strtoul_ui): Move function definition from here, to...
* git-compat-util.h (strtoul_ui): ...here, with an added "base" parameter.
* builtin-grep.c (cmd_grep): Update use of strtoul_ui to include base, "10".
* builtin-update-index.c (read_index_info): Diagnose an invalid mode integer
that is out of range or merely larger than INT_MAX.
(cmd_update_index): Use strtoul_ui, not sscanf.
* convert-objects.c (write_subdirectory): Likewise.
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* builtin-grep.c (strtoul_ui): Move function definition from here, to...
* git-compat-util.h (strtoul_ui): ...here, with an added "base" parameter.
* builtin-grep.c (cmd_grep): Update use of strtoul_ui to include base, "10".
* builtin-update-index.c (read_index_info): Diagnose an invalid mode integer
that is out of range or merely larger than INT_MAX.
(cmd_update_index): Use strtoul_ui, not sscanf.
* convert-objects.c (write_subdirectory): Likewise.
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch introduces the MSB() macro to obtain the desired number of
most significant bits from a given variable independently of the variable
type.
It is then used to better implement the overflow test on the OBJ_OFS_DELTA
base offset variable with the property of always working correctly
regardless of the type/size of that variable.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes a problem reported by Randal Schwartz:
>I finally tracked down all the (albeit inconsequential) errors I was getting
>on both OpenBSD and OSX. It's the warn() function in usage.c. There's
>warn(3) in BSD-style distros. It'd take a "great rename" to change it, but if
>someone with better C skills than I have could do that, my linker and I would
>appreciate it.
It was annoying to me, too, when I was doing some mergetool testing on
Mac OS X, so here's a fix.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Randal L. Schwartz" <merlyn@stonehenge.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Even though the file systems on Windows do not support symbolic links,
repositories can still contain information about symbolic links, which is
interrogated at many occasions. Without the correct definitions S_ISLNK
and S_IFLNK, symbolic link information in the repository is not recognized
and treated as corruption.
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.
On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().
In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Not all platforms have declared 'unsigned long' to be a 64 bit value,
but we want to support a 64 bit packfile (or close enough anyway)
in the near future as some projects are getting large enough that
their packed size exceeds 4 GiB.
By using off_t, the POSIX type that is declared to mean an offset
within a file, we support whatever maximum file size the underlying
operating system will handle. For most modern systems this is up
around 2^60 or higher.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* maint:
Unset NO_C99_FORMAT on Cygwin.
Fix a "pointer type missmatch" warning.
Fix some "comparison is always true/false" warnings.
Fix an "implicit function definition" warning.
Fix a "label defined but unreferenced" warning.
Document the config variable format.suffix
git-merge: fail correctly when we cannot fast forward.
builtin-archive: use RUN_SETUP
Fix git-gc usage note
The function at issue being initgroups() from the <grp.h> header
file. On Cygwin, setting _XOPEN_SOURCE suppresses the definition
of initgroups(), which causes the warning while compiling daemon.c.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Glibc uses the same size for int and off_t by default.
In order to support large pack sizes (>2GB) we force Glibc to a 64bit off_t.
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Solaris 8 was pre-c99, and they weren't willing to commit to
the strtoumax definition according to /usr/include/inttypes.h.
This adds NO_STRTOUMAX and NO_STRTOULL for ancient systems.
If NO_STRTOUMAX is defined, the routine in compat/strtoumax.c
will be used instead. That routine passes its arguments to
strtoull unless NO_STRTOULL is defined. If NO_STRTOULL, then
the routine uses strtoul (unsigned long).
Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Acked-by: Shawn O Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
As it turns out, the things returned by Winsock2's socket() are handles
that can be passed to ReadFile()/WriteFile() - almost. The way this works
is by wrapping those handles into file descriptors with _open_osfhandle().
But it turns out that the sockets created by the plain socket() function
are prepared for "overlapped" I/O, which confuses ReadFile()/WriteFile().
Therefore, a reimplementation is provided that uses WSASocket() to
explicitly asks for non-overlapped sockets.
Special thanks got to H. Peter Anvin, who provided the necessary clues.