Commit Graph

70763 Commits

Author SHA1 Message Date
Jeff Hostetler
5e493369e1 hashmap: allow memihash computation to be continued
Add variant of memihash() to allow the hash computation to
be continued.  There are times when we compute the hash on
a full path and then the hash on just the path to the parent
directory.  This can be expensive on large repositories.

With this, we can hash the parent directory first. And then
continue the computation to include the "/filename".

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-26 16:59:10 +02:00
Jeff Hostetler
17d5dfa668 name-hash: specify initial size for istate.dir_hash table
Specify an initial size for the istate.dir_hash HashMap matching
the size of the istate.name_hash.

Previously hashmap_init() was given 0, causing a 64 bucket
hashmap to be created.  When working with very large
repositories, this would cause numerous rehash() calls to
realloc and rebalance the hashmap. This is especially true
when the worktree is deep, with many directories containing
a few files.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-26 16:59:10 +02:00
Johannes Schindelin
b2185ebf1a Start the merging-rebase to upstream/maint
This commit starts the rebase of b213911f46 to 49800c9407

This merging rebase was started not because upstream's `maint` branch
advanced (it did not), but to replace a couple of patches with newer
iterations, as sent to the upstream Git project independently. In
particular, those are:

- The patch series merged by fe73baf344
  (jeffhostetler/jeffhostetler/quick_add_index_entry) was replaced by
  the most recent iteration of 'jh/add-index-entry-optim' (b986df5c35)

- The Pull Request jeffhostetler/jeffhostetler/string_list_realloc
  merged via c775bdd100 was replaced by the latest iteration of
  'jh/string-list-micro-optim' (950a234cbd)

- The jh/memihash-opt patches merged by 2862058e9d were replaced by the
  newest iteration (41b3eb4a6b)

- The bug fix where difftool used a buffer after freeing it
  (d33e487771) was replaced by the one that made it into upstream
  (882add136f)

- The patch in the `coverity` series that tried to fix a resource leak
  in git-am (a5208164e2) was replaced by a better patch submitted by
  Ren_ Scharfe (ac8ce18d89)

- The patch in the `coverity` series that tried to fix a resource leak
  in the `handle_ssh_variant()` function (f07be76f51) has been dropped,
  as a different patch had been accepted into `pu` already

- The rebase-i-extra patches (e1be548aaf) were replaced by the latest
  iteration (1d3d10b9e15)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 16:55:54 +02:00
Johannes Schindelin
bf1cd086d6 Merge branch 'coverity'
Coverity is a tool to analyze code statically, trying to find common (or
not so common) problems before they occur in production.

Coverity offers its services to Open Source software, and just like
upstream Git, Git for Windows applied and was granted the use.

While Coverity reports a lot of false positives due to Git's (ab-)use of
the FLEX_ARRAY feature (where it declares a 0-byte or 1-byte array at the
end of a struct, and then allocates a variable-length data structure
holding a variable-length string at the end, so that the struct as well as
the string can be released with a single free()), there were a few issues
reported that are true positives, and not all of them were resource leaks
in builtins (for which it is considered kind of okay to not release memory
just before exit() is called anyway).

This topic branch tries to address a couple of those issues.

Note: there are a couple more issues left, either because they are tricky
to resolve (in some cases, the custody of occasionally-allocated memory is
very unclear) or because it is unclear whether they are false positives
(due to the hard-to-reason-about nature of the code). It's a start,
though.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:06 +02:00
Johannes Schindelin
37dabece6d Merge branch 'drive-prefix'
This topic branch allows us to specify absolute paths without the drive
prefix e.g. when cloning.

Example:

	C:\Users\me> git clone https://github.com/git/git \upstream-git

This will clone into a new directory C:\upstream-git, in line with how
Windows interprets absolute paths.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:05 +02:00
Johannes Schindelin
f07be76f51 handle_ssh_variant: plug memory leak
The split_cmdline() function either assigns an allocated array to the argv
parameter or NULL. In the first case, we have to free() it, in the latter
it does no harm.

Reported via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:04 +02:00
Johannes Schindelin
a5208164e2 am: close input stream even in case of an error
Reported via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:04 +02:00
Johannes Schindelin
6b88dd4e15 submodule_uses_worktrees(): plug memory leak
There is really no reason why we would need to hold onto the allocated
string longer than necessary.

Reported by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:04 +02:00
Johannes Schindelin
219dce8722 show_worktree(): plug memory leak
The buffer allocated by shorten_unambiguous_ref() needs to be released.

Discovered by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:03 +02:00
Johannes Schindelin
272448d64f name-rev: avoid leaking memory in the deref case
When the `name_rev()` function is asked to dereference the tip name, it
allocates memory. But when it turns out that another tip already
described the commit better than the current one, we forgot to release
the memory.

Pointed out by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:03 +02:00
Johannes Schindelin
25dc863baf remote: plug memory leak in match_explicit()
The `guess_ref()` returns an allocated buffer of which `make_linked_ref()`
does not take custody (`alloc_ref()` makes a copy), therefore we need to
release the buffer afterwards.

Noticed via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:03 +02:00
Johannes Schindelin
2042ba35f6 add_reflog_for_walk: avoid memory leak
We free()d the `log` buffer when dwim_log() returned 1, but not when it
returned a larger value (which meant that it still allocated the buffer
but we simply ignored it).

Identified by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:03 +02:00
Johannes Schindelin
1d3003b2d7 shallow: avoid memory leak
Reported by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:03 +02:00
Johannes Schindelin
9b4901d6f5 line-log: avoid memory leak
Discovered by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:03 +02:00
Johannes Schindelin
58a3b2b273 receive-pack: plug memory leak in update()
Reported via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:03 +02:00
Johannes Schindelin
297da394ed fast-export: avoid leaking memory in handle_tag()
Reported by, you guessed it, Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:02 +02:00
Johannes Schindelin
c7e0a2e8ec mktree: plug memory leaks reported by Coverity
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:02 +02:00
Johannes Schindelin
c867f5692f pack-redundant: plug memory leak
Identified via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:02 +02:00
Johannes Schindelin
7a8ea2c461 setup_discovered_git_dir(): fix memory leak
Identified by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:02 +02:00
Johannes Schindelin
52b6bc73ac setup_bare_git_dir(): fix memory leak
Reported by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:02 +02:00
Johannes Schindelin
c1de0a778e split_commit_in_progress(): fix memory leak
Reported via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:02 +02:00
Johannes Schindelin
665a95f7e7 checkout: fix memory leak
Discovered via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:02 +02:00
Johannes Schindelin
7cc78ac6fc cat-file: fix memory leak
Discovered by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:01 +02:00
Johannes Schindelin
917eead1fb Check for EOF while parsing mails
Reported via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:01 +02:00
Johannes Schindelin
152312932e status: close file descriptor after reading git-rebase-todo
Reported via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:01 +02:00
Johannes Schindelin
48c8dccc9b difftool: close file descriptors after reading
Spotted by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:01 +02:00
Johannes Schindelin
34fa615f97 http-backend: avoid memory leaks
Reported via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:01 +02:00
Johannes Schindelin
cf05da96a3 get_mail_commit_oid(): avoid resource leak
When we fail to read, or parse, the file, we still want to close the file
descriptor and release the strbuf.

Reported via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:01 +02:00
Johannes Schindelin
523ea055e8 git_config_rename_section_in_file(): avoid resource leak
In case of errors, we really want the file descriptor to be closed.

Discovered by a Coverity scan.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:01 +02:00
Johannes Schindelin
5fe7e42028 add_commit_patch_id(): avoid allocating memory unnecessarily
It would appear that we allocate (and forget to release) memory if the
patch ID is not even defined.

Reported by the Coverity tool.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:00 +02:00
Johannes Schindelin
a655c11256 sequencer: plug resource leak when skipping unnecessary picks
The resource leak only happens in case of an error writing or truncating
the file, therefore it seems less critical, but we should still fix it
nonetheless.

Discovered by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:00 +02:00
Johannes Schindelin
09b998a6af winansi: avoid buffer overrun
When we could not convert the UTF-8 sequence into Unicode for writing to
the Console, we should not try to write an insanely-long sequence of
invalid wide characters (mistaking the negative return value for an
unsigned length).

Reported by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:00 +02:00
Johannes Schindelin
2c9bc7ed8d winansi: avoid use of uninitialized value
When stdout is not connected to a Win32 console, we incorrectly used an
uninitialized value for the "plain" character attributes.

Detected by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:00 +02:00
Johannes Schindelin
edffb73b20 mingw: avoid memory leak when splitting PATH
In the (admittedly, concocted) case that PATH consists only of colons, we
would leak the duplicated string.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:00 +02:00
Johannes Schindelin
f811241018 fixup! Help debugging with MSys2 by optionally executing bash with strace
Identified by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:24:00 +02:00
Johannes Schindelin
57295d8f55 fixup! mingw: use domain information for default email
Noticed by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:23:59 +02:00
Johannes Schindelin
156b911e83 mingw: allow absolute paths without drive prefix
When specifying an absolute path without a drive prefix, we convert that
path internally. Let's make sure that we handle that case properly, too
;-)

This fixes the command

	git clone https://github.com/git-for-windows/git \G4W

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:23:58 +02:00
Johannes Schindelin
ce8b4c0617 mingw: demonstrate a problem with certain absolute paths
On Windows, there are several categories of absolute paths. One such
category starts with a backslash and is implicitly relative to the
drive associated with the current working directory. Example:

	c:
	git clone https://github.com/git-for-windows/git \G4W

should clone into C:\G4W.

There is currently a problem with that, in that mingw_mktemp() does not
expect the _wmktemp() function to prefix the absolute path with the
drive prefix, and as a consequence, the resulting path does not fit into
the originally-passed string buffer. The symptom is a "Result too large"
error.

Reported by Juan Carlos Arevalo Baeza.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-26 12:23:58 +02:00
Johannes Schindelin
91e25e0ef0 Merge pull request #1149 from jeffhostetler/jeffhostetler/do_write_index_mtime
read-cache: close index.lock in do_write_index
2017-04-26 12:00:24 +02:00
Jeff Hostetler
d32cf6ff61 read-cache: close index.lock in do_write_index
Teach do_write_index() to close the index.lock file
before getting the mtime and updating the istate.timestamp
fields.

On Windows, a file's mtime is not updated until the file is
closed.  On Linux, the mtime is set after the last flush.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2017-04-25 15:25:49 -04:00
Johannes Schindelin
d14a8f8640 abspath_part_inside_repo: respect core.fileMode
If the file system is case-insensitive, we really must be careful to
ignore differences in case only.

This fixes https://github.com/git-for-windows/git/issues/735

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-12 21:12:42 +02:00
Johannes Schindelin
d33e487771 difftool: fix use-after-free
The left and right base directories were pointed to the buf field of
two strbufs, which were subject to change.

Let's just copy the strings and be done with it.

This fixes https://github.com/git-for-windows/git/issues/1124

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-12 21:12:42 +02:00
Kevin Willford
4c8c909c94 name-hash: fix buffer overrun
Add check for the end of the entries for the thread partition.
Add test for lazy init name hash with specific directory structure

The lazy init hash name was causing a buffer overflow when the last
entry in the index was multiple folder deep with parent folders that
did not have any files in them.

This adds a test for the boundary condition of the thread partitions
with the folder structure that was triggering the buffer overflow.

The fix was to check if it is the last entry for the thread partition
in the handle_range_dir and not try to use the next entry in the cache.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-12 21:12:41 +02:00
Johannes Schindelin
2862058e9d Merge branch 'jh/memihash-opt'
The name-hash used for detecting paths that are different only in
cases (which matter on case insensitive filesystems) has been
optimized to take advantage of multi-threading when it makes sense.

* jh/memihash-opt:
  name-hash: add test-lazy-init-name-hash to .gitignore
  name-hash: add perf test for lazy_init_name_hash
  name-hash: add test-lazy-init-name-hash
  name-hash: perf improvement for lazy_init_name_hash
  hashmap: document memihash_cont, hashmap_disallow_rehash api
  hashmap: add disallow_rehash setting
  hashmap: allow memihash computation to be continued
  name-hash: specify initial size for istate.dir_hash table

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-12 21:12:41 +02:00
Johannes Schindelin
1a68bfc5f8 Merge branch 'skip-gettext-when-possible'
This topic branch allows us to skip the gettext initialization
when the locale directory does not even exist.

This saves 150ms out of 210ms for a simply `git version` call on
Windows, and it most likely will help scripts that call out to
`git.exe` hundreds of times.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2017-04-12 21:12:40 +02:00
Ramsay Jones
6230ea5323 name-hash: add test-lazy-init-name-hash to .gitignore
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-12 21:12:39 +02:00
Jeff Hostetler
74900fb89d name-hash: add perf test for lazy_init_name_hash
Created t/perf/p0004-lazy-init-name-hash.sh test
to demonstrate correctness and performance gains
with the multithreaded version of lazy_init_name_hash().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-12 21:12:39 +02:00
Jeff Hostetler
5d61c53741 name-hash: add test-lazy-init-name-hash
Add t/helper/test-lazy-init-name-hash.c test code
to demonstrate performance times for lazy_init_name_hash()
using the original single-threaded and the new multi-threaded
code paths.

Includes a --dump option to dump the created hashmaps to
stdout.  You can use this to run both code paths and
confirm that they generate the same hashmaps.

Includes a --analyze option to analyze performance of both
code paths over a range of index sizes to help you find a
lower bound for the LAZY_THREAD_COST in name-hash.c.
For example, passing "-a 4000" will set "istate.cache_nr"
to 4000 and then try the multi-threaded code -- probably
giving 2 threads with 2000 entries each.  It will then
run both the single-threaded (1x4000) and the multi-threaded
(2x2000) and compare the times.  It will then repeat the
test with 8000, 12000, and etc. so that you can see the
cross over.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-12 21:12:39 +02:00
Jeff Hostetler
6d1283030b name-hash: perf improvement for lazy_init_name_hash
Improve performance of lazy_init_name_hash() when
ignore_case is set.  Teach name-hash to build the
istate.name_hash and istate.dir_hash simultaneously
using a forward-diving technique on the pathname
of the index_entry, rather than adding name_hash
entries and then searching backwards in the pathname
for parent directories.

This borrows algorithm ideas from clear_ce_flags_{1,dir}.

Multiple threads are used with the new algorithm to
speed hashmap construction.

This new code path is only used when threads are present
(a compiler settings) and when the index is large enough
to warrant the pthread complexity.

The code in clear_ce_flags_dir() uses a linear search to
find the adjacent index entries with the same prefix; a
binary search is used here handle_range_dir() to further
speed things up.

The size of LAZY_THREAD_COST was determined from rough
analysis using:
    t/helper/test-lazy-init-name-hash --analyze

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-12 21:12:39 +02:00
Jeff Hostetler
6f0b129c9b hashmap: document memihash_cont, hashmap_disallow_rehash api
Document memihash_cont() and hashmap_disallow_rehash()
in Documentation/technical/api-hashmap.txt.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-12 21:12:39 +02:00