Commit Graph

2225 Commits

Author SHA1 Message Date
Jeff Hostetler
c75bf8046f msvc: get rid of the MSVC_DEPS variable
As we do not consume NuGet packages any longer, there is no sense to try
to point PATH to their unpacked .dll files, either.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-01-22 13:06:46 +01:00
Jeff Hostetler
b79e9d76a6 vcpkg: get MSVC dependencies with vcpkg rather than nuget
Dependencies such as cURL and OpenSSL are necessary to build and run
Git. Previously, we obtained those dependencies by fetching NuGet
packages.

However, it is notoriously hard to keep NuGet packages of C/C++
libraries up-to-date, as the toolsets for different Visual Studio
versions are different, and the NuGet packages would have to ship them
all.

That is the reason why the NuGet packages we use are quite old, and even
insecure in the case of cURL and OpenSSL (the versions contain known
security flaws that have been addressed by later versions for which no
NuGet packages are available).

The better way to handle this situation is to use the vcpkg system:
https://github.com/Microsoft/vcpkg

The idea is that a single Git repository contains enough supporting
files to build up-to-date versions of a large number of Open Source
libraries on demand, including cURL and OpenSSL.

We integrate this system via four new .bat files to

1) initialize the vcpkg system,
2) build the packages,
4) set up Git's Makefile system to find the build artifacts, and
3) copy the artifacts into the top-level directory

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-01-22 13:06:46 +01:00
Jeff Hostetler
ec16248081 MSVC Build: Support VS2017 command line compiler tools
Teach the top-level git Makefile to use whatever VS compiler
tool chain is installed on the system.

When building git from the command line in a git-sdk BASH
window with MAKE, the shell environment has environment
variables for GCC tools, but not MSVC tools.  MSVC bindings
are only avaliable from the various "VcVarsAll.bat" scripts
run by the "Developer Command Prompt" shortcuts.

Add compat/vcbuild/find_vs_env.bat to the Makefile.  It
uses the various "VcVarsAll.bat" scripts in a background
Developer Command Prompt process to compute the proper
environment variables and publish them for use by the Makefile.

[jes: fixed typos, used %SystemRoot% instead of C:\WINDOWS]

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-01-22 13:00:20 +01:00
Johannes Schindelin
c604217962 tests: replace mingw_test_cmp with a helper in C
This helper is slightly more performant than the script with MSYS2's
Bash. And a lot more readable.

To accommodate t1050, which wants to compare files weighing in with 3MB
(falling outside of t1050's malloc limit of 1.5MB), we simply lift the
allocation limit by setting the environment variable GIT_ALLOC_LIMIT to
zero when calling the helper.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-01-22 13:00:15 +01:00
Johannes Schindelin
615c6d6bda Merge branch 'skip-gettext-when-possible'
This topic branch allows us to skip the gettext initialization
when the locale directory does not even exist.

This saves 150ms out of 210ms for a simply `git version` call on
Windows, and it most likely will help scripts that call out to
`git.exe` hundreds of times.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-01-22 13:00:12 +01:00
Johannes Schindelin
11389d5ce6 gettext: use a GIT_LOCALE_PATH relative to $(prefix)
On Windows, we simply pass a POSIX path to bindtextdomain(), relying on
the current libintl-8.dll implementation to handle that gracefully by
resolving the path relative to the "root" directory inferred from the
location of the .dll file itself.

However, not only does this rely on the custom patches of the gettext
library as shipped with MSYS2 (gettext's own source code is not prepared
to handle POSIX paths on Windows), it also means that Git itself cannot
use the `podir` variable at all because it does not handle absolute
POSIX paths in system_path() correctly, leaving them as-is.

This patch fixes that behavior by always using a GIT_LOCALE_PATH
relative to the (runtime) prefix.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-01-22 13:00:07 +01:00
Johannes Schindelin
05cc350320 Merge 'misc-vs-fixes' into HEAD 2018-01-22 13:00:05 +01:00
Johannes Schindelin
9baff99e5d msvc: fix make test without having to play PATH games
When building with Microsoft Visual C, we use NuGet to acquire the
dependencies (such as OpenSSL, cURL, etc). We even unpack those
dependencies.

This patch teaches the test suite to add the directory with the unpacked
.dll files to the PATH before running the tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-01-22 13:00:05 +01:00
Johannes Schindelin
86c4e46a7a Ensure that bin-wrappers/* targets adds .exe if necessary
When compiling with Visual Studio, the projects' names are identical to
the executables modulo the extensions. Which means that the bin-wrappers
*need* to target the .exe files lest they try to execute directories.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-01-22 13:00:02 +01:00
Philip Oakley
62184d4030 Avoid multiple PREFIX definitions
The short and sweet PREFIX can be confused when used in many places.

Rename both usages to better describe their purpose. EXEC_CMD_PREFIX is
used in full to disambiguate it from the nearby GIT_EXEC_PATH.

The PREFIX in sideband.c, while nominally independant of the exec_cmd
PREFIX, does reside within libgit[1], so the definitions would clash
when taken together with a PREFIX given on the command line for use by
exec_cmd.c.

Noticed when compiling Git for Windows using MSVC/Visual Studio [1] which
reports the conflict beteeen the command line definition and the
definition in sideband.c within the libgit project.

[1] the libgit functions are brought into a single sub-project
within the Visual Studio construction script provided in contrib,
and hence uses a single command for both exec_cmd.c and sideband.c.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
2018-01-22 13:00:02 +01:00
Jeff Hostetler
ce1caaca5d vs2015: teach 'make clean' to delete PDBs
Teach main Makefile to also delete the generated PDB files
as well as the PDB files for the various EXE files during
"make MSVC=1 clean".

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2018-01-22 12:59:59 +01:00
Jeff Hostetler
0c88786564 msvc: release mode PDBs and library DLLs
Install required third-party DLLs next to EXEs.

Build and install release mode PDBs for git
executables allowing detailed stack traces
in the event of crash.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2018-01-22 12:59:58 +01:00
Jeff Hostetler
e4dae723b8 msvc: update Makefile and compiler settings for VS2015
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2018-01-22 12:59:57 +01:00
Johannes Schindelin
59a8d34054 Windows: force-recompile git.res for differing architectures
When git.rc is compiled into git.res, the result is actually dependent
on the architecture. That is, you cannot simply link a 32-bit git.res
into a 64-bit git.exe.

Therefore, to allow 32-bit and 64-bit builds in the same directory, we
let git.res depend on GIT-PREFIX so that it gets recompiled when
compiling for a different architecture (this works because the exec path
changes based on the architecture: /mingw32/libexec/git-core for 32-bit
and /mingw64/libexec/git-core for 64-bit).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-01-22 12:59:52 +01:00
Junio C Hamano
a09a5e6c36 Merge branch 'ab/dc-sha1-loose-ends'
Tying loose ends for the recent integration work of
collision-detecting SHA-1 implementation.

* ab/dc-sha1-loose-ends:
  Makefile: NO_OPENSSL=1 should no longer imply BLK_SHA1=1
2018-01-09 14:32:53 -08:00
Junio C Hamano
07b747d324 Merge branch 'jk/test-suite-tracing'
Assorted fixes around running tests with "-x" tracing option.

* jk/test-suite-tracing:
  t/Makefile: introduce TEST_SHELL_PATH
  test-lib: make "-x" work with "--verbose-log"
  t5615: avoid re-using descriptor 4
  test-lib: silence "-x" cleanup under bash
2018-01-05 13:28:09 -08:00
Junio C Hamano
58d1772c85 Merge branch 'js/enhanced-version-info'
"git version --build-options" learned to report the host CPU and
the exact commit object name the binary was built from.

* js/enhanced-version-info:
  version --build-options: report commit, too, if possible
  version --build-options: also report host CPU
2017-12-28 14:08:47 -08:00
Ævar Arnfjörð Bjarmason
edb6a17c36 Makefile: NO_OPENSSL=1 should no longer imply BLK_SHA1=1
Use the collision detecting SHA-1 implementation by default even when
NO_OPENSSL is set.

Setting NO_OPENSSL=UnfortunatelyYes has implied BLK_SHA1=1 ever since
the former was introduced in dd53c7ab29 (Support for NO_OPENSSL,
2005-07-29).  That implication should have been removed when the
default SHA-1 implementation changed from OpenSSL to DC_SHA1 in
e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17).  Finish
what that commit started by removing the BLK_SHA1 fallback setting so
the default DC_SHA1 implementation will be used.

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-28 11:55:56 -08:00
Junio C Hamano
eacf669cec Merge branch 'jt/decorate-api'
A few structures and variables that are implementation details of
the decorate API have been renamed and then the API got documented
better.

* jt/decorate-api:
  decorate: clean up and document API
2017-12-27 11:16:26 -08:00
Junio C Hamano
61061abba7 Merge branch 'jh/object-filtering'
In preparation for implementing narrow/partial clone, the object
walking machinery has been taught a way to tell it to "filter" some
objects from enumeration.

* jh/object-filtering:
  rev-list: support --no-filter argument
  list-objects-filter-options: support --no-filter
  list-objects-filter-options: fix 'keword' typo in comment
  pack-objects: add list-objects filtering
  rev-list: add list-objects filtering support
  list-objects: filter objects in traverse_commit_list
  oidset: add iterator methods to oidset
  oidmap: add oidmap iterator methods
  dir: allow exclusions from blob in addition to file
2017-12-27 11:16:21 -08:00
Junio C Hamano
66d3f19324 Merge branch 'tg/worktree-create-tracking'
The way "git worktree add" determines what branch to create from
where and checkout in the new worktree has been updated a bit.

* tg/worktree-create-tracking:
  add worktree.guessRemote config option
  worktree: add --guess-remote flag to add subcommand
  worktree: make add <path> <branch> dwim
  worktree: add --[no-]track option to the add subcommand
  worktree: add can be created from any commit-ish
  checkout: factor out functions to new lib file
2017-12-19 11:33:57 -08:00
Johannes Schindelin
ed32b788c0 version --build-options: report commit, too, if possible
In particular when local tags are used (or tags that are pushed to some
fork) to build Git, it is very hard to figure out from which particular
revision a particular Git executable was built. It gets worse when those
tags are deleted, or even updated.

Let's just report an exact, unabbreviated commit name in our build
options.

We need to be careful, though, to report when the current commit cannot
be determined, e.g. when building from a tarball without any associated
Git repository. This could be the case also when extracting Git's source
code into an unrelated Git worktree.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-14 22:53:04 -08:00
Eric Sunshine
b22894049f version --build-options: also report host CPU
It can be helpful for bug reports to include information about the
environment in which the bug occurs. "git version --build-options" can
help to supplement this information. In addition to the size of 'long'
already reported by --build-options, also report the host's CPU type.
Example output:

   $ git version --build-options
   git version 2.9.3.windows.2.826.g06c0f2f
   cpu: x86_64
   sizeof-long: 4

New Makefile variable HOST_CPU supports cross-compiling.

Suggested-by: Adric Norris <landstander668@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-14 22:52:49 -08:00
Jonathan Tan
ddd3e31242 decorate: clean up and document API
Improve the names of the identifiers in decorate.h, document them, and
add an example of how to use these functions.

The example is compiled and run as part of the test suite.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 09:16:27 -08:00
Jeff King
3f824e91c8 t/Makefile: introduce TEST_SHELL_PATH
You may want to run the test suite with a different shell
than you use to build Git. For instance, you may build with
SHELL_PATH=/bin/sh (because it's faster, or it's what you
expect to exist on systems where the build will be used) but
want to run the test suite with bash (e.g., since that
allows using "-x" reliably across the whole test suite).
There's currently no good way to do this.

You might think that doing two separate make invocations,
like:

  make &&
  make -C t SHELL_PATH=/bin/bash

would work. And it _almost_ does. The second make will see
our bash SHELL_PATH, and we'll use that to run the
individual test scripts (or tell prove to use it to do so).
So far so good.

But this breaks down when "--tee" or "--verbose-log" is
used. Those options cause the test script to actually
re-exec itself using $SHELL_PATH. But wait, wouldn't our
second make invocation have set SHELL_PATH correctly in the
environment?

Yes, but test-lib.sh sources GIT-BUILD-OPTIONS, which we
built during the first "make". And that overrides the
environment, giving us the original SHELL_PATH again.

Let's introduce a new variable that lets you specify a
specific shell to be run for the test scripts. Note that we
have to touch both the main and t/ Makefiles, since we have
to record it in GIT-BUILD-OPTIONS in one, and use it in the
latter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 09:03:38 -08:00
Junio C Hamano
4c6dad0059 Merge branch 'bw/protocol-v1'
A new mechanism to upgrade the wire protocol in place is proposed
and demonstrated that it works with the older versions of Git
without harming them.

* bw/protocol-v1:
  Documentation: document Extra Parameters
  ssh: introduce a 'simple' ssh variant
  i5700: add interop test for protocol transition
  http: tell server that the client understands v1
  connect: tell server that the client understands v1
  connect: teach client to recognize v1 server response
  upload-pack, receive-pack: introduce protocol version 1
  daemon: recognize hidden request arguments
  protocol: introduce protocol extension mechanisms
  pkt-line: add packet_write function
  connect: in ref advertisement, shallows are last
2017-12-06 09:23:44 -08:00
Thomas Gummerer
7c85a87c54 checkout: factor out functions to new lib file
Factor the functions out, so they can be re-used from other places.  In
particular these functions will be re-used in builtin/worktree.c to make
git worktree add dwim more.

While there add some docs to the function.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-27 09:48:06 +09:00
Jeff Hostetler
25ec7bcac0 list-objects: filter objects in traverse_commit_list
Create traverse_commit_list_filtered() and add filtering
interface to allow certain objects to be omitted from the
traversal.

Update traverse_commit_list() to be a wrapper for the above
with a null filter to minimize the number of callers that
needed to be changed.

Object filtering will be used in a future commit by rev-list
and pack-objects for partial clone and fetch to omit unwanted
objects from the result.

traverse_bitmap_commit_list() does not work with filtering.
If a packfile bitmap is present, it will not be used.  It
should be possible to extend such support in the future (at
least to simple filters that do not require object pathnames),
but that is beyond the scope of this patch series.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-22 14:11:57 +09:00
Junio C Hamano
e05336bdda Merge branch 'bp/fsmonitor'
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64
2017-11-21 14:07:50 +09:00
Junio C Hamano
40bc898103 Merge branch 'js/mingw-full-version-in-resources' into maint
MinGW updates.

* js/mingw-full-version-in-resources:
  mingw: include the full version information in the resources
2017-11-15 12:05:03 +09:00
Junio C Hamano
d3e32dc90c Merge branch 'js/mingw-full-version-in-resources'
MinGW updates.

* js/mingw-full-version-in-resources:
  mingw: include the full version information in the resources
2017-11-09 14:31:31 +09:00
Johannes Schindelin
39bb86b4e5 mingw: include the full version information in the resources
This fixes https://github.com/git-for-windows/git/issues/723

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01 13:43:52 +09:00
Brandon Williams
373d70efb2 protocol: introduce protocol extension mechanisms
Create protocol.{c,h} and provide functions which future servers and
clients can use to determine which protocol to use or is being used.

Also introduce the 'GIT_PROTOCOL' environment variable which will be
used to communicate a colon separated list of keys with optional values
to a server.  Unknown keys and values must be tolerated.  This mechanism
is used to communicate which version of the wire protocol a client would
like to use with a server.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17 10:51:29 +09:00
Junio C Hamano
54bd705a95 Merge branch 'jt/oidmap'
Introduce a new "oidmap" API and rewrite oidset to use it.

* jt/oidmap:
  oidmap: map with OID as key
2017-10-11 14:52:22 +09:00
Junio C Hamano
1a2e1a76ec Merge branch 'mh/mmap-packed-refs'
Operations that do not touch (majority of) packed refs have been
optimized by making accesses to packed-refs file lazy; we no longer
pre-parse everything, and an access to a single ref in the
packed-refs does not touch majority of irrelevant refs, either.

* mh/mmap-packed-refs: (21 commits)
  packed-backend.c: rename a bunch of things and update comments
  mmapped_ref_iterator: inline into `packed_ref_iterator`
  ref_cache: remove support for storing peeled values
  packed_ref_store: get rid of the `ref_cache` entirely
  ref_store: implement `refs_peel_ref()` generically
  packed_read_raw_ref(): read the reference from the mmapped buffer
  packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
  read_packed_refs(): ensure that references are ordered when read
  packed_ref_cache: keep the `packed-refs` file mmapped if possible
  packed-backend.c: reorder some definitions
  mmapped_ref_iterator_advance(): no peeled value for broken refs
  mmapped_ref_iterator: add iterator over a packed-refs file
  packed_ref_cache: remember the file-wide peeling state
  read_packed_refs(): read references with minimal copying
  read_packed_refs(): make parsing of the header line more robust
  read_packed_refs(): only check for a header at the top of the file
  read_packed_refs(): use mmap to read the `packed-refs` file
  die_unterminated_line(), die_invalid_line(): new functions
  packed_ref_cache: add a backlink to the associated `packed_ref_store`
  prefix_ref_iterator: break when we leave the prefix
  ...
2017-10-03 15:42:50 +09:00
Ben Peart
14527b3002 fsmonitor: add a performance test
Add a test utility (test-drop-caches) that flushes all changes to disk
then drops file system cache on Windows, Linux, and OSX.

Add a perf test (p7519-fsmonitor.sh) for fsmonitor.

By default, the performance test will utilize the Watchman file system
monitor if it is installed.  If Watchman is not installed, it will use a
dummy integration script that does not report any new or modified files.
The dummy script has very little overhead which provides optimistic results.

The performance test will also use the untracked cache feature if it is
available as fsmonitor uses it to speed up scanning for untracked files.

There are 4 environment variables that can be used to alter the default
behavior of the performance test:

GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
GIT_PERF_7519_FSMONITOR: used to configure core.fsmonitor
GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests

The big win for using fsmonitor is the elimination of the need to scan the
working directory looking for changed and untracked files. If the file
information is all cached in RAM, the benefits are reduced.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:05 +09:00
Ben Peart
dd3551f491 fsmonitor: add a test tool to dump the index extension
Add a test utility (test-dump-fsmonitor) that will dump the fsmonitor
index extension.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:05 +09:00
Ben Peart
883e248b8a fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:01 +09:00
Jonathan Tan
9e6fabde82 oidmap: map with OID as key
This is similar to using the hashmap in hashmap.c, but with an
easier-to-use API. In particular, custom entry comparisons no longer
need to be written, and lookups can be done without constructing a
temporary entry structure.

This is implemented as a thin wrapper over the hashmap API. In
particular, this means that there is an additional 4-byte overhead due
to the fact that the first 4 bytes of the hash is redundantly stored.
For now, I'm taking the simpler approach, but if need be, we can
reimplement oidmap without affecting the callers significantly.

oidset has been updated to use oidmap.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:18:03 +09:00
Michael Haggerty
5b633610ec packed_ref_cache: keep the packed-refs file mmapped if possible
Keep a copy of the `packed-refs` file contents in memory for as long
as a `packed_ref_cache` object is in use:

* If the system allows it, keep the `packed-refs` file mmapped.

* If not (either because the system doesn't support `mmap()` at all,
  or because a file that is currently mmapped cannot be replaced via
  `rename()`), then make a copy of the file's contents in
  heap-allocated space, and keep that around instead.

We base the choice of behavior on a new build-time switch,
`MMAP_PREVENTS_DELETE`. By default, this switch is set for Windows
variants.

After this commit, `MMAP_NONE` and `MMAP_TEMPORARY` are still handled
identically. But the next commit will introduce a difference.

This whole change is still pointless, because we only read the
`packed-refs` file contents immediately after instantiating the
`packed_ref_cache`. But that will soon change.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Junio C Hamano
a36f631ad6 Merge branch 'bw/git-clang-format'
"make style" runs git-clang-format to help developers by pointing
out coding style issues.

* bw/git-clang-format:
  Makefile: add style build rule
  clang-format: outline the git project's coding style
2017-09-25 15:24:08 +09:00
Junio C Hamano
09595ab381 Merge branch 'jk/leak-checkers'
Many of our programs consider that it is OK to release dynamic
storage that is used throughout the life of the program by simply
exiting, but this makes it harder to leak detection tools to avoid
reporting false positives.  Plug many existing leaks and introduce
a mechanism for developers to mark that the region of memory
pointed by a pointer is not lost/leaking to help these tools.

* jk/leak-checkers:
  add UNLEAK annotation for reducing leak false positives
  set_git_dir: handle feeding gitdir to itself
  repository: free fields before overwriting them
  reset: free allocated tree buffers
  reset: make tree counting less confusing
  config: plug user_config leak
  update-index: fix cache entry leak in add_one_file()
  add: free leaked pathspec after add_files_to_cache()
  test-lib: set LSAN_OPTIONS to abort by default
  test-lib: --valgrind should not override --verbose-log
2017-09-19 10:47:55 +09:00
Junio C Hamano
cb6ec86d29 Merge branch 'ti/external-sha1dc'
Platforms that ship with a separate sha1 with collision detection
library can link to it instead of using the copy we ship as part of
our source tree.

* ti/external-sha1dc:
  sha1dc: allow building with the external sha1dc library
  sha1dc: build git plumbing code more explicitly
2017-09-19 10:47:50 +09:00
Junio C Hamano
8134746d1d Merge branch 'jn/vcs-svn-cleanup' into maint
Code clean-up.

* jn/vcs-svn-cleanup:
  vcs-svn: move remaining repo_tree functions to fast_export.h
  vcs-svn: remove repo_delete wrapper function
  vcs-svn: remove custom mode constants
  vcs-svn: remove more unused prototypes and declarations
2017-09-10 17:03:09 +09:00
Jeff King
0e5bba53af add UNLEAK annotation for reducing leak false positives
It's a common pattern in git commands to allocate some
memory that should last for the lifetime of the program and
then not bother to free it, relying on the OS to throw it
away.

This keeps the code simple, and it's fast (we don't waste
time traversing structures or calling free at the end of the
program). But it also triggers warnings from memory-leak
checkers like valgrind or LSAN. They know that the memory
was still allocated at program exit, but they don't know
_when_ the leaked memory stopped being useful. If it was
early in the program, then it's probably a real and
important leak. But if it was used right up until program
exit, it's not an interesting leak and we'd like to suppress
it so that we can see the real leaks.

This patch introduces an UNLEAK() macro that lets us do so.
To understand its design, let's first look at some of the
alternatives.

Unfortunately the suppression systems offered by
leak-checking tools don't quite do what we want. A
leak-checker basically knows two things:

  1. Which blocks were allocated via malloc, and the
     callstack during the allocation.

  2. Which blocks were left un-freed at the end of the
     program (and which are unreachable, but more on that
     later).

Their suppressions work by mentioning the function or
callstack of a particular allocation, and marking it as OK
to leak.  So imagine you have code like this:

  int cmd_foo(...)
  {
	/* this allocates some memory */
	char *p = some_function();
	printf("%s", p);
	return 0;
  }

You can say "ignore allocations from some_function(),
they're not leaks". But that's not right. That function may
be called elsewhere, too, and we would potentially want to
know about those leaks.

So you can say "ignore the callstack when main calls
some_function".  That works, but your annotations are
brittle. In this case it's only two functions, but you can
imagine that the actual allocation is much deeper. If any of
the intermediate code changes, you have to update the
suppression.

What we _really_ want to say is that "the value assigned to
p at the end of the function is not a real leak". But
leak-checkers can't understand that; they don't know about
"p" in the first place.

However, we can do something a little bit tricky if we make
some assumptions about how leak-checkers work. They
generally don't just report all un-freed blocks. That would
report even globals which are still accessible when the
leak-check is run.  Instead they take some set of memory
(like BSS) as a root and mark it as "reachable". Then they
scan the reachable blocks for anything that looks like a
pointer to a malloc'd block, and consider that block
reachable. And then they scan those blocks, and so on,
transitively marking anything reachable from a global as
"not leaked" (or at least leaked in a different category).

So we can mark the value of "p" as reachable by putting it
into a variable with program lifetime. One way to do that is
to just mark "p" as static. But that actually affects the
run-time behavior if the function is called twice (you
aren't likely to call main() twice, but some of our cmd_*()
functions are called from other commands).

Instead, we can trick the leak-checker by putting the value
into _any_ reachable bytes. This patch keeps a global
linked-list of bytes copied from "unleaked" variables. That
list is reachable even at program exit, which confers
recursive reachability on whatever values we unleak.

In other words, you can do:

  int cmd_foo(...)
  {
	char *p = some_function();
	printf("%s", p);
	UNLEAK(p);
	return 0;
  }

to annotate "p" and suppress the leak report.

But wait, couldn't we just say "free(p)"? In this toy
example, yes. But UNLEAK()'s byte-copying strategy has
several advantages over actually freeing the memory:

  1. It's recursive across structures. In many cases our "p"
     is not just a pointer, but a complex struct whose
     fields may have been allocated by a sub-function. And
     in some cases (e.g., dir_struct) we don't even have a
     function which knows how to free all of the struct
     members.

     By marking the struct itself as reachable, that confers
     reachability on any pointers it contains (including those
     found in embedded structs, or reachable by walking
     heap blocks recursively.

  2. It works on cases where we're not sure if the value is
     allocated or not. For example:

       char *p = argc > 1 ? argv[1] : some_function();

     It's safe to use UNLEAK(p) here, because it's not
     freeing any memory. In the case that we're pointing to
     argv here, the reachability checker will just ignore
     our bytes.

  3. Likewise, it works even if the variable has _already_
     been freed. We're just copying the pointer bytes. If
     the block has been freed, the leak-checker will skip
     over those bytes as uninteresting.

  4. Because it's not actually freeing memory, you can
     UNLEAK() before we are finished accessing the variable.
     This is helpful in cases like this:

       char *p = some_function();
       return another_function(p);

     Writing this with free() requires:

       int ret;
       char *p = some_function();
       ret = another_function(p);
       free(p);
       return ret;

     But with unleak we can just write:

       char *p = some_function();
       UNLEAK(p);
       return another_function(p);

This patch adds the UNLEAK() macro and enables it
automatically when Git is compiled with SANITIZE=leak.  In
normal builds it's a noop, so we pay no runtime cost.

It also adds some UNLEAK() annotations to show off how the
feature works. On top of other recent leak fixes, these are
enough to get t0000 and t0001 to pass when compiled with
LSAN.

Note the case in commit.c which actually converts a
strbuf_release() into an UNLEAK. This code was already
non-leaky, but the free didn't do anything useful, since
we're exiting. Converting it to an annotation means that
non-leak-checking builds pay no runtime cost. The cost is
minimal enough that it's probably not worth going on a
crusade to convert these kinds of frees to UNLEAKS. I did it
here for consistency with the "sb" leak (though it would
have been equally correct to go the other way, and turn them
both into strbuf_release() calls).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-08 15:43:17 +09:00
Junio C Hamano
eabdcd4ab4 Merge branch 'jt/packmigrate'
Code movement to make it easier to hack later.

* jt/packmigrate: (23 commits)
  pack: move for_each_packed_object()
  pack: move has_pack_index()
  pack: move has_sha1_pack()
  pack: move find_pack_entry() and make it global
  pack: move find_sha1_pack()
  pack: move find_pack_entry_one(), is_pack_valid()
  pack: move check_pack_index_ptr(), nth_packed_object_offset()
  pack: move nth_packed_object_{sha1,oid}
  pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()
  pack: move unpack_object_header()
  pack: move get_size_from_delta()
  pack: move unpack_object_header_buffer()
  pack: move {,re}prepare_packed_git and approximate_object_count
  pack: move install_packed_git()
  pack: move add_packed_git()
  pack: move unuse_pack()
  pack: move use_pack()
  pack: move pack-closing functions
  pack: move release_pack_memory()
  pack: move open_pack_index(), parse_pack_index()
  ...
2017-08-26 22:55:09 -07:00
Junio C Hamano
030faf2fa5 Merge branch 'kw/write-index-reduce-alloc'
We used to spend more than necessary cycles allocating and freeing
piece of memory while writing each index entry out.  This has been
optimized.

* kw/write-index-reduce-alloc:
  read-cache: avoid allocating every ondisk entry when writing
  read-cache: fix memory leak in do_write_index
  perf: add test for writing the index
2017-08-26 22:55:08 -07:00
Junio C Hamano
18c88f9af6 Merge branch 'jn/vcs-svn-cleanup'
Code clean-up.

* jn/vcs-svn-cleanup:
  vcs-svn: move remaining repo_tree functions to fast_export.h
  vcs-svn: remove repo_delete wrapper function
  vcs-svn: remove custom mode constants
  vcs-svn: remove more unused prototypes and declarations
2017-08-26 22:55:06 -07:00
Jonathan Tan
4f39cd821d pack: move pack name-related functions
Currently, sha1_file.c and cache.h contain many functions, both related
to and unrelated to packfiles. This makes both files very large and
causes an unclear separation of concerns.

Create a new file, packfile.c, to hold all packfile-related functions
currently in sha1_file.c. It has a corresponding header packfile.h.

In this commit, the pack name-related functions are moved. Subsequent
commits will move the other functions.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:06 -07:00
Jonathan Nieder
b8f43b120b vcs-svn: move remaining repo_tree functions to fast_export.h
These used to be for manipulating the in-memory repo_tree structure,
but nowadays they are convenience wrappers to handle a few git-vs-svn
mismatches:

 1. Git does not track empty directories but Subversion does.  When
    looking up a path in git that Subversion thinks exists and finding
    nothing, we can safely assume that the path represents a
    directory.  This is needed when a later Subversion revision
    modifies that directory.

 2. Subversion allows deleting a file by copying.  In Git fast-import
    we have to handle that more explicitly as a deletion.

These are details of the tool's interaction with git fast-import.
Move them to fast_export.c, where other such details are handled.

This way the function names do not start with a repo_ prefix that
would clash with the repository object introduced in
v2.14.0-rc0~38^2~16 (repository: introduce the repository object,
2017-06-22) or an svn_ prefix that would clash with libsvn (in case
someone wants to link this code with libsvn some day).

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 10:41:26 -07:00