global/git

mirror of https://github.com/git/git.git synced 2026-01-21 16:06:04 +00:00

Author	SHA1	Message	Date
Jeff Hostetler	17ecafd1d2	add: use preload-index and fscache for performance Teach "add" to use preload-index and fscache features to improve performance on very large repositories. During an "add", a call is made to run_diff_files() which calls check_remove() for each index-entry. This calls lstat(). On Windows, the fscache code intercepts the lstat() calls and builds a private cache using the FindFirst/FindNext routines, which are much faster. Somewhat independent of this, is the preload-index code which distributes some of the start-up costs across multiple threads. We need to keep the call to read_cache() before parsing the pathspecs (and hence cannot use the pathspecs to limit any preload) because parse_pathspec() is using the index to determine whether a pathspec is, in fact, in a submodule. If we would not read the index first, parse_pathspec() would not error out on a path that is inside a submodule, and t7400-submodule-basic.sh would fail with not ok 47 - do not add files from a submodule We still want the nice preload performance boost, though, so we simply call read_cache_preload(&pathspecs) after parsing the pathspecs. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-24 13:46:15 +02:00
Johannes Schindelin	7a90def195	Export the preload_index() function The purpose of this function is to stat() the files listed in the index in a multi-threaded fashion. It is called directly after reading the index in the read_index_preloaded() function. However, in some cases we may want to separate the index reading from the preloading step, e.g. in builtin/add.c, where we need to load the index before we parse the pathspecs (which needs to error out if one of the pathspecs refers to a path within a submodule, for which the index must have been read already), and only then will we want to preload, possibly limited by the just-parsed pathspecs. So let's just export that function to allow calling it separately. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-24 13:46:15 +02:00
Karsten Blees	1040fdd452	fscache: load directories only once If multiple threads access a directory that is not yet in the cache, the directory will be loaded by each thread. Only one of the results is added to the cache, all others are leaked. This wastes performance and memory. On cache miss, add a future object to the cache to indicate that the directory is currently being loaded. Subsequent threads register themselves with the future object and wait. When the first thread has loaded the directory, it replaces the future object with the result and notifies waiting threads. Signed-off-by: Karsten Blees <blees@dcon.de>	2018-05-24 13:37:17 +02:00
Karsten Blees	9399af3e19	Win32: add a cache below mingw's lstat and dirent implementations Checking the work tree status is quite slow on Windows, due to slow lstat emulation (git calls lstat once for each file in the index). Windows operating system APIs seem to be much better at scanning the status of entire directories than checking single files. Add an lstat implementation that uses a cache for lstat data. Cache misses read the entire parent directory and add it to the cache. Subsequent lstat calls for the same directory are served directly from the cache. Also implement opendir / readdir / closedir so that they create and use directory listings in the cache. The cache doesn't track file system changes and doesn't plug into any modifying file APIs, so it has to be explicitly enabled for git functions that don't modify the working copy. Note: in an earlier version of this patch, the cache was always active and tracked file system changes via ReadDirectoryChangesW. However, this was much more complex and had negative impact on the performance of modifying git commands such as 'git checkout'. Signed-off-by: Karsten Blees <blees@dcon.de>	2018-05-24 13:37:17 +02:00
Karsten Blees	0ca2da05d3	add infrastructure for read-only file system level caches Add a macro to mark code sections that only read from the file system, along with a config option and documentation. This facilitates implementation of relatively simple file system level caches without the need to synchronize with the file system. Enable read-only sections for 'git status' and preload_index. Signed-off-by: Karsten Blees <blees@dcon.de>	2018-05-24 13:37:17 +02:00
Karsten Blees	9c48bb1ed3	Win32: make the lstat implementation pluggable Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite slow. Windows operating system APIs seem to be much better at scanning the status of entire directories than checking single files. A caching implementation may improve performance by bulk-reading entire directories or reusing data obtained via opendir / readdir. Make the lstat implementation pluggable so that it can be switched at runtime, e.g. based on a config option. Signed-off-by: Karsten Blees <blees@dcon.de>	2018-05-24 13:37:17 +02:00
Karsten Blees	54f5b93df9	Win32: Make the dirent implementation pluggable Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is pretty straightforward, however, most of the information provided in the WIN32_FIND_DATA structure is thrown away in the process. A more sophisticated implementation may cache this data, e.g. for later reuse in calls to lstat. Make the dirent implementation pluggable so that it can be switched at runtime, e.g. based on a config option. Define a base DIR structure with pointers to readdir/closedir that match the opendir implementation (i.e. similar to vtable pointers in OOP). Define readdir/closedir so that they call the function pointers in the DIR structure. This allows choosing the opendir implementation on a call-by-call basis. Move the fixed sized dirent.d_name buffer to the dirent-specific DIR structure, as d_name may be implementation specific (e.g. a caching implementation may just set d_name to point into the cache instead of copying the entire file name string). Signed-off-by: Karsten Blees <blees@dcon.de>	2018-05-24 13:37:17 +02:00
Karsten Blees	e77049fed9	Win32: dirent.c: Move opendir down Move opendir down in preparation for the next patch. Signed-off-by: Karsten Blees <blees@dcon.de>	2018-05-24 13:37:17 +02:00
Karsten Blees	aefabecfd1	Win32: make FILETIME conversion functions public Signed-off-by: Karsten Blees <blees@dcon.de>	2018-05-24 13:37:17 +02:00
Johannes Schindelin	674fb4862e	mingw: unset PERL5LIB by default Git for Windows ships with its own Perl interpreter, and insists on using it, so it will most likely wreak havoc if PERL5LIB is set before launching Git. Let's just unset that environment variable when spawning processes. To make this feature extensible (and overrideable), there is a new config setting `core.unsetenvvars` that allows specifying a comma-separated list of names to unset before spawning processes. Reported by Gabriel Fuhrmann. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-24 13:37:17 +02:00
Johannes Schindelin	dfbcb1e5a7	Move Windows-specific config settings into compat/mingw.c Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-24 13:37:17 +02:00
Johannes Schindelin	a96907096d	Allow for platform-specific core.* config settings In the Git for Windows project, we have ample precedent for config settings that apply to Windows, and to Windows only. Let's formalize this concept by introducing a platform_core_config() function that can be #define'd in a platform-specific manner. This will allow us to contain platform-specific code better, as the corresponding variables no longer need to be exported so that they can be defined in environment.c and be set in config.c. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-24 13:37:17 +02:00
Johannes Schindelin	0f51f6afe3	config: rename `dummy` parameter to `cb` in git_default_config() This is the convention elsewhere (and prepares for the case where we may need to pass callback data). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-24 13:37:17 +02:00
Johannes Schindelin	8a8c9484c3	Start the merging-rebase to v2.17.1 This commit starts the rebase of `74d8ecfd37` to 6b125952395	2018-05-24 13:37:12 +02:00
Johannes Schindelin	4a158e75a4	fixup! Win32: symlink: add support for symlinks to directories Fix typo. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-18 13:28:25 +02:00
Johannes Schindelin	915c2bc0ff	fixup! mingw: be very wary about outside environment changes Fix typo. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-18 13:28:09 +02:00
Johannes Schindelin	ae246e6327	fixup! Win32: symlink: add support for symlinks to directories When opening a symbolic link's target, we must take into account that the symbolic link itself might live in a directory other than the current one, and that the target may be relative. Reported by Ricky Roesler. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-18 12:41:10 +02:00
Junio C Hamano	a9693e7806	Git 2.17.1 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-18 13:57:51 +09:00
Junio C Hamano	5a8c71da80	Merge branch 'jk/submodule-name-verify-fsck' into maint * jk/submodule-name-verify-fsck: fsck: complain when .gitignore and .gitattributes are symlinks fsck: complain when .gitmodules is a symlink index-pack: check .gitmodules files with --strict unpack-objects: call fsck_finish() after fscking objects fsck: call fsck_finish() after fscking objects fsck: check .gitmodules content fsck: handle promisor objects in .gitmodules check fsck: detect gitmodules files fsck: actually fsck blob data fsck: simplify ".git" check index-pack: make fsck error message more specific	2018-05-18 13:47:28 +09:00
Junio C Hamano	47aa802379	Sync with Git 2.16.4 * maint-2.16: Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules, etc update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-18 13:46:53 +09:00
Junio C Hamano	558c52bf48	Git 2.16.4 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-18 13:38:19 +09:00
Junio C Hamano	95adc822e6	Sync with Git 2.15.2 * maint-2.15: Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules, etc update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-18 13:36:44 +09:00
Junio C Hamano	aabcf7eeb8	Git 2.15.2 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-18 13:29:07 +09:00
Junio C Hamano	4375a6b2d9	Sync with Git 2.14.4 * maint-2.14: Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules, etc update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-18 13:26:51 +09:00
Junio C Hamano	37ad15fded	Git 2.14.4	2018-05-18 13:09:22 +09:00
Junio C Hamano	13f57c5daa	Sync with Git 2.13.7 * maint-2.13: Git 2.13.7 verify_path: disallow symlinks in .gitmodules, etc update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-18 12:52:09 +09:00
Junio C Hamano	fd5a7c532f	Git 2.13.7 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-18 12:49:35 +09:00
Junio C Hamano	0d084b175e	Merge branch 'jk/submodule-name-verify-fix' into maint-2.13 * jk/submodule-name-verify-fix: verify_path: disallow symlinks in .gitmodules, etc update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-18 12:35:02 +09:00
Jeff King	2baa638590	fsck: complain when .gitignore and .gitattributes are symlinks This case is already forbidden by verify_path(), so let's check it in fsck. It's easier to handle than .gitmodules, because we don't care about checking the blob content. This is really just about whether the name and mode for the tree entry are valid. Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 18:11:33 -07:00
Jeff King	425db067d2	fsck: complain when .gitmodules is a symlink We've recently forbidden .gitmodules to be a symlink in verify_path(). And it's an easy way to circumvent our fsck checks for .gitmodules content. So let's complain when we see it. Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 18:09:36 -07:00
Jeff King	2a22dac050	index-pack: check .gitmodules files with --strict Now that the internal fsck code has all of the plumbing we need, we can start checking incoming .gitmodules files. Naively, it seems like we would just need to add a call to fsck_finish() after we've processed all of the objects. And that would be enough to cover the initial test included here. But there are two extra bits: 1. We currently don't bother calling fsck_object() at all for blobs, since it has traditionally been a noop. We'd actually catch these blobs in fsck_finish() at the end, but it's more efficient to check them when we already have the object loaded in memory. 2. The second pass done by fsck_finish() needs to access the objects, but we're actually indexing the pack in this process. In theory we could give the fsck code a special callback for accessing the in-pack data, but it's actually quite tricky: a. We don't have an internal efficient index mapping oids to packfile offsets. We only generate it on the fly as part of writing out the .idx file. b. We'd still have to reconstruct deltas, which means we'd basically have to replicate all of the reading logic in packfile.c. Instead, let's avoid running fsck_finish() until after we've written out the .idx file, and then just add it to our internal packed_git list. This does mean that the objects are "in the repository" before we finish our fsck checks. But unpack-objects already exhibits this same behavior, and it's an acceptable tradeoff here for the same reason: the quarantine mechanism means that pushes will be fully protected. In addition to a basic push test in t7415, we add a sneaky pack that reverses the usual object order in the pack, requiring that index-pack access the tree and blob during the "finish" step. This already works for unpack-objects (since it will have written out loose objects), but we'll check it with this sneaky pack for good measure. Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 18:04:56 -07:00
Jeff King	7fbb4d553f	unpack-objects: call fsck_finish() after fscking objects As with the previous commit, we must call fsck's "finish" function in order to catch any queued objects for .gitmodules checks. This second pass will be able to access any incoming objects, because we will have exploded them to loose objects by now. This isn't quite ideal, because it means that bad objects may have been written to the object database (and a subsequent operation could then reference them, even if the other side doesn't send the objects again). However, this is sufficient when used with receive.fsckObjects, since those loose objects will all be placed in a temporary quarantine area that will get wiped if we find any problems. Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 18:04:56 -07:00
Jeff King	6db6dbfb66	fsck: call fsck_finish() after fscking objects Now that the internal fsck code is capable of checking .gitmodules files, we just need to teach its callers to use the "finish" function to check any queued objects. With this, we can now catch the malicious case in t7415 with git-fsck. Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 18:04:56 -07:00
Jeff King	d93d55d482	fsck: check .gitmodules content This patch detects and blocks submodule names which do not match the policy set forth in submodule-config. These should already be caught by the submodule code itself, but putting the check here means that newer versions of Git can protect older ones from malicious entries (e.g., a server with receive.fsckObjects will block the objects, protecting clients which fetch from it). As a side effect, this means fsck will also complain about .gitmodules files that cannot be parsed (or were larger than core.bigFileThreshold). Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 18:04:50 -07:00
Jeff King	38433731ad	fsck: handle promisor objects in .gitmodules check If we have a tree that points to a .gitmodules blob but don't have that blob, we can't check its contents. This produces an fsck error when we encounter it. But in the case of a promisor object, this absence is expected, and we must not complain. Note that this can technically circumvent our transfer.fsckObjects check. Imagine a client fetches a tree, but not the matching .gitmodules blob. An fsck of the incoming objects will show that we don't have enough information. Later, we do fetch the actual blob. But we have no idea that it's a .gitmodules file. The only ways to get around this would be to re-scan all of the existing trees whenever new ones enter (which is expensive), or to somehow persist the gitmodules_found set between fsck runs (which is complicated). In practice, it's probably OK to ignore the problem. Any repository which has all of the objects (including the one serving the promisor packs) can perform the checks. Since promisor packs are inherently about a hierarchical topology in which clients rely on upstream repositories, those upstream repositories can protect all of their downstream clients from broken objects. Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 18:03:21 -07:00
Jeff King	4a9e4c2e71	fsck: detect gitmodules files In preparation for performing fsck checks on .gitmodules files, this commit plumbs in the actual detection of the files. Note that unlike most other fsck checks, this cannot be a property of a single object: we must know that the object is found at a ".gitmodules" path at the root tree of a commit. Since the fsck code only sees one object at a time, we have to mark the related objects to fit the puzzle together. When we see a commit we mark its tree as a root tree, and when we see a root tree with a .gitmodules file, we mark the corresponding blob to be checked. In an ideal world, we'd check the objects in topological order: commits followed by trees followed by blobs. In that case we can avoid ever loading an object twice, since all markings would be complete by the time we get to the marked objects. And indeed, if we are checking a single packfile, this is the order in which Git will generally write the objects. But we can't count on that: 1. git-fsck may show us the objects in arbitrary order (loose objects are fed in sha1 order, but we may also have multiple packs, and we process each pack fully in sequence). 2. The type ordering is just what git-pack-objects happens to write now. The pack format does not require a specific order, and it's possible that future versions of Git (or a custom version trying to fool official Git's fsck checks!) may order it differently. 3. We may not even be fscking all of the relevant objects at once. Consider pushing with transfer.fsckObjects, where one push adds a blob at path "foo", and then a second push adds the same blob at path ".gitmodules". The blob is not part of the second push at all, but we need to mark and check it. So in the general case, we need to make up to three passes over the objects: once to make sure we've seen all commits, then once to cover any trees we might have missed, and then a final pass to cover any .gitmodules blobs we found in the second pass. 
We can simplify things a bit by loosening the requirement that we find .gitmodules only at root trees. Technically a file like "subdir/.gitmodules" is not parsed by Git, but it's not unreasonable for us to declare that Git is aware of all ".gitmodules" files and make them eligible for checking. That lets us drop the root-tree requirement, which eliminates one pass entirely. And it makes our worst case much better: instead of potentially queueing every root tree to be re-examined, the worst case is that we queue each unique .gitmodules blob for a second look. This patch just adds the boilerplate to find .gitmodules files. The actual content checks will come in a subsequent commit. Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 18:03:19 -07:00
Jeff King	7075cedbcc	fsck: actually fsck blob data Because fscking a blob has always been a noop, we didn't bother passing around the blob data. In preparation for content-level checks, let's fix up a few things: 1. The fsck_object() function just returns success for any blob. Let's add a noop fsck_blob(), which we can fill in with actual logic later. 2. The fsck_loose() function in builtin/fsck.c just threw away blob content after loading it. Let's hold onto it until after we've called fsck_object(). The easiest way to do this is to just drop the parse_loose_object() helper entirely. Incidentally, this also fixes a memory leak: if we successfully loaded the object data but did not parse it, we would have left the function without freeing it. 3. When fsck_loose() loads the object data, it does so with a custom read_loose_object() helper. This function streams any blobs, regardless of size, under the assumption that we're only checking the sha1. Instead, let's actually load blobs smaller than big_file_threshold, as the normal object-reading code-paths would do. This lets us fsck small files, and a NULL return is an indication that the blob was so big that it needed to be streamed, and we can pass that information along to fsck_blob(). Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 17:59:40 -07:00
Johannes Schindelin	b911995991	Merge branch 'colorize-push-errors' To help users discern large chunks of white text (when the push succeeds) from large chunks of white text (when the push fails), let's add some color to the latter. This closes https://github.com/git-for-windows/git/pull/1429 and fixes https://github.com/git-for-windows/git/issues/1422 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-17 23:31:58 +02:00
Johannes Schindelin	e6c415e4e1	Document the new color.* settings to colorize push errors/hints Let's make it easier for users to find out how to customize these colors. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-17 23:22:37 +02:00
Johannes Schindelin	0a99d05884	Add a test to verify that push errors are colorful This actually only tests whether the push errors/hints are colored if the respective color.* config settings are `always`, but in the regular case they default to `auto` (in which case we color the messages when stderr is connected to an interactive terminal), therefore these tests should suffice. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-17 23:22:37 +02:00
Ryan Dammrose	6379dc63ad	push: colorize errors This is an attempt to resolve an issue I experience with people that are new to Git -- especially colleagues in a team setting -- where they miss that their push to a remote location failed because the failure and success both return a block of white text. An example is if I push something to a remote repository and then a colleague attempts to push to the same remote repository and the push fails because it requires them to pull first, but they don't notice because a success and failure both return a block of white text. They then continue about their business, thinking it has been successfully pushed. This patch colorizes the errors and hints (in red and yellow, respectively) so whenever there is a failure when pushing to a remote repository that fails, it is more noticeable. [jes: fixed a couple bugs, added the color.{advice,push,transport} settings, refactored to use want_color_stderr().] Signed-off-by: Ryan Dammrose <ryandammrose@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-17 23:21:58 +02:00
Johannes Schindelin	3823662341	color: introduce support for colorizing stderr So far, we only ever asked whether stdout wants to be colorful. In the upcoming patches, we will want to make push errors more prominent, which are printed to stderr, though. So let's refactor the want_color() function into a want_color_fd() function (which expects to be called with fd == 1 or fd == 2 for stdout and stderr, respectively), and then define the macro `want_color()` to use the want_color_fd() function. And then also add a macro `want_color_stderr()`, for convenience and for documentation. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-17 23:11:26 +02:00
Johannes Schindelin	6c1fc6a76b	Merge branch 'dj/runtime-prefix' Two more commits made it into the dj/runtime-prefix branch before being merged into core Git's `master`. Let's take those two, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2018-05-17 23:04:49 +02:00
Jonathan Nieder	8ead2533e8	Makefile: quote $INSTLIBDIR when passing it to sed `f6a0ad4b` (Makefile: generate Perl header from template file, 2018-04-10) moved code for generating the 'use lib' lines at the top of perl scripts from the $(SCRIPT_PERL_GEN) rule to a separate GIT-PERL-HEADER rule. This rule first populates INSTLIBDIR and then substitutes it into the GIT-PERL-HEADER using sed: INSTLIBDIR=... something ... sed -e 's=@@INSTLIBDIR@@='$$INSTLIBDIR'=g' $< > $@ Because $INSTLIBDIR is not surrounded by double quotes, the shell splits it at each space, causing errors if INSTLIBDIR contains an $IFS character: sed: 1: "s=@@INSTLIBDIR@@=/usr/l ...": unescaped newline inside substitute pattern Add back the missing double-quotes to make it work again. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-17 23:00:45 +02:00
Jonathan Nieder	27b491972e	Makefile: remove unused @@PERLLIBDIR@@ substitution variable Junio noticed that this variable is not quoted correctly when it is passed to sed. As a shell-quoted string, it should be inside single-quotes like $(perllibdir_relative_SQ), not outside them like $INSTLIBDIR. In fact, this substitution variable is not used. Simplify by removing it. Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-17 23:00:45 +02:00
Jeff King	60d36c3717	fsck: simplify ".git" check There's no need for us to manually check for ".git"; it's a subset of the other filesystem-specific tests. Dropping it makes our code slightly shorter. More importantly, the existing code may make a reader wonder why ".GIT" is not covered here, and whether that is a bug (it isn't, as it's also covered in the filesystem-specific tests). Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 12:17:32 -07:00
Jeff King	8ae9fe61a3	index-pack: make fsck error message more specific If fsck reports an error, we say only "Error in object". This isn't quite as bad as it might seem, since the fsck code would have dumped some errors to stderr already. But it might help to give a little more context. The earlier output would not have even mentioned "fsck", and that may be a clue that the "fsck." or ".fsckObjects" config may be relevant. Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 12:17:32 -07:00
Jeff King	97f0d2bce3	Merge branch 'jk/submodule-name-verify-fix' into jk/submodule-name-verify-fsck * jk/submodule-name-verify-fix: verify_path: disallow symlinks in .gitmodules, etc update-index: stat updated files earlier verify_path: drop clever fallthrough skip_prefix: add icase-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests path: match NTFS short names for more .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths Note that this includes two bits of evil-merge: - there's a new call to verify_path() that doesn't actually have a mode available. It should be OK to pass "0" here, since we're just manipulating the untracked cache, not an actual index entry. - the lstat() in builtin/update-index.c:update_one() needs to be updated to handle the fsmonitor case (without this it still behaves correctly, but does an unnecessary lstat).	2018-05-17 12:15:47 -07:00
Jeff King	8423fd8281	verify_path: disallow symlinks in .gitmodules, etc There are a few reasons it's not a good idea to make .git* files symlinks, including: 1. It won't be portable to systems without symlinks. 2. It may behave inconsistently, since Git internally may look at these files from the index or a tree without bothering to resolve any symbolic links. So it may work in some settings (where we read from the filesystem) but not in others. With some clever code, we could make (2) work. And some people may not care about (1) if they only work on one platform. But there are a few security reasons to simply disallow symlinked meta-files: a. A symlinked .gitmodules file may circumvent any fsck checks of the content. b. Git may read and write from the on-disk file without sanity checking the symlink target. So for example, if you link ".gitmodules" to "../oops" and run "git submodule add", we'll write to the file "oops" outside the repository. Again, both of those are problems that _could_ be solved with sufficient code, but given the current inconsistent behavior and unportability, we're better off just outlawing it explicitly. We'll give the same treatment to .gitmodules, .gitignore, and .gitattributes. The latter two cannot be used to write outside the repository (we write them only as part of a checkout, where we are careful not to follow any symlinks). But they can still cause a "git clone && git log" combination to read arbitrary files outside the filesystem. There's _probably_ nothing too harmful you can do with that, but it seems questionable (and anyway, they suffer from the same portability and consistency problems). Note the slightly tricky call to verify_path() in update-index's update_one(). There we may not have a mode if we're not updating from the filesystem (e.g., we might just be removing the file). Passing "0" as the mode there works fine; since it's not a symlink, we'll just skip the extra checks. 
Signed-off-by: Jeff King <peff@peff.net>	2018-05-17 11:22:05 -07:00
Jeff King	3f97adce0d	update-index: stat updated files earlier In update_one(), we check verify_path() on the proposed path before doing anything else. In preparation for having verify_path() look at the file mode, let's stat the file earlier, so we can check the mode accurately. This is made a bit trickier by the fact that this function only does an lstat in a few code paths (the ones that flow down through process_path()). So we can speculatively do the lstat() here and pass the results down, and just use a dummy mode for cases where we won't actually be updating the index from the filesystem.	2018-05-17 11:22:05 -07:00
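
The "fsck: detect gitmodules files" entry above describes marking blobs that appear at a ".gitmodules" path and re-examining them in a final "finish" pass, because objects may arrive in any order. A minimal sketch of that idea follows, as a toy Python model rather than Git's C code. The class name, the `done` set, the in-memory `odb` dictionary, and the content check (rejecting "..") are all assumptions made for illustration; only the `found` set loosely mirrors the gitmodules_found set named in the log.

```python
# Sketch only: a toy model of the mark-then-finish idea, not Git's C code.
from typing import Dict, List, Set, Tuple, Union

Tree = List[Tuple[str, str]]          # (entry name, object id)
Obj = Tuple[str, Union[Tree, bytes]]  # ("tree", entries) or ("blob", data)


class GitmodulesFsck:
    def __init__(self) -> None:
        self.found: Set[str] = set()   # blobs seen at a ".gitmodules" path
        self.done: Set[str] = set()    # blobs whose content was checked
        self.errors: List[str] = []

    def _check_blob(self, oid: str, data: bytes) -> None:
        # Hypothetical stand-in for the real content policy: reject "..".
        self.done.add(oid)
        if b".." in data:
            self.errors.append(oid)

    def fsck_object(self, oid: str, obj: Obj) -> None:
        kind, payload = obj
        if kind == "tree":
            # Any tree counts, not only root trees (the simplification
            # described in the commit message).
            for name, child in payload:
                if name == ".gitmodules":
                    self.found.add(child)
        elif kind == "blob" and oid in self.found:
            # Already marked and in memory: check now, no second look needed.
            self._check_blob(oid, payload)

    def fsck_finish(self, odb: Dict[str, Obj]) -> None:
        # Final pass: load only the marked blobs that were not checked yet.
        for oid in sorted(self.found - self.done):
            obj = odb.get(oid)
            if obj is None or obj[0] != "blob":
                self.errors.append(oid)  # missing or not a blob
                continue
            self._check_blob(oid, obj[1])


# Blobs are fed before the trees that mark them, so the first pass alone
# cannot know that "b1" is a .gitmodules file.
odb: Dict[str, Obj] = {
    "b1": ("blob", b'[submodule "../evil"]\n'),
    "b2": ("blob", b'[submodule "ok"]\n'),
    "t1": ("tree", [(".gitmodules", "b1")]),
    "t2": ("tree", [("notes.txt", "b2")]),
}
fsck = GitmodulesFsck()
for oid in ["b1", "b2", "t1", "t2"]:
    fsck.fsck_object(oid, odb[oid])
fsck.fsck_finish(odb)
print(fsck.errors)
```

In this model the worst case is one extra load per unique marked blob, which matches the bound the commit message gives for the simplified approach. The promisor-object exception described in a later commit is not modelled here.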


82993 Commits