This is an attempt to resolve an issue I experience with people that are
new to Git -- especially colleagues in a team setting -- where they miss
that their push to a remote location failed because the failure and
success both return a block of white text.
An example is if I push something to a remote repository and then a
colleague attempts to push to the same remote repository and the push
fails because it requires them to pull first, but they don't notice
because a success and failure both return a block of white text. They
then continue about their business, thinking it has been successfully
pushed.
This patch colorizes the errors and hints (in red and yellow,
respectively) so whenever there is a failure when pushing to a remote
repository that fails, it is more noticeable.
[jes: fixed a couple bugs, added the color.{advice,push,transport}
settings, refactored to use want_color_stderr().]
Signed-off-by: Ryan Dammrose ryandammrose@gmail.com
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is retry of #1419.
I added flush_fscache macro to flush cached stats after disk writing
with tests for regression reported in #1438 and #1442.
git checkout checks each file path in sorted order, so cache flushing does not
make performance worse unless we have large number of modified files in
a directory containing many files.
Using chromium repository, I tested `git checkout .` performance when I
delete 10 files in different directories.
With this patch:
TotalSeconds: 4.307272
TotalSeconds: 4.4863595
TotalSeconds: 4.2975562
Avg: 4.36372923333333
Without this patch:
TotalSeconds: 20.9705431
TotalSeconds: 22.4867685
TotalSeconds: 18.8968292
Avg: 20.7847136
I confirmed this patch passed all tests in t/ with core_fscache=1.
Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
If there is already a .git/ADD_EDIT.patch file, we fail to truncate it
properly, which could result in very funny errors.
Let's just truncate it and not worry about it too much.
Reported by J Wyman.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git for Windows supports the core.longPaths config setting to allow
writing/reading long paths via the \\?\ trick for a long time now.
However, for that support to work, it is absolutely necessary that
git_default_config() is given a chance to parse the config. Otherwise
Git will be non the wiser.
So let's make sure that as many commands that previously failed to
parse the core.* settings now do that, implicitly enabling long path
support in a lot more places.
Note: this is not a perfect solution, and it cannot be, as there is
a chicken-and-egg problem in reading the config itself...
This fixes https://github.com/git-for-windows/git/issues/1218
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A ton of Git commands simply do not read (or at least parse) the core.*
settings. This is not good, as Git for Windows relies on the
core.longPaths setting to be read quite early on.
So let's just make sure that all commands read the config and give
platform_core_config() a chance.
This patch teaches tons of Git commands to respect the config setting
`core.longPaths = true`, including `pack-refs`, thereby fixing
https://github.com/git-for-windows/git/issues/1218
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It was a bad idea to just remove that option from Git for Windows
v2.15.0, as early users of that (still experimental) option would have
been puzzled what they are supposed to do now.
So let's reintroduce the flag, but make sure to show the user good
advice how to fix this going forward.
We'll remove this option in a more orderly fashion either in v2.16.0 or
in v2.17.0.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch adds the (experimental) --stdin/-z options to `git
reset`. Those patches are still under review in the upstream Git project,
but are already merged in their experimental form into Git for Windows'
`master` branch, in preparation for a MinGit-only release.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach "add" to use preload-index and fscache features
to improve performance on very large repositories.
During an "add", a call is made to run_diff_files()
which calls check_remove() for each index-entry. This
calls lstat(). On Windows, the fscache code intercepts
the lstat() calls and builds a private cache using the
FindFirst/FindNext routines, which are much faster.
Somewhat independent of this, is the preload-index code
which distributes some of the start-up costs across
multiple threads.
We need to keep the call to read_cache() before parsing the
pathspecs (and hence cannot use the pathspecs to limit any preload)
because parse_pathspec() is using the index to determine whether a
pathspec is, in fact, in a submodule. If we would not read the index
first, parse_pathspec() would not error out on a path that is inside
a submodule, and t7400-submodule-basic.sh would fail with
not ok 47 - do not add files from a submodule
We still want the nice preload performance boost, though, so we simply
call read_cache_preload(&pathspecs) after parsing the pathspecs.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It has been reported that core.hideDotFiles=false stopped working...
This topic branch fixes it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This branch allows third-party tools to call `git status
--no-lock-index` to avoid lock contention with the interactive Git usage
of the actual human user.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These fixes were necessary for Sverre Rabbelier's remote-hg to work,
but for some magic reason they are not necessary for the current
remote-hg. Makes you wonder how that one gets away with it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Just like with other Git commands, this option makes it read the paths
from the standard input. It comes in handy when resetting many, many
paths at once and wildcards are not an option (e.g. when the paths are
generated by a tool).
Note: we first parse the entire list and perform the actual reset action
only in a second phase. Not only does this make things simpler, it also
helps performance, as do_diff_cache() traverses the index and the
(sorted) pathspecs in simultaneously to avoid unnecessary lookups.
This feature is marked experimental because it is still under review in
the upstream Git project.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is a brown paper bag. When adding the tests, we actually failed
to verify that the config variable is heeded in git-init at all. And
when changing the original patch that marked the .git/ directory as
hidden after reading the config, it was lost on this developer that
the new code would use the hide_dotfiles variable before the config
was read.
The fix is obvious: read the (limited, pre-init) config *before*
creating the .git/ directory.
This fixes https://github.com/git-for-windows/git/issues/789
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When a third-party tool periodically runs `git status` in order to keep
track of the state of the working tree, it is a bad idea to lock the
index: it might interfere with interactive commands executed by the
user, e.g. when the user wants to commit files.
Git for Windows introduced the `--no-lock-index` option a long time ago
to fix that (it made it into Git for Windows v2.9.2(3)) by simply
avoiding to write that file.
The downside is that the periodic `git status` calls will be a little
bit more wasteful because they may have to refresh the index repeatedly,
only to throw away the updates when it exits. This cannot really be
helped, though, as tools wanting to get a periodic update of the status
have no way to predict when the user may want to lock the index herself.
Sadly, a competing approach was submitted (by somebody who apparently
has less work on their plate than this maintainer) that made it into
v2.15.0 but is *different*: instead of a `git status`-only option, it is
an option that comes *before* the Git command and is called differently,
too.
Let's give previous users a chance to upgrade to newer Git for Windows
versions by handling the `--no-lock-index` option, still, though with a
big fat warning.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, files cannot be removed nor renamed if there are still
handles held by a process. To remedy that, we introduced the
close_all_packs() function.
Earlier, we made sure that the packs are released just before `git gc`
is spawned, in case that gc wants to remove no-longer needed packs.
But this developer forgot that gc itself also needs to let go of packs,
e.g. when consolidating all packs via the --aggressive option.
Likewise, `git repack -d` wants to delete obsolete packs and therefore
needs to close all pack handles, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
When calling `git fast-export a..a b` when a and b refer to the same
commit, nothing would be exported, and an incorrect reset line would
be printed for b ('from :0').
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
For performance reasons `stdout` is not unbuffered by default. That leads
to problems if after printing to `stdout` a read on `stdin` is performed.
For that reason interactive commands like `git clean -i` do not function
properly anymore if the `stdout` is not flushed by `fflush(stdout)` before
trying to read from `stdin`.
In the case of `git clean -i` all reads on `stdin` were preceded by a
`fflush(stdout)` call.
Signed-off-by: nalla <nalla@hamal.uberspace.de>
Now that the internal fsck code has all of the plumbing we
need, we can start checking incoming .gitmodules files.
Naively, it seems like we would just need to add a call to
fsck_finish() after we've processed all of the objects. And
that would be enough to cover the initial test included
here. But there are two extra bits:
1. We currently don't bother calling fsck_object() at all
for blobs, since it has traditionally been a noop. We'd
actually catch these blobs in fsck_finish() at the end,
but it's more efficient to check them when we already
have the object loaded in memory.
2. The second pass done by fsck_finish() needs to access
the objects, but we're actually indexing the pack in
this process. In theory we could give the fsck code a
special callback for accessing the in-pack data, but
it's actually quite tricky:
a. We don't have an internal efficient index mapping
oids to packfile offsets. We only generate it on
the fly as part of writing out the .idx file.
b. We'd still have to reconstruct deltas, which means
we'd basically have to replicate all of the
reading logic in packfile.c.
Instead, let's avoid running fsck_finish() until after
we've written out the .idx file, and then just add it
to our internal packed_git list.
This does mean that the objects are "in the repository"
before we finish our fsck checks. But unpack-objects
already exhibits this same behavior, and it's an
acceptable tradeoff here for the same reason: the
quarantine mechanism means that pushes will be
fully protected.
In addition to a basic push test in t7415, we add a sneaky
pack that reverses the usual object order in the pack,
requiring that index-pack access the tree and blob during
the "finish" step.
This already works for unpack-objects (since it will have
written out loose objects), but we'll check it with this
sneaky pack for good measure.
Signed-off-by: Jeff King <peff@peff.net>
As with the previous commit, we must call fsck's "finish"
function in order to catch any queued objects for
.gitmodules checks.
This second pass will be able to access any incoming
objects, because we will have exploded them to loose objects
by now.
This isn't quite ideal, because it means that bad objects
may have been written to the object database (and a
subsequent operation could then reference them, even if the
other side doesn't send the objects again). However, this is
sufficient when used with receive.fsckObjects, since those
loose objects will all be placed in a temporary quarantine
area that will get wiped if we find any problems.
Signed-off-by: Jeff King <peff@peff.net>
Now that the internal fsck code is capable of checking
.gitmodules files, we just need to teach its callers to use
the "finish" function to check any queued objects.
With this, we can now catch the malicious case in t7415 with
git-fsck.
Signed-off-by: Jeff King <peff@peff.net>
Because fscking a blob has always been a noop, we didn't
bother passing around the blob data. In preparation for
content-level checks, let's fix up a few things:
1. The fsck_object() function just returns success for any
blob. Let's a noop fsck_blob(), which we can fill in
with actual logic later.
2. The fsck_loose() function in builtin/fsck.c
just threw away blob content after loading it. Let's
hold onto it until after we've called fsck_object().
The easiest way to do this is to just drop the
parse_loose_object() helper entirely. Incidentally,
this also fixes a memory leak: if we successfully
loaded the object data but did not parse it, we would
have left the function without freeing it.
3. When fsck_loose() loads the object data, it
does so with a custom read_loose_object() helper. This
function streams any blobs, regardless of size, under
the assumption that we're only checking the sha1.
Instead, let's actually load blobs smaller than
big_file_threshold, as the normal object-reading
code-paths would do. This lets us fsck small files, and
a NULL return is an indication that the blob was so big
that it needed to be streamed, and we can pass that
information along to fsck_blob().
Signed-off-by: Jeff King <peff@peff.net>
If fsck reports an error, we say only "Error in object".
This isn't quite as bad as it might seem, since the fsck
code would have dumped some errors to stderr already. But it
might help to give a little more context. The earlier output
would not have even mentioned "fsck", and that may be a clue
that the "fsck.*" or "*.fsckObjects" config may be relevant.
Signed-off-by: Jeff King <peff@peff.net>
* jk/submodule-name-verify-fix:
verify_path: disallow symlinks in .gitmodules
update-index: stat updated files earlier
verify_path: drop clever fallthrough
skip_prefix: add icase-insensitive variant
is_{hfs,ntfs}_dotgitmodules: add tests
path: match NTFS short names for more .git files
is_hfs_dotgit: match other .git files
is_ntfs_dotgit: use a size_t for traversing string
submodule-config: verify submodule names as paths
Note that this includes two bits of evil-merge:
- there's a new call to verify_path() that doesn't actually
have a mode available. It should be OK to pass "0" here,
since we're just manipulating the untracked cache, not an
actual index entry.
- the lstat() in builtin/update-index.c:update_one() needs
to be updated to handle the fsmonitor case (without this
it still behaves correctly, but does an unnecessary
lstat).
There are a few reasons it's not a good idea to make
.gitmodules a symlink, including:
1. It won't be portable to systems without symlinks.
2. It may behave inconsistently, since Git may look at
this file in the index or a tree without bothering to
resolve any symbolic links. We don't do this _yet_, but
the config infrastructure is there and it's planned for
the future.
With some clever code, we could make (2) work. And some
people may not care about (1) if they only work on one
platform. But there are a few security reasons to simply
disallow it:
a. A symlinked .gitmodules file may circumvent any fsck
checks of the content.
b. Git may read and write from the on-disk file without
sanity checking the symlink target. So for example, if
you link ".gitmodules" to "../oops" and run "git
submodule add", we'll write to the file "oops" outside
the repository.
Again, both of those are problems that _could_ be solved
with sufficient code, but given the complications in (1) and
(2), we're better off just outlawing it explicitly.
Note the slightly tricky call to verify_path() in
update-index's update_one(). There we may not have a mode if
we're not updating from the filesystem (e.g., we might just
be removing the file). Passing "0" as the mode there works
fine; since it's not a symlink, we'll just skip the extra
checks.
Signed-off-by: Jeff King <peff@peff.net>
In the update_one(), we check verify_path() on the proposed
path before doing anything else. In preparation for having
verify_path() look at the file mode, let's stat the file
earlier, so we can check the mode accurately.
This is made a bit trickier by the fact that this function
only does an lstat in a few code paths (the ones that flow
down through process_path()). So we can speculatively do the
lstat() here and pass the results down, and just use a dummy
mode for cases where we won't actually be updating the index
from the filesystem.
Signed-off-by: Jeff King <peff@peff.net>
Submodule "names" come from the untrusted .gitmodules file,
but we blindly append them to $GIT_DIR/modules to create our
on-disk repo paths. This means you can do bad things by
putting "../" into the name (among other things).
Let's sanity-check these names to avoid building a path that
can be exploited. There are two main decisions:
1. What should the allowed syntax be?
It's tempting to reuse verify_path(), since submodule
names typically come from in-repo paths. But there are
two reasons not to:
a. It's technically more strict than what we need, as
we really care only about breaking out of the
$GIT_DIR/modules/ hierarchy. E.g., having a
submodule named "foo/.git" isn't actually
dangerous, and it's possible that somebody has
manually given such a funny name.
b. Since we'll eventually use this checking logic in
fsck to prevent downstream repositories, it should
be consistent across platforms. Because
verify_path() relies on is_dir_sep(), it wouldn't
block "foo\..\bar" on a non-Windows machine.
2. Where should we enforce it? These days most of the
.gitmodules reads go through submodule-config.c, so
I've put it there in the reading step. That should
cover all of the C code.
We also construct the name for "git submodule add"
inside the git-submodule.sh script. This is probably
not a big deal for security since the name is coming
from the user anyway, but it would be polite to remind
them if the name they pick is invalid (and we need to
expose the name-checker to the shell anyway for our
test scripts).
This patch issues a warning when reading .gitmodules
and just ignores the related config entry completely.
This will generally end up producing a sensible error,
as it works the same as a .gitmodules file which is
missing a submodule entry (so "submodule update" will
barf, but "git clone --recurse-submodules" will print
an error but not abort the clone.
There is one minor oddity, which is that we print the
warning once per malformed config key (since that's how
the config subsystem gives us the entries). So in the
new test, for example, the user would see three
warnings. That's OK, since the intent is that this case
should never come up outside of malicious repositories
(and then it might even benefit the user to see the
message multiple times).
Credit for finding this vulnerability and the proof of
concept from which the test script was adapted goes to
Etienne Stalmans.
Signed-off-by: Jeff King <peff@peff.net>
This fixes a regression introduced in 2e612731b5 (submodule: port
submodule subcommand 'deinit' from shell to C, 2018-01-15), when
handling pathspecs that do not exist gracefully. This restores the
historic behavior of reporting the pathspec as unknown and returning
instead of reporting a bug.
Reported-by: Peter Oberndorfer <kumbayo84@arcor.de>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The transfer.fsckobjects configuration tells "git fetch" to
validate the data and connected-ness of objects in the received
pack; the code to perform this check has been taught about the
narrow clone's convention that missing objects that are reachable
from objects in a pack that came from a promissor remote is OK.
* jt/transfer-fsck-with-promissor:
fetch-pack: do not check links for partial fetch
index-pack: support checking objects but not links
Internal API clean-up to allow write_locked_index() optionally skip
writing the in-core index when it is not modified.
* ma/skip-writing-unchanged-index:
write_locked_index(): add flag to avoid writing unchanged index
In a way similar to how "git tag" learned to honor the pager
setting only in the list mode, "git config" learned to ignore the
pager setting when it is used for setting values (i.e. when the
purpose of the operation is not to "show").
* ma/config-page-only-in-list-mode:
config: change default of `pager.config` to "on"
config: respect `pager.config` in list/get-mode only
t7006: add tests for how git config paginates
The 'self-initialised' variables construct (ie <type> var = var;) has
been used to silence gcc '-W[maybe-]uninitialized' warnings. This has,
unfortunately, caused MSVC to issue 'uninitialized variable' warnings.
Also, using clang static analysis causes complaints about an 'Assigned
value is garbage or undefined'.
There are six such constructs in the current codebase. Only one of the
six causes gcc to issue a '-Wmaybe-uninitialized' warning (which will
be addressed elsewhere). The remaining five 'init-self' gcc workarounds
are noted below, along with the commit which introduced them:
1. builtin/rev-list.c: 'reaches' and 'all', see commit 457f08a030
("git-rev-list: add --bisect-vars option.", 2007-03-21).
2. merge-recursive.c:2064 'mrtree', see commit f120ae2a8e ("merge-
recursive.c: mrtree in merge() is not used before set", 2007-10-29).
3. fast-import.c:3023 'oe', see commit 85c62395b1 ("fast-import: let
importers retrieve blobs", 2010-11-28).
4. fast-import.c:3006 'oe', see commit 28c7b1f7b7 ("fast-import: add a
get-mark command", 2015-07-01).
Remove the 'self-initialised' variable constructs noted above.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index-pack command currently supports the
--check-self-contained-and-connected argument, for internal use only,
that instructs it to only check for broken links and not broken objects.
For partial clones, we need the inverse, so add a --fsck-objects
argument that checks for broken objects and not broken links, also for
internal use only.
This will be used by fetch-pack in a subsequent patch.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach parse-options API an option to help the completion script,
and make use of the mechanism in command line completion.
* nd/parseopt-completion: (45 commits)
completion: more subcommands in _git_notes()
completion: complete --{reuse,reedit}-message= for all notes subcmds
completion: simplify _git_notes
completion: don't set PARSE_OPT_NOCOMPLETE on --rerere-autoupdate
completion: use __gitcomp_builtin in _git_worktree
completion: use __gitcomp_builtin in _git_tag
completion: use __gitcomp_builtin in _git_status
completion: use __gitcomp_builtin in _git_show_branch
completion: use __gitcomp_builtin in _git_rm
completion: use __gitcomp_builtin in _git_revert
completion: use __gitcomp_builtin in _git_reset
completion: use __gitcomp_builtin in _git_replace
remote: force completing --mirror= instead of --mirror
completion: use __gitcomp_builtin in _git_remote
completion: use __gitcomp_builtin in _git_push
completion: use __gitcomp_builtin in _git_pull
completion: use __gitcomp_builtin in _git_notes
completion: use __gitcomp_builtin in _git_name_rev
completion: use __gitcomp_builtin in _git_mv
completion: use __gitcomp_builtin in _git_merge_base
...
"git worktree" learned move and remove subcommands.
* nd/worktree-move:
t2028: fix minor error and issues in newly-added "worktree move" tests
worktree remove: allow it when $GIT_WORK_TREE is already gone
worktree remove: new command
worktree move: refuse to move worktrees with submodules
worktree move: accept destination as directory
worktree move: new command
worktree.c: add update_worktree_location()
worktree.c: add validate_worktree()
"git commit" used to run "gc --auto" near the end, which was lost
when the command was reimplemented in C by mistake.
* ab/gc-auto-in-commit:
commit: run git gc --auto just before the post-commit hook
Threaded "git grep" has been optimized to avoid allocation in code
section that is covered under a mutex.
* rv/grep-cleanup:
grep: simplify grep_oid and grep_file
grep: move grep_source_init outside critical section
"git status" can spend a lot of cycles to compute the relation
between the current branch and its upstream, which can now be
disabled with "--no-ahead-behind" option.
* jh/status-no-ahead-behind:
status: support --no-ahead-behind in long format
status: update short status to respect --no-ahead-behind
status: add --[no-]ahead-behind to status and commit for V2 format.
stat_tracking_info: return +1 when branches not equal