Commit Graph

82983 Commits

Author SHA1 Message Date
Johannes Schindelin
4401b3b7bd mingw: use domain information for default email
When a user is registered in a Windows domain, it is really easy to
obtain the email address. So let's do that.

Suggested by Lutz Roeder.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-24 13:37:17 +02:00
Johannes Schindelin
cbeca6bee2 getpwuid(mingw): provide a better default for the user name
We do have the excellent GetUserInfoEx() function to obtain more
detailed information of the current user (if the user is part of a
Windows domain); Let's use it.

Suggested by Lutz Roeder.

To avoid the cost of loading Secur32.dll (even lazily, loading DLLs
takes a non-neglibile amount of time), we use the established technique
to load DLLs only when, and if, needed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-24 13:37:17 +02:00
Johannes Schindelin
3d59383fed getpwuid(mingw): initialize the structure only once
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-24 13:37:17 +02:00
Johannes Schindelin
8a8c9484c3 Start the merging-rebase to v2.17.1
This commit starts the rebase of 74d8ecfd37 to 6b125952395
2018-05-24 13:37:12 +02:00
Johannes Schindelin
4a158e75a4 fixup! Win32: symlink: add support for symlinks to directories
Fix typo.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-18 13:28:25 +02:00
Johannes Schindelin
915c2bc0ff fixup! mingw: be *very* wary about outside environment changes
Fix typo.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-18 13:28:09 +02:00
Johannes Schindelin
ae246e6327 fixup! Win32: symlink: add support for symlinks to directories
When opening a symbolic link's target, we must take into account that
the symbolic link itself might live in a directory other than the
current one, and that the target may be relative.

Reported by Ricky Roesler.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-18 12:41:10 +02:00
Junio C Hamano
a9693e7806 Git 2.17.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 13:57:51 +09:00
Junio C Hamano
5a8c71da80 Merge branch 'jk/submodule-name-verify-fsck' into maint
* jk/submodule-name-verify-fsck:
  fsck: complain when .gitignore and .gitattributes are symlinks
  fsck: complain when .gitmodules is a symlink
  index-pack: check .gitmodules files with --strict
  unpack-objects: call fsck_finish() after fscking objects
  fsck: call fsck_finish() after fscking objects
  fsck: check .gitmodules content
  fsck: handle promisor objects in .gitmodules check
  fsck: detect gitmodules files
  fsck: actually fsck blob data
  fsck: simplify ".git" check
  index-pack: make fsck error message more specific
2018-05-18 13:47:28 +09:00
Junio C Hamano
47aa802379 Sync with Git 2.16.4
* maint-2.16:
  Git 2.16.4
  Git 2.15.2
  Git 2.14.4
  Git 2.13.7
  verify_path: disallow symlinks in .gitmodules, etc
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  is_ntfs_dotgit: match other .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths
2018-05-18 13:46:53 +09:00
Junio C Hamano
558c52bf48 Git 2.16.4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 13:38:19 +09:00
Junio C Hamano
95adc822e6 Sync with Git 2.15.2
* maint-2.15:
  Git 2.15.2
  Git 2.14.4
  Git 2.13.7
  verify_path: disallow symlinks in .gitmodules, etc
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  is_ntfs_dotgit: match other .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths
2018-05-18 13:36:44 +09:00
Junio C Hamano
aabcf7eeb8 Git 2.15.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 13:29:07 +09:00
Junio C Hamano
4375a6b2d9 Sync with Git 2.14.4
* maint-2.14:
  Git 2.14.4
  Git 2.13.7
  verify_path: disallow symlinks in .gitmodules, etc
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  is_ntfs_dotgit: match other .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths
2018-05-18 13:26:51 +09:00
Junio C Hamano
37ad15fded Git 2.14.4 2018-05-18 13:09:22 +09:00
Junio C Hamano
13f57c5daa Sync with Git 2.13.7
* maint-2.13:
  Git 2.13.7
  verify_path: disallow symlinks in .gitmodules, etc
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  is_ntfs_dotgit: match other .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths
2018-05-18 12:52:09 +09:00
Junio C Hamano
fd5a7c532f Git 2.13.7
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 12:49:35 +09:00
Junio C Hamano
0d084b175e Merge branch 'jk/submodule-name-verify-fix' into maint-2.13
* jk/submodule-name-verify-fix:
  verify_path: disallow symlinks in .gitmodules, etc
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  is_ntfs_dotgit: match other .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths
2018-05-18 12:35:02 +09:00
Jeff King
2baa638590 fsck: complain when .gitignore and .gitattributes are symlinks
This case is already forbidden by verify_path(), so let's
check it in fsck. It's easier to handle than .gitmodules,
because we don't care about checking the blob content. This
is really just about whether the name and mode for the tree
entry are valid.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 18:11:33 -07:00
Jeff King
425db067d2 fsck: complain when .gitmodules is a symlink
We've recently forbidden .gitmodules to be a symlink in
verify_path(). And it's an easy way to circumvent our fsck
checks for .gitmodules content. So let's complain when we
see it.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 18:09:36 -07:00
Jeff King
2a22dac050 index-pack: check .gitmodules files with --strict
Now that the internal fsck code has all of the plumbing we
need, we can start checking incoming .gitmodules files.
Naively, it seems like we would just need to add a call to
fsck_finish() after we've processed all of the objects. And
that would be enough to cover the initial test included
here. But there are two extra bits:

  1. We currently don't bother calling fsck_object() at all
     for blobs, since it has traditionally been a noop. We'd
     actually catch these blobs in fsck_finish() at the end,
     but it's more efficient to check them when we already
     have the object loaded in memory.

  2. The second pass done by fsck_finish() needs to access
     the objects, but we're actually indexing the pack in
     this process. In theory we could give the fsck code a
     special callback for accessing the in-pack data, but
     it's actually quite tricky:

       a. We don't have an internal efficient index mapping
	  oids to packfile offsets. We only generate it on
	  the fly as part of writing out the .idx file.

       b. We'd still have to reconstruct deltas, which means
          we'd basically have to replicate all of the
	  reading logic in packfile.c.

     Instead, let's avoid running fsck_finish() until after
     we've written out the .idx file, and then just add it
     to our internal packed_git list.

     This does mean that the objects are "in the repository"
     before we finish our fsck checks. But unpack-objects
     already exhibits this same behavior, and it's an
     acceptable tradeoff here for the same reason: the
     quarantine mechanism means that pushes will be
     fully protected.

In addition to a basic push test in t7415, we add a sneaky
pack that reverses the usual object order in the pack,
requiring that index-pack access the tree and blob during
the "finish" step.

This already works for unpack-objects (since it will have
written out loose objects), but we'll check it with this
sneaky pack for good measure.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 18:04:56 -07:00
Jeff King
7fbb4d553f unpack-objects: call fsck_finish() after fscking objects
As with the previous commit, we must call fsck's "finish"
function in order to catch any queued objects for
.gitmodules checks.

This second pass will be able to access any incoming
objects, because we will have exploded them to loose objects
by now.

This isn't quite ideal, because it means that bad objects
may have been written to the object database (and a
subsequent operation could then reference them, even if the
other side doesn't send the objects again). However, this is
sufficient when used with receive.fsckObjects, since those
loose objects will all be placed in a temporary quarantine
area that will get wiped if we find any problems.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 18:04:56 -07:00
Jeff King
6db6dbfb66 fsck: call fsck_finish() after fscking objects
Now that the internal fsck code is capable of checking
.gitmodules files, we just need to teach its callers to use
the "finish" function to check any queued objects.

With this, we can now catch the malicious case in t7415 with
git-fsck.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 18:04:56 -07:00
Jeff King
d93d55d482 fsck: check .gitmodules content
This patch detects and blocks submodule names which do not
match the policy set forth in submodule-config. These should
already be caught by the submodule code itself, but putting
the check here means that newer versions of Git can protect
older ones from malicious entries (e.g., a server with
receive.fsckObjects will block the objects, protecting
clients which fetch from it).

As a side effect, this means fsck will also complain about
.gitmodules files that cannot be parsed (or were larger than
core.bigFileThreshold).

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 18:04:50 -07:00
Jeff King
38433731ad fsck: handle promisor objects in .gitmodules check
If we have a tree that points to a .gitmodules blob but
don't have that blob, we can't check its contents. This
produces an fsck error when we encounter it.

But in the case of a promisor object, this absence is
expected, and we must not complain.  Note that this can
technically circumvent our transfer.fsckObjects check.
Imagine a client fetches a tree, but not the matching
.gitmodules blob. An fsck of the incoming objects will show
that we don't have enough information. Later, we do fetch
the actual blob. But we have no idea that it's a .gitmodules
file.

The only ways to get around this would be to re-scan all of
the existing trees whenever new ones enter (which is
expensive), or to somehow persist the gitmodules_found set
between fsck runs (which is complicated).

In practice, it's probably OK to ignore the problem. Any
repository which has all of the objects (including the one
serving the promisor packs) can perform the checks. Since
promisor packs are inherently about a hierarchical topology
in which clients rely on upstream repositories, those
upstream repositories can protect all of their downstream
clients from broken objects.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 18:03:21 -07:00
Jeff King
4a9e4c2e71 fsck: detect gitmodules files
In preparation for performing fsck checks on .gitmodules
files, this commit plumbs in the actual detection of the
files. Note that unlike most other fsck checks, this cannot
be a property of a single object: we must know that the
object is found at a ".gitmodules" path at the root tree of
a commit.

Since the fsck code only sees one object at a time, we have
to mark the related objects to fit the puzzle together. When
we see a commit we mark its tree as a root tree, and when
we see a root tree with a .gitmodules file, we mark the
corresponding blob to be checked.

In an ideal world, we'd check the objects in topological
order: commits followed by trees followed by blobs. In that
case we can avoid ever loading an object twice, since all
markings would be complete by the time we get to the marked
objects. And indeed, if we are checking a single packfile,
this is the order in which Git will generally write the
objects. But we can't count on that:

  1. git-fsck may show us the objects in arbitrary order
     (loose objects are fed in sha1 order, but we may also
     have multiple packs, and we process each pack fully in
     sequence).

  2. The type ordering is just what git-pack-objects happens
     to write now. The pack format does not require a
     specific order, and it's possible that future versions
     of Git (or a custom version trying to fool official
     Git's fsck checks!) may order it differently.

  3. We may not even be fscking all of the relevant objects
     at once. Consider pushing with transfer.fsckObjects,
     where one push adds a blob at path "foo", and then a
     second push adds the same blob at path ".gitmodules".
     The blob is not part of the second push at all, but we
     need to mark and check it.

So in the general case, we need to make up to three passes
over the objects: once to make sure we've seen all commits,
then once to cover any trees we might have missed, and then
a final pass to cover any .gitmodules blobs we found in the
second pass.

We can simplify things a bit by loosening the requirement
that we find .gitmodules only at root trees. Technically
a file like "subdir/.gitmodules" is not parsed by Git, but
it's not unreasonable for us to declare that Git is aware of
all ".gitmodules" files and make them eligible for checking.
That lets us drop the root-tree requirement, which
eliminates one pass entirely. And it makes our worst case
much better: instead of potentially queueing every root tree
to be re-examined, the worst case is that we queue each
unique .gitmodules blob for a second look.

This patch just adds the boilerplate to find .gitmodules
files. The actual content checks will come in a subsequent
commit.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 18:03:19 -07:00
Jeff King
7075cedbcc fsck: actually fsck blob data
Because fscking a blob has always been a noop, we didn't
bother passing around the blob data. In preparation for
content-level checks, let's fix up a few things:

  1. The fsck_object() function just returns success for any
     blob. Let's a noop fsck_blob(), which we can fill in
     with actual logic later.

  2. The fsck_loose() function in builtin/fsck.c
     just threw away blob content after loading it. Let's
     hold onto it until after we've called fsck_object().

     The easiest way to do this is to just drop the
     parse_loose_object() helper entirely. Incidentally,
     this also fixes a memory leak: if we successfully
     loaded the object data but did not parse it, we would
     have left the function without freeing it.

  3. When fsck_loose() loads the object data, it
     does so with a custom read_loose_object() helper. This
     function streams any blobs, regardless of size, under
     the assumption that we're only checking the sha1.

     Instead, let's actually load blobs smaller than
     big_file_threshold, as the normal object-reading
     code-paths would do. This lets us fsck small files, and
     a NULL return is an indication that the blob was so big
     that it needed to be streamed, and we can pass that
     information along to fsck_blob().

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 17:59:40 -07:00
Johannes Schindelin
b911995991 Merge branch 'colorize-push-errors'
To help users discern large chunks of white text (when the push
succeeds) from large chunks of white text (when the push fails), let's
add some color to the latter.

This closes https://github.com/git-for-windows/git/pull/1429 and fixes
https://github.com/git-for-windows/git/issues/1422

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-17 23:31:58 +02:00
Johannes Schindelin
e6c415e4e1 Document the new color.* settings to colorize push errors/hints
Let's make it easier for users to find out how to customize these colors.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-17 23:22:37 +02:00
Johannes Schindelin
0a99d05884 Add a test to verify that push errors are colorful
This actually only tests whether the push errors/hints are colored if
the respective color.* config settings are `always`, but in the regular
case they default to `auto` (in which case we color the messages when
stderr is connected to an interactive terminal), therefore these tests
should suffice.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-17 23:22:37 +02:00
Ryan Dammrose
6379dc63ad push: colorize errors
This is an attempt to resolve an issue I experience with people that are
new to Git -- especially colleagues in a team setting -- where they miss
that their push to a remote location failed because the failure and
success both return a block of white text.

An example is if I push something to a remote repository and then a
colleague attempts to push to the same remote repository and the push
fails because it requires them to pull first, but they don't notice
because a success and failure both return a block of white text. They
then continue about their business, thinking it has been successfully
pushed.

This patch colorizes the errors and hints (in red and yellow,
respectively) so whenever there is a failure when pushing to a remote
repository that fails, it is more noticeable.

[jes: fixed a couple bugs, added the color.{advice,push,transport}
settings, refactored to use want_color_stderr().]

Signed-off-by: Ryan Dammrose ryandammrose@gmail.com
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-17 23:21:58 +02:00
Johannes Schindelin
3823662341 color: introduce support for colorizing stderr
So far, we only ever asked whether stdout wants to be colorful. In the
upcoming patches, we will want to make push errors more prominent, which
are printed to stderr, though.

So let's refactor the want_color() function into a want_color_fd()
function (which expects to be called with fd == 1 or fd == 2 for stdout
and stderr, respectively), and then define the macro `want_color()` to
use the want_color_fd() function.

And then also add a macro `want_color_stderr()`, for convenience and
for documentation.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-17 23:11:26 +02:00
Johannes Schindelin
6c1fc6a76b Merge branch 'dj/runtime-prefix'
Two more commits made it into the dj/runtime-prefix branch before being
merged into core Git's `master`. Let's take those two, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-17 23:04:49 +02:00
Jonathan Nieder
8ead2533e8 Makefile: quote $INSTLIBDIR when passing it to sed
f6a0ad4b (Makefile: generate Perl header from template file,
2018-04-10) moved code for generating the 'use lib' lines at the top
of perl scripts from the $(SCRIPT_PERL_GEN) rule to a separate
GIT-PERL-HEADER rule.

This rule first populates INSTLIBDIR and then substitutes it into the
GIT-PERL-HEADER using sed:

	INSTLIBDIR=... something ...
	sed -e 's=@@INSTLIBDIR@@='$$INSTLIBDIR'=g' $< > $@

Because $INSTLIBDIR is not surrounded by double quotes, the shell
splits it at each space, causing errors if INSTLIBDIR contains an $IFS
character:

 sed: 1: "s=@@INSTLIBDIR@@=/usr/l ...": unescaped newline inside substitute pattern

Add back the missing double-quotes to make it work again.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-17 23:00:45 +02:00
Jonathan Nieder
27b491972e Makefile: remove unused @@PERLLIBDIR@@ substitution variable
Junio noticed that this variable is not quoted correctly when it is
passed to sed.  As a shell-quoted string, it should be inside
single-quotes like $(perllibdir_relative_SQ), not outside them like
$INSTLIBDIR.

In fact, this substitution variable is not used.  Simplify by removing
it.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-17 23:00:45 +02:00
Jeff King
60d36c3717 fsck: simplify ".git" check
There's no need for us to manually check for ".git"; it's a
subset of the other filesystem-specific tests. Dropping it
makes our code slightly shorter. More importantly, the
existing code may make a reader wonder why ".GIT" is not
covered here, and whether that is a bug (it isn't, as it's
also covered in the filesystem-specific tests).

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 12:17:32 -07:00
Jeff King
8ae9fe61a3 index-pack: make fsck error message more specific
If fsck reports an error, we say only "Error in object".
This isn't quite as bad as it might seem, since the fsck
code would have dumped some errors to stderr already. But it
might help to give a little more context. The earlier output
would not have even mentioned "fsck", and that may be a clue
that the "fsck.*" or "*.fsckObjects" config may be relevant.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 12:17:32 -07:00
Jeff King
97f0d2bce3 Merge branch 'jk/submodule-name-verify-fix' into jk/submodule-name-verify-fsck
* jk/submodule-name-verify-fix:
  verify_path: disallow symlinks in .gitmodules, etc
  update-index: stat updated files earlier
  verify_path: drop clever fallthrough
  skip_prefix: add icase-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  path: match NTFS short names for more .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths

Note that this includes two bits of evil-merge:

 - there's a new call to verify_path() that doesn't actually
   have a mode available. It should be OK to pass "0" here,
   since we're just manipulating the untracked cache, not an
   actual index entry.

 - the lstat() in builtin/update-index.c:update_one() needs
   to be updated to handle the fsmonitor case (without this
   it still behaves correctly, but does an unnecessary
   lstat).
2018-05-17 12:15:47 -07:00
Jeff King
8423fd8281 verify_path: disallow symlinks in .gitmodules, etc
There are a few reasons it's not a good idea to make
.git* files symlinks, including:

  1. It won't be portable to systems without symlinks.

  2. It may behave inconsistently, since Git internally may
     look at these files from the index or a tree without
     bothering to resolve any symbolic links. So it may work
     in some settings (where we read from the filesystem)
     but not in others).

With some clever code, we could make (2) work. And some
people may not care about (1) if they only work on one
platform. But there are a few security reasons to simply
disallow symlinked meta-files:

  a. A symlinked .gitmodules file may circumvent any fsck
     checks of the content.

  b. Git may read and write from the on-disk file without
     sanity checking the symlink target. So for example, if
     you link ".gitmodules" to "../oops" and run "git
     submodule add", we'll write to the file "oops" outside
     the repository.

Again, both of those are problems that _could_ be solved
with sufficient code, but given the current inconsistent
behavior and unportability, we're better off just outlawing
it explicitly.

We'll give the same treatment to .gitmodules, .gitignore,
and .gitattributes. The latter two cannot be used to write
outside the repository (we write them only as part of a
checkout, where we are careful not to follow any symlinks).
But they can still cause a "git clone && git log"
combination to read arbitrary files outside the filesystem.
There's _probably_ nothing too harmful you can do with that,
but it seems questionable (and anyway, they suffer from the
same portability and consistency problems).

Note the slightly tricky call to verify_path() in
update-index's update_one(). There we may not have a mode if
we're not updating from the filesystem (e.g., we might just
be removing the file). Passing "0" as the mode there works
fine; since it's not a symlink, we'll just skip the extra
checks.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 11:22:05 -07:00
Jeff King
3f97adce0d update-index: stat updated files earlier
In the update_one(), we check verify_path() on the proposed
path before doing anything else. In preparation for having
verify_path() look at the file mode, let's stat the file
earlier, so we can check the mode accurately.

This is made a bit trickier by the fact that this function
only does an lstat in a few code paths (the ones that flow
down through process_path()). So we can speculatively do the
lstat() here and pass the results down, and just use a dummy
mode for cases where we won't actually be updating the index
from the filesystem.
2018-05-17 11:22:05 -07:00
Jeff King
37d2e6eda1 verify_dotfile: mention case-insensitivity in comment
We're more restrictive than we need to be in matching ".GIT"
on case-sensitive filesystems; let's make a note that this
is intentional.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 11:22:05 -07:00
Jeff King
88bf2c8207 verify_path: drop clever fallthrough
We check ".git" and ".." in the same switch statement, and
fall through the cases to share the end-of-component check.
While this saves us a line or two, it makes modifying the
function much harder. Let's just write it out.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 11:22:05 -07:00
Jeff King
7632cada3d skip_prefix: add case-insensitive variant
We have the convenient skip_prefix() helper, but if you want
to do case-insensitive matching, you're stuck doing it by
hand. We could add an extra parameter to the function to
let callers ask for this, but the function is small and
somewhat performance-critical. Let's just re-implement it
for the case-insensitive version.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 11:22:05 -07:00
Johannes Schindelin
dec1e5f594 is_{hfs,ntfs}_dotgitmodules: add tests
This tests primarily for NTFS issues, but also adds one example of an
HFS+ issue.

Thanks go to Congyi Wu for coming up with the list of examples where
NTFS would possibly equate the filename with `.gitmodules`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 11:22:05 -07:00
Johannes Schindelin
e6f78689fe is_ntfs_dotgit: match other .git files
When we started to catch NTFS short names that clash with .git, we only
looked for GIT~1. This is sufficient because we only ever clone into an
empty directory, so .git is guaranteed to be the first subdirectory or
file in that directory.

However, even with a fresh clone, .gitmodules is *not* necessarily the
first file to be written that would want the NTFS short name GITMOD~1: a
malicious repository can add .gitmodul0000 and friends, which sorts
before `.gitmodules` and is therefore checked out *first*. For that
reason, we have to test not only for ~1 short names, but for others,
too.

It's hard to just adapt the existing checks in is_ntfs_dotgit(): since
Windows 2000 (i.e., in all Windows versions still supported by Git),
NTFS short names are only generated in the <prefix>~<number> form up to
number 4. After that, a *different* prefix is used, calculated from the
long file name using an undocumented, but stable algorithm.

For example, the short name of .gitmodules would be GITMOD~1, but if it
is taken, and all of ~2, ~3 and ~4 are taken, too, the short name
GI7EBA~1 will be used. From there, collisions are handled by
incrementing the number, shortening the prefix as needed (until ~9999999
is reached, in which case NTFS will not allow the file to be created).

We'd also want to handle .gitignore and .gitattributes, which suffer
from a similar problem, using the fall-back short names GI250A~1 and
GI7D29~1, respectively.

To accommodate for that, we could reimplement the hashing algorithm, but
it is just safer and simpler to provide the known prefixes. This
algorithm has been reverse-engineered and described at
https://usn.pw/blog/gen/2015/06/09/filenames/, which is defunct but
still available via https://web.archive.org/.

These can be recomputed by running the following Perl script:

-- snip --
#! /usr/bin/perl
use warnings;
use strict;

# Does *not* work for non-ASCII file names
sub compute_short_name_hash ($) {
        my $checksum = 0;
        foreach (split('', $_[0])) {
                $checksum = ($checksum * 0x25 + ord($_)) & 0xffff;
        }

        $checksum = ($checksum * 314159269) & 0xffffffff;
        $checksum = 1 + (~$checksum & 0x7fffffff) if ($checksum & 0x80000000);
        $checksum -= (($checksum * 1152921497) >> 60) * 1000000007;

        return scalar reverse sprintf("%x", $checksum & 0xffff);
}

print compute_short_name_hash($ARGV[0]);
-- snap --

E.g., running that with the argument ".gitignore" will
result in "250a" (which then becomes "gi250a" in the code).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 11:22:03 -07:00
Jeff King
797ea0ee7c is_hfs_dotgit: match other .git files
Both verify_path() and fsck match ".git", ".GIT", and other
variants specific to HFS+. Let's allow matching other
special files like ".gitmodules", which we'll later use to
enforce extra restrictions via verify_path() and fsck.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 11:10:03 -07:00
Jeff King
992318bb1a is_ntfs_dotgit: use a size_t for traversing string
We walk through the "name" string using an int, which can
wrap to a negative value and cause us to read random memory
before our array (e.g., by creating a tree with a name >2GB,
since "int" is still 32 bits even on most 64-bit platforms).
Worse, this is easy to trigger during the fsck_tree() check,
which is supposed to be protecting us from malicious
garbage.

Note one bit of trickiness in the existing code: we
sometimes assign -1 to "len" at the end of the loop, and
then rely on the "len++" in the for-loop's increment to take
it back to 0. This is still legal with a size_t, since
assigning -1 will turn into SIZE_MAX, which then wraps
around to 0 on increment.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 10:37:47 -07:00
Jeff King
afd4634d11 submodule-config: verify submodule names as paths
Submodule "names" come from the untrusted .gitmodules file,
but we blindly append them to $GIT_DIR/modules to create our
on-disk repo paths. This means you can do bad things by
putting "../" into the name (among other things).

Let's sanity-check these names to avoid building a path that
can be exploited. There are two main decisions:

  1. What should the allowed syntax be?

     It's tempting to reuse verify_path(), since submodule
     names typically come from in-repo paths. But there are
     two reasons not to:

       a. It's technically more strict than what we need, as
          we really care only about breaking out of the
          $GIT_DIR/modules/ hierarchy.  E.g., having a
          submodule named "foo/.git" isn't actually
          dangerous, and it's possible that somebody has
          manually given such a funny name.

       b. Since we'll eventually use this checking logic in
          fsck to prevent downstream repositories, it should
          be consistent across platforms. Because
          verify_path() relies on is_dir_sep(), it wouldn't
          block "foo\..\bar" on a non-Windows machine.

  2. Where should we enforce it? These days most of the
     .gitmodules reads go through submodule-config.c, so
     I've put it there in the reading step. That should
     cover all of the C code.

     We also construct the name for "git submodule add"
     inside the git-submodule.sh script. This is probably
     not a big deal for security since the name is coming
     from the user anyway, but it would be polite to remind
     them if the name they pick is invalid (and we need to
     expose the name-checker to the shell anyway for our
     test scripts).

     This patch issues a warning when reading .gitmodules
     and just ignores the related config entry completely.
     This will generally end up producing a sensible error,
     as it works the same as a .gitmodules file which is
     missing a submodule entry (so "submodule update" will
     barf, but "git clone --recurse-submodules" will print
     an error but not abort the clone.

     There is one minor oddity, which is that we print the
     warning once per malformed config key (since that's how
     the config subsystem gives us the entries). So in the
     new test, for example, the user would see three
     warnings. That's OK, since the intent is that this case
     should never come up outside of malicious repositories
     (and then it might even benefit the user to see the
     message multiple times).

Credit for finding this vulnerability and the proof of
concept from which the test script was adapted goes to
Etienne Stalmans.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-17 10:36:41 -07:00
Johannes Schindelin
aba71d8cd6 fixup! README.md: Add a Windows-specific preamble
Add a build badge, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-16 20:12:27 +02:00
Johannes Schindelin
33461f1d44 Merge pull request #1679 from telezhnaya/win
vcxproj: change build logic
2018-05-15 20:30:20 +02:00