The 'git sparse-checkout disable' subcommand returns a user to a
full working directory. The old process for doing this required
updating the sparse-checkout file with the "/*" pattern and then
updating the working directory with core.sparseCheckout enabled.
Finally, the sparse-checkout file could be removed and the config
setting disabled.
However, it is valuable to keep a user's sparse-checkout file
intact so they can re-enable the sparse-checkout they previously
used with 'git sparse-checkout init'. This is now possible with
the in-process mechanism for updating the working directory.
Reported-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout builtin used 'git read-tree -mu HEAD' to update the
skip-worktree bits in the index and to update the working directory.
This extra process is overly complex, and prone to failure. It also
requires that we write our changes to the sparse-checkout file before
trying to update the index.
Remove this extra process call by creating a direct call to
unpack_trees() in the same way 'git read-tree -mu HEAD' does. In
addition, provide an in-memory list of patterns so we can avoid
reading from the sparse-checkout file. This allows us to test a
proposed change to the file before writing to it.
An earlier version of this patch included a bug when the 'set' command
failed due to the "Sparse checkout leaves no entry on working directory"
error. It would not rollback the index.lock file, so the replay of the
old sparse-checkout specification would fail. A test in t1091 now
covers that scenario.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a user provides folders A/ and A/B/ for inclusion in a cone-mode
sparse-checkout file, the parsing logic will notice that A/ appears
both as a "parent" type pattern and as a "recursive" type pattern.
This is unexpected and hence will complain via a warning and revert
to the old logic for checking sparse-checkout patterns.
Prevent this from happening accidentally by sanitizing the folders
for this type of inclusion in the 'git sparse-checkout' builtin.
This happens in two ways:
1. Do not include any parent patterns that also appear as recursive
patterns.
2. Do not include any recursive patterns deeper than other recursive
patterns.
In order to minimize duplicate code for scanning parents, create
hashmap_contains_parent() method. It takes a strbuf buffer to
avoid reallocating a buffer when calling in a tight loop.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To make the cone pattern set easy to use, update the behavior of
'git sparse-checkout (init|set)'.
Add '--cone' flag to 'git sparse-checkout init' to set the config
option 'core.sparseCheckoutCone=true'.
When running 'git sparse-checkout set' in cone mode, a user only
needs to supply a list of recursive folder matches. Git will
automatically add the necessary parent matches for the leading
directories.
When testing 'git sparse-checkout set' in cone mode, check the
error stream to ensure we do not see any errors. Specifically,
we want to avoid the warning that the patterns do not match
the cone-mode patterns.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The parent and recursive patterns allowed by the "cone mode"
option in sparse-checkout are restrictive enough that we
can avoid using the regex parsing. Everything is based on
prefix matches, so we can use hashsets to store the prefixes
from the sparse-checkout file. When checking a path, we can
strip path entries from the path and check the hashset for
an exact match.
As a test, I created a cone-mode sparse-checkout file for the
Linux repository that actually includes every file. This was
constructed by taking every folder in the Linux repo and creating
the pattern pairs here:
/$folder/
!/$folder/*/
This resulted in a sparse-checkout file sith 8,296 patterns.
Running 'git read-tree -mu HEAD' on this file had the following
performance:
core.sparseCheckout=false: 0.21 s (0.00 s)
core.sparseCheckout=true: 3.75 s (3.50 s)
core.sparseCheckoutCone=true: 0.23 s (0.01 s)
The times in parentheses above correspond to the time spent
in the first clear_ce_flags() call, according to the trace2
performance traces.
While this example is contrived, it demonstrates how these
patterns can slow the sparse-checkout feature.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout feature can have quadratic performance as
the number of patterns and number of entries in the index grow.
If there are 1,000 patterns and 1,000,000 entries, this time can
be very significant.
Create a new Boolean config option, core.sparseCheckoutCone, to
indicate that we expect the sparse-checkout file to contain a
more limited set of patterns. This is a separate config setting
from core.sparseCheckout to avoid breaking older clients by
introducing a tri-state option.
The config option does nothing right now, but will be expanded
upon in a later commit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The instructions for disabling a sparse-checkout to a full
working directory are complicated and non-intuitive. Add a
subcommand, 'git sparse-checkout disable', to perform those
steps for the user.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout set' subcommand takes a list of patterns
and places them in the sparse-checkout file. Then, it updates the
working directory to match those patterns. For a large list of
patterns, the command-line call can get very cumbersome.
Add a '--stdin' option to instead read patterns over standard in.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout set' subcommand takes a list of patterns
as arguments and writes them to the sparse-checkout file. Then, it
updates the working directory using 'git read-tree -mu HEAD'.
The 'set' subcommand will replace the entire contents of the
sparse-checkout file. The write_patterns_and_update() method is
extracted from cmd_sparse_checkout() to make it easier to implement
'add' and/or 'remove' subcommands in the future.
If the core.sparseCheckout config setting is disabled, then enable
the config setting in the worktree config. If we set the config
this way and the sparse-checkout fails, then re-disable the config
setting.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When someone wants to clone a large repository, but plans to work
using a sparse-checkout file, they either need to do a full
checkout first and then reduce the patterns they included, or
clone with --no-checkout, set up their patterns, and then run
a checkout manually. This requires knowing a lot about the repo
shape and how sparse-checkout works.
Add a new '--sparse' option to 'git clone' that initializes the
sparse-checkout file to include the following patterns:
/*
!/*/
These patterns include every file in the root directory, but
no directories. This allows a repo to include files like a
README or a bootstrapping script to grow enlistments from that
point.
During the 'git sparse-checkout init' call, we must first look
to see if HEAD is valid, since 'git clone' does not have a valid
HEAD at the point where it initializes the sparse-checkout. The
following checkout within the clone command will create the HEAD
ref and update the working directory correctly.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Getting started with a sparse-checkout file can be daunting. Help
users start their sparse enlistment using 'git sparse-checkout init'.
This will set 'core.sparseCheckout=true' in their config, write
an initial set of patterns to the sparse-checkout file, and update
their working directory.
Make sure to use the `extensions.worktreeConfig` setting and write
the sparse checkout config to the worktree-specific config file.
This avoids confusing interactions with other worktrees.
The use of running another process for 'git read-tree' is sub-
optimal. This will be removed in a later change.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout feature is mostly hidden to users, as its
only documentation is supplementary information in the docs for
'git read-tree'. In addition, users need to know how to edit the
.git/info/sparse-checkout file with the right patterns, then run
the appropriate 'git read-tree -mu HEAD' command. Keeping the
working directory in sync with the sparse-checkout file requires
care.
Begin an effort to make the sparse-checkout feature a porcelain
feature by creating a new 'git sparse-checkout' builtin. This
builtin will be the preferred mechanism for manipulating the
sparse-checkout file and syncing the working directory.
The documentation provided is adapted from the "git read-tree"
documentation with a few edits for clarity in the new context.
Extra sections are added to hint toward a future change to
a more restricted pattern set.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index might be aware that a file hasn't modified via fsmonitor, but
unpack-trees did not pay attention to it and checked via ie_match_stat
which can be inefficient on certain filesystems. This significantly slows
down commands that run oneway_merge, like checkout and reset --hard.
This patch makes oneway_merge check whether a file is considered
unchanged through fsmonitor and skips ie_match_stat on it. unpack-trees
also now correctly copies over fsmonitor validity state from the source
index. Finally, for correctness, we force a refresh of fsmonitor state in
tweak_fsmonitor.
After this change, commands like stash (that use reset --hard
internally) go from 8s or more to ~2s on a 250k file repository on a
mac.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Utsav Shah <utsav@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code style for tests is to have statements on their own line if
possible. Move the `then` onto its own line so that it conforms with the
test style.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, if a git command fails in an unexpected way, such as a
segfault, it will be masked and ignored. Replace the ! with
test_must_fail so that only expected failures pass.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous patches, the mechanical application of changes left some
duplicate statements in the test case which were not strictly incorrect
but were redundant and possibly misleading. Remove these duplicate
statements so that it is clear that the intent behind the tests are that
the content of the file stays the same throughout the whole test case.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently have many instances of `test <line> = $(cat <file>)` and
`test $(cat <file>) = <line>`. In the case where this fails, it will be
difficult for a developer to debug since the output will be masked.
Replace these instances with invocations of test_cmp().
This change was done with the following GNU sed expressions:
s/\(\s*\)test \([^=]*\)= "$(cat \([^)]*\))"/\1echo \2>expect \&\&\n\1test_cmp expect \3/
s/\(\s*\)test "$(cat \([^)]*\))" = \([^&]*\)\( &&\)\?$/\1echo \3 >expect \&\&\n\1test_cmp expect \2\4/
A future patch will clean up situations where we have multiple duplicate
statements within a test case. This is done to keep this patch purely
mechanical.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, if the invocation of git failed, it would be masked by the pipe
since only the return code of the last element of a pipe is used.
Rewrite the test to put the git command on its own line so its return
code is not masked.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In case an invocation of a git command fails within the command
substitution, the failure will be masked. Replace the command
substitution with a file-redirection and a call to test_cmp.
This change was done with the following GNU sed expressions:
s/\(\s*\)test \([^ ]*\) = "$(\(git [^)]*\))"/\1echo \2 >expect \&\&\n\1\3 >actual \&\&\n\1test_cmp expect actual/
s/\(\s*\)test "$(\(git [^)]*\))" = \([^ ]*\)/\1echo \3 >expect \&\&\n\1\2 >actual \&\&\n\1test_cmp expect actual/
A future patch will clean up situations where we have multiple duplicate
statements within a test case. This is done to keep this patch purely
mechanical.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In case an invocation of `git rev-list` fails within the command
substitution, the failure will be masked. Remove the command
substitution and use test_cmp_rev() so that failures can be discovered.
This change was done with the following sed expressions:
s/test "$(git rev-parse.* \([^)]*\))" = "$(git rev-parse \([^)]*\))"/test_cmp_rev \1 \2/
s/test \([^ ]*\) = "$(git rev-parse.* \([^)]*\))"/test_cmp_rev \1 \2/
s/test "$(git rev-parse.* \([^)]*\))" != "$(git rev-parse.* \([^)]*\))"/test_cmp_rev ! \1 \2/
s/test \([^ ]*\) != "$(git rev-parse.* \([^)]*\))"/test_cmp_rev ! \1 \2/
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When wrapping a git command in a command substitution within another
command, we throw away the git command's exit code. In case the git
command fails, we would like to know about it rather than the failure
being silent. Extract git commands so that their exit codes are not
lost.
Instead of using `test -n` or `test -z`, replace them respectively with
invocations of test_file_not_empty() and test_must_be_empty() so that we
get better debugging information in the case of a failure.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of rolling our own functionality to test the number of lines a
command outputs, use test_line_count() which provides better debugging
information in the case of a failure.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The style for tests in Git is to have the redirect operator attached to
the filename with no spaces. Fix test cases where this is not the case.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although `test -f` has the same functionality as test_path_is_file(), in
the case where test_path_is_file() fails, we get much better debugging
information.
Replace `test -f` with test_path_is_file() so that future developers
will have a better experience debugging these test cases.
Also, in the case of `! test -f`, not only should that path not be a
file, it shouldn't exist at all so replace it with
test_path_is_missing().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We were using a redirection operator to feed input into sed. However,
since sed is capable of opening its own files, make sed open its own
files instead of redirecting input into it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The usual convention is for test case names to be written between
single-quotes. Change all double-quoted test case names to single-quotes
except for two test case names that use variables within.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the test style by removing leading and trailing empty lines
within test cases. Also, reformat multi-line subshells to conform to the
existing style.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the case where we are using test_cmp_rev() to report not-equals, we
write `! test_cmp_rev`. However, since test_cmp_rev() contains
r1=$(git rev-parse --verify "$1") &&
r2=$(git rev-parse --verify "$2") &&
`! test_cmp_rev` will succeed if any of the rev-parses fail. This
behavior is not desired. We want the rev-parses to _always_ be
successful.
Rewrite test_cmp_rev() to optionally accept "!" as the first argument to
do a not-equals comparison. Rewrite `! test_cmp_rev` to `test_cmp_rev !`
in all tests to take advantage of this new functionality.
Also, rewrite the rev-parse logic to end with a `|| return 1` instead of
&&-chaining into the rev-comparison logic. This makes it obvious to
future readers that we explicitly intend on returning early if either of
the rev-parses fail.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to POSIX enhancement request '0000767: Add built-in
"local"'[1],
dash only allows one variable in a local definition; it permits
assignment though it doesn't document that clearly.
however, this isn't true since t0000 still passes with this patch
applied on dash 0.5.10.2. Needless to say, since `local` isn't POSIX
standardized, it is not exactly clear what `local` entails on different
versions of different shells.
We currently already have many instances of multiple local assignments
in our codebase. Ensure that this is actually supported by explicitly
testing that it is sane.
[1]: http://austingroupbugs.net/bug_view_page.php?bug_id=767
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since format-patch accepts `--[no-]notes`, one would expect the
range-diff generated to also respect the setting. Unfortunately, the
range-diff we currently generate only uses the default option (which
always outputs default notes, even when notes are not being used
elsewhere).
Pass the notes configuration to range-diff so that it can honor it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a commit being range-diff'd has a note attached to it, the note
will be compared as well. However, if a user has multiple notes refs or
if they want to suppress notes from being printed, there is currently no
way to do this.
Pass through `--[no-]notes[=<ref>]` to the `git log` call so that this
option is customizable.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When notes were included in the output of range-diff, they were just
mashed together with the rest of the commit message. As a result, users
wouldn't be able to clearly distinguish where the commit message ended
and where the notes started.
Output a `## Notes ##` header when notes are detected so that notes can
be compared more clearly.
Note that we handle case of `Notes (<ref>): -> ## Notes (<ref>) ##` with
this code as well. We can't test this in this patch, however, since
there is currently no way to pass along different notes refs to `git
log`. This will be fixed in a future patch.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test suite had a blindspot where it did not check the behavior of
range-diff and format-patch when notes were present. Cover this
blindspot.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For test cases, the usual convention is to name expected output files
"expect", not "expected". Replace all instances of "expected" with
"expect".
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the first heredoc, parameter substitution is not used so prevent it
from happening in the future (perhaps by accident) by escaping the limit
EOF.
The remaining heredocs use parameter substitution so they cannot be
changed.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove the
one instance of this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python's async functions (declared with "async def" rather than "def")
were not being displayed in hunk headers. This commit teaches git about
the async function syntax, and adds tests for the Python userdiff regex.
Signed-off-by: Josh Holland <anowlcalledjosh@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The standard format for referencing other commits within some projects
(such as git.git) is the reference format. This is described in
Documentation/SubmittingPatches as
If you want to reference a previous commit in the history of a stable
branch, use the format "abbreviated hash (subject, date)", like this:
....
Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30)
noticed that ...
....
Since this format is so commonly used, standardize it as a pretty
format.
The tests that are implemented essentially show that the format-string
does not change in response to various log options. This is useful
because, for future developers, it shows that we've considered the
limitations of the "canned format-string" approach and we are fine with
them.
Based-on-a-patch-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the placeholders %as and %cs to format author date and committer
date, respectively, without the time part, like --date=short does, i.e.
like YYYY-MM-DD.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test suite does not include any tests where `--reflog` and `-z` are
used together in `git log`. Cover this blindspot. Note that the
`--pretty=oneline` case is written separately because it follows a
slightly different codepath.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) For now, `--pathspec-from-file` is declared incompatible with
`--interactive/--patch`, even when <file> is not `stdin`. Such use
case it not really expected. Also, it would require changes to
`interactive_add()`.
2) It is not allowed to pass pathspec in both args and file.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) For now, `--pathspec-from-file` is declared incompatible with
`--patch`, even when <file> is not `stdin`. Such use case it not
really expected. Also, it is harder to support in `git commit`, so
I decided to make it incompatible in all places.
2) It is not allowed to pass pathspec in both args and file.
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to work correctly, git-rebase --rebase-merges needs to make
initial todo list with unique labels.
Those unique labels is being handled by employing a hashmap and
appending an unique number if any duplicate is found.
But, we forget that beside those labels for side branches,
we also have a special label `onto' for our so-called new-base.
In a special case that any of those labels for side branches named
`onto', git will run into trouble.
Correct it.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since it was introduced in 7cceca5ccc (Add 'git rev-parse
--show-toplevel' option., 2010-01-12), the --show-toplevel option has
treated a missing working tree as a quiet success: it neither prints a
toplevel path, but nor does it report any kind of error.
While a caller could distinguish this case by looking for an empty
response, the behavior is rather confusing. We're better off complaining
that there is no working tree, as other internal commands would do in
similar cases (e.g., "git status" or any builtin with NEED_WORK_TREE set
would just die()). So let's do the same here.
While we're at it, let's clarify the documentation and add some tests,
both for the new behavior and for the more mundane case (which was not
covered).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `label` todo command in interactive rebases creates temporary refs
in the `refs/rewritten/` namespace. These refs are stored as loose refs,
i.e. as files in `.git/refs/rewritten/`, therefore they have to conform
with file name limitations on the current filesystem in addition to the
accepted ref format.
This poses a problem in particular on NTFS/FAT, where e.g. the colon,
double-quote and pipe characters are disallowed as part of a file name.
Let's safeguard against this by replacing not only white-space
characters by dashes, but all non-alpha-numeric ones.
However, we exempt non-ASCII UTF-8 characters from that, as it should be
quite possible to reflect branch names such as `↯↯↯` in refs/file names.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This imitates the code to show the help text from the Perl script
`git-add--interactive.perl` in the built-in version.
To make sure that it renders exactly like the Perl version of `git add
-i`, we also add a test case for that to `t3701-add-interactive.sh`.
Signed-off-by: Slavica Đukić <slawica92@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a cross-site scripting problem in gitweb, where it will print
URLs generated by its href() helper without further quoting. This allows
an attacker to point a victim to a specially crafted gitweb URL and
inject arbitrary HTML into the resulting page (which the victim sees as
coming from gitweb).
The base of the URL comes from evaluate_uri(), which pulls the value of
$REQUEST_URI via the CGI module. It tries to strip off $PATH_INFO, but
fails to do so in some cases (including ones that contain special
characters, like "+"). Most of the uses of the URL end up being passed
to "$cgi->a(-href = href())", which will get quoted properly by the CGI
module. But in a few places, we output them ourselves as part of
manually-generated HTML, and whatever was in the original URL will
appear unquoted in the output.
Given that all of the nearby variables placed into this manual HTML
_are_ quoted, it seems like the authors assumed that these URLs would
not need quoting. So it's possible that the bug is actually in
evaluate_uri(), which should be doing a more careful job of stripping
$PATH_INFO. There's some discussion in a comment in that function, as
well as the commit message in 81d3fe9f48 (gitweb: fix wrong base URL
when non-root DirectoryIndex, 2009-02-15). But I'm not sure I understand
it.
Regardless, it's a good idea to quote these values at the point of
insertion into the HTML output:
1. Even if there is a bug in evaluate_uri(), this would give us
belt-and-suspenders protection.
2. evaluate_uri() is only handling the base. Some generated URLs will
also mention arbitrary refs or filenames in the repositories, and
these should be quoted anyway.
3. It should never _hurt_ to quote (and that's what all of the
$cgi->a() calls are doing already).
So there may be further work here, but this patch at least prevents the
XSS vulnerability, and shouldn't make anything worse.
The test here covers the calls in print_feed_meta(), but I manually
audited every call to href() to see how its output was used, and quoted
appropriately. Most of them are esc_attr(), as they're used in tag
attributes, but I used esc_html() when the URLs were printed bare. The
distinction is largely academic, as one is implemented as a wrapper for
the other.
Reported-by: NAKAYAMA DAISUKE <nakyamad@icloud.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a real webserver's CGI call, gitweb.cgi would typically see
$REQUEST_URI set. This variable does impact how we display our URL in
the resulting page, so let's try to make our test as realistic as
possible (we can just use the $PATH_INFO our caller passed in, if any).
This doesn't change the outcome of any tests, but it will help us add
some new tests in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some variables assignments in gitweb_run() look like this:
FOO=""$1""
The extra quotes aren't doing anything. Each set opens and closes an
empty string, and $1 is actually outside of any double-quotes (which is
OK, because variable assignment does not do whitespace splitting on the
expanded value).
Let's drop them, as they're simply confusing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is just a thin wrapper around gitweb_run(), which takes
multiple arguments. But we only pass along "$1". Let's pass everything
we get, which will let a future patch add an XSS test that affects
PATH_INFO (which gitweb_run() takes as $2).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>