global/git

mirror of https://github.com/git/git.git synced 2026-04-08 16:00:09 +02:00

Author	SHA1	Message	Date
SZEDER Gábor	c525ce95b4	commit-graph: check all leading directories in changed path Bloom filters The file 'dir/subdir/file' can only be modified if its leading directories 'dir' and 'dir/subdir' are modified as well. So when checking modified path Bloom filters looking for commits modifying a path with multiple path components, then check not only the full path in the Bloom filters, but all its leading directories as well. Take care to check these paths in "deepest first" order, because it's the full path that is least likely to be modified, and the Bloom filter queries can short circuit sooner. This can significantly reduce the average false positive rate, by about an order of magnitude or three(!), and can further speed up pathspec-limited revision walks. The table below compares the average false positive rate and runtime of git rev-list HEAD -- "$path" before and after this change for 5000+ randomly* selected paths from each repository: Average false Average Average positive rate runtime runtime before after before after difference ------------------------------------------------------------------ git 3.220% 0.7853% 0.0558s 0.0387s -30.6% linux 2.453% 0.0296% 0.1046s 0.0766s -26.8% tensorflow 2.536% 0.6977% 0.0594s 0.0420s -29.2% *Path selection was done with the following pipeline: git ls-tree -r --name-only HEAD \| sort -R \| head -n 5000 The improvements in runtime are much smaller than the improvements in average false positive rate, as we are clearly reaching diminishing returns here. However, all these timings depend on that accessing tree objects is reasonably fast (warm caches). If we had a partial clone and the tree objects had to be fetched from a promisor remote, e.g.: $ git clone --filter=tree:0 --bare file://.../webkit.git webkit.notrees.git $ git -C webkit.git -c core.modifiedPathBloomFilters=1 \ commit-graph write --reachable $ cp webkit.git/objects/info/commit-graph webkit.notrees.git/objects/info/ $ git -C webkit.notrees.git -c core.modifiedPathBloomFilters=1 \ rev-list HEAD -- "$path" then checking all leading path component can reduce the runtime from over an hour to a few seconds (and this is with the clone and the promisor on the same machine). This adjusts the tracing values in t4216-log-bloom.sh, which provides a concrete way to notice the improvement. Helped-by: Taylor Blau <me@ttaylorr.com> Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-01 14:17:43 -07:00
Taylor Blau	f3c2a36810	revision: empty pathspecs should not use Bloom filters The prepare_to_use_bloom_filter() method was not intended to be called on an empty pathspec. However, 'git log -- .' and 'git log' are subtly different: the latter reports all commits while the former will simplify commits that do not change the root tree. This means that the path used to construct the bloom_key might be empty, and that value is not added to the Bloom filter during construction. That means that the results are likely incorrect! To resolve the issue, be careful about the length of the path and stop filling Bloom filters. To be completely sure we do not use them, drop the pointer to the bloom_filter_settings from the commit-graph. That allows our test to look at the trace2 logs to verify no Bloom filter statistics are reported. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-01 14:17:43 -07:00
Derrick Stolee	0087a87ba8	commit-graph: persist existence of changed-paths The changed-path Bloom filters were released in v2.27.0, but have a significant drawback. A user can opt-in to writing the changed-path filters using the "--changed-paths" option to "git commit-graph write" but the next write will drop the filters unless that option is specified. This becomes even more important when considering the interaction with gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of features.experimental). These config options trigger commit-graph writes that the user did not signal, and hence there is no --changed-paths option available. Allow a user that opts-in to the changed-path filters to persist the property of "my commit-graph has changed-path filters" automatically. A user can drop filters using the --no-changed-paths option. In the process, we need to be extremely careful to match the Bloom filter settings as specified by the commit-graph. This will allow future versions of Git to customize these settings, and the version with this change will persist those settings as commit-graphs are rewritten on top. Use the trace2 API to signal the settings used during the write, and check that output in a test after manually adjusting the correct bytes in the commit-graph file. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-01 14:17:43 -07:00
Derrick Stolee	949197420e	bloom: fix logic in get_bloom_filter() The get_bloom_filter() method is a bit complicated in some parts where it does not need to be. In particular, it needs to return a NULL filter only when compute_if_not_present is zero AND the filter data cannot be loaded from a commit-graph file. This currently happens by accident because the commit-graph does not load changed-path Bloom filters from an existing commit-graph when writing a new one. This will change in a later patch. Also clean up some style issues while we are here. One side-effect of returning a NULL filter is that the filters that are reported as "too large" will now be reported as NULL insead of length zero. This case was not properly covered before, so add a test. Further, remote the counting of the zero-length filters from revision.c and the trace2 logs. Helped-by: René Scharfe <l.s.r@web.de> Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-01 14:17:43 -07:00
Junio C Hamano	298d704e70	Merge branch 'sk/diff-files-show-i-t-a-as-new' "git diff-files" has been taught to say paths that are marked as intent-to-add are new files, not modified from an empty blob. * sk/diff-files-show-i-t-a-as-new: diff-files: treat "i-t-a" files as "not-in-index"	2020-06-29 14:17:27 -07:00
Junio C Hamano	1033b98291	Merge branch 'xl/upgrade-repo-format' Allow runtime upgrade of the repository format version, which needs to be done carefully. There is a rather unpleasant backward compatibility worry with the last step of this series, but it is the right thing to do in the longer term. * xl/upgrade-repo-format: check_repository_format_gently(): refuse extensions for old repositories sparse-checkout: upgrade repository to version 1 when enabling extension fetch: allow adding a filter after initial clone repository: add a helper function to perform repository format upgrade	2020-06-29 14:17:24 -07:00
Jeff King	8a49495583	fast-export: anonymize "master" refname Running "fast-export --anonymize" will leave "refs/heads/master" untouched in the output, for two reasons: - it helped to have some known reference point between the original and anonymized repository - since it's historically the default branch name, it doesn't leak any information Now that we can ask fast-export to retain particular tokens, we have a much better tool for the first one (because it works for any ref, not just master). For the second, the notion of "default branch name" is likely to become configurable soon, at which point the name _does_ leak information. Let's drop this special case in preparation. Note that we have to adjust the test a bit, since it relied on using the name "master" in the anonymized repos. We could just use --anonymize-map=master to keep the same output, but then we wouldn't know if it works because of our hard-coded master or because of the explicit map. So let's flip the test a bit, and confirm that we anonymize "master", but keep "other" in the output. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-25 14:19:23 -07:00
Jeff King	65b5d9fae7	fast-export: allow seeding the anonymized mapping After you anonymize a repository, it can be hard to find which commits correspond between the original and the result, and thus hard to reproduce commands that triggered bugs in the original. Let's make it possible to seed the anonymization map. This lets users either: - mark names to be retained as-is, if they don't consider them secret (in which case their original commands would just work) - map names to new values, which lets them adapt the reproduction recipe to the new names without revealing the originals The implementation is fairly straight-forward. We already store each anonymized token in a hashmap (so that the same token appearing twice is converted to the same result). We can just introduce a new "seed" hashmap which is consulted first. This does make a few more promises to the user about how we'll anonymize things (e.g., token-splitting pathnames). But it's unlikely that we'd want to change those rules, even if the actual anonymization of a single token changes. And it makes things much easier for the user, who can unblind only a directory name without having to specify each path within it. One alternative to this approach would be to anonymize as we see fit, and then dump the whole refname and pathname mappings to a file. This does work, but it's a bit awkward to use (you have to manually dig the items you care about out of the mapping). Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-25 14:19:23 -07:00
Junio C Hamano	f33b5bddaf	Merge branch 'pb/t4014-unslave' A branch name used in a test has been clarified to match what is going on. * pb/t4014-unslave: t4014: do not use "slave branch" nomenclature	2020-06-25 12:27:48 -07:00
Junio C Hamano	34e849b05a	Merge branch 'jt/cdn-offload' The "fetch/clone" protocol has been updated to allow the server to instruct the clients to grab pre-packaged packfile(s) in addition to the packed object data coming over the wire. * jt/cdn-offload: upload-pack: fix a sparse '0 as NULL pointer' warning upload-pack: send part of packfile response as uri fetch-pack: support more than one pack lockfile upload-pack: refactor reading of pack-objects out Documentation: add Packfile URIs design doc Documentation: order protocol v2 sections http-fetch: support fetching packfiles by URL http-fetch: refactor into function http: refactor finish_http_pack_request() http: use --stdin when indexing dumb HTTP pack	2020-06-25 12:27:47 -07:00
Junio C Hamano	7b2685ef2d	Merge branch 'dl/branch-cleanup' Code clean-up around "git branch" with a minor bugfix. * dl/branch-cleanup: branch: don't mix --edit-description t3200: test for specific errors t3200: rename "expected" to "expect"	2020-06-25 12:27:47 -07:00
Junio C Hamano	1457886ce2	Merge branch 'ct/diff-with-merge-base-clarification' "git diff" used to take arguments in random and nonsense range notation, e.g. "git diff A..B C", "git diff A..B C...D", etc., which has been cleaned up. * ct/diff-with-merge-base-clarification: Documentation: usage for diff combined commits git diff: improve range handling t/t3430: avoid undefined git diff behavior	2020-06-25 12:27:46 -07:00
Junio C Hamano	320421840e	Merge branch 'jk/complete-git-switch' The command line completion (in contrib/) learned to complete options that the "git switch" command takes. * jk/complete-git-switch: completion: improve handling of --orphan option of switch/checkout completion: improve handling of -c/-C and -b/-B in switch/checkout completion: improve handling of --track in switch/checkout completion: improve handling of --detach in checkout completion: improve completion for git switch with no options completion: improve handling of DWIM mode for switch/checkout completion: perform DWIM logic directly in __git_complete_refs completion: extract function __git_dwim_remote_heads completion: replace overloaded track term for __git_complete_refs completion: add tests showing subpar switch/checkout --orphan logic completion: add tests showing subpar -c/C argument completion completion: add tests showing subpar -c/-C startpoint completion completion: add tests showing subpar switch/checkout --track logic completion: add tests showing subar checkout --detach logic completion: add tests showing subpar DWIM logic for switch/checkout completion: add test showing subpar git switch completion	2020-06-25 12:27:45 -07:00
Johannes Schindelin	6dca5dbf93	tests: reference `seen` wherever `pu` was referenced As our test suite partially reflects how we work in the Git project, it is natural that the branch name `pu` was used in a couple places. Since that branch was renamed to `seen`, let's use the new name consistently. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-25 09:18:56 -07:00
Johannes Schindelin	0068f2116e	testsvn: respect `init.defaultBranch` The default name of the initial branch in new repositories can now be configured. The `testsvn` remote helper translates the remote Subversion repository's branch name `trunk` to the hard-coded name `master`. Clearly, the intention was to make the name align with Git's defaults. So while we are not talking about a newly-created repository in the `testsvn` context, it is a newly-created _Git_ repository, si it _still_ makes sense to use the overridden default name for the initial branch whenever users configured it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Johannes Schindelin	a471214bd6	remote: use the configured default branch name when appropriate When guessing the default branch name of a remote, and there are no refs to guess from, we want to go with the preference specified by the user for the fall-back, i.e. the default name to be used for the initial branch of new repositories (because as far as the user is concerned, a remote that has no branches yet is a new repository). At the same time, when talking to an older Git server that does not report a symref for `HEAD` (but instead reports a commit hash), let's try to guess the configured default branch name first. If it does not match the reported commit hash, let's fall back to `master` as before. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Johannes Schindelin	0cc1b475bb	clone: use configured default branch name when appropriate When cloning a repository without any branches, Git chooses a default branch name for the as-yet unborn branch. As part of the implicit initialization of the local repository, Git just learned to respect `init.defaultBranch` to choose a different initial branch name. We now really want that branch name to be used as a fall-back. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Don Goodman-Wilson	8747ebb7cd	init: allow setting the default for the initial branch name via the config We just introduced the command-line option `--initial-branch=<branch-name>` to allow initializing a new repository with a different initial branch than the hard-coded one. To allow users to override the initial branch name more permanently (i.e. without having to specify the name manually for each and every `git init` invocation), let's introduce the `init.defaultBranch` config setting. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Don Goodman-Wilson <don@goodman-wilson.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Johannes Schindelin	32ba12dab2	init: allow specifying the initial branch name for the new repository There is a growing number of projects and companies desiring to change the main branch name of their repositories (see e.g. https://twitter.com/mislav/status/1270388510684598272 for background on this). To change that branch name for new repositories, currently the only way to do that automatically is by copying all of Git's template directory, then hard-coding the desired default branch name into the `.git/HEAD` file, and then configuring `init.templateDir` to point to those copied template files. To make this process much less cumbersome, let's introduce a new option: `--initial-branch=<branch-name>`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Johannes Schindelin	f0a96e8d4c	submodule: fall back to remote's HEAD for missing remote.<name>.branch When `remote.<name>.branch` is not configured, `git submodule update` currently falls back to using the branch name `master`. A much better idea, however, is to use the remote `HEAD`: on all Git servers running reasonably recent Git versions, the symref `HEAD` points to the main branch. Note: t7419 demonstrates that there _might_ be use cases out there that _expect_ `git submodule update --remote` to update submodules to the remote `master` branch even if the remote `HEAD` points to another branch. Arguably, this patch makes the behavior more intuitive, but there is a slight possibility that this might cause regressions in obscure setups. Even so, it should be okay to fix this behavior without anything like a longer transition period: - The `git submodule update --remote` command is not really common. - Current Git's behavior when running this command is outright confusing, unless the remote repository's current branch _is_ `master` (in which case the proposed behavior matches the old behavior). - If a user encounters a regression due to the changed behavior, the fix is actually trivial: setting `submodule.<name>.branch` to `master` will reinstate the old behavior. Helped-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Johannes Schindelin	4d04658d8b	send-pack/transport-helper: avoid mentioning a particular branch When trying to push all matching branches, but none match, we offer a message suggesting to push the `master` branch. However, we want to step away from making that branch any more special than any other branch, so let's reword that message to mention no branch in particular. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Denton Liu	5b0ac09fb1	lib-submodule-update: pass 'test_must_fail' as an argument When we run a test helper function in test_submodule_switch_common(), we sometimes specify a whole helper function as the $command. When we do this, in some test cases, we just mark the whole function with `test_must_fail`. However, it's possible that the helper function might fail earlier or later than expected due to an introduced bug. If this happens, then the test case will still report as passing but it should really be marked as failing since it didn't actually display the intended behaviour. Instead of invoking `test_must_fail $command`, pass the string "test_must_fail" as the second argument in case where the git command is expected to fail. When $command is a helper function, the parent function calling test_submodule_switch_common() is test_submodule_switch_func(). For all test_submodule_switch_func() invocations, increase the granularity of the argument test helper function by prefixing the git invocation which is meant to fail with the second argument like this: $2 git checkout "$1" In the other cases, test_submodule_switch() and test_submodule_forced_switch(), instead of passing in the git command directly, wrap it using the git_test_func() and pass the git arguments using the global variable $gitcmd. Unfortunately, since closures aren't a thing in shell scripts, the global variable is necessary. Another unfortunate result is that the "git_test_func" will used as the test case name when $command is printed but it's worth it for the cleaner code. Finally, as an added bonus, `test_must_fail` will now only run on git commands. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 08:54:18 -07:00
Jeff King	b897bf5f37	fast-export: use xmemdupz() for anonymizing oids Our anonymize_mem() function is careful to take a ptr/len pair to allow storing binary tokens like object ids, as well as partial strings (e.g., just "foo" of "foo/bar"). But it duplicates the hash key using xstrdup()! That means that: - for a partial string, we'd store all bytes up to the NUL, even though we'd never look at anything past "len". This didn't produce wrong behavior, but was wasteful. - for a binary oid that doesn't contain a zero byte, we'd copy garbage bytes off the end of the array (though as long as nothing complained about reading uninitialized bytes, further reads would be limited by "len", and we'd produce the correct results) - for a binary oid that does contain a zero byte, we'd copy _fewer_ bytes than intended into the hashmap struct. When we later try to look up a value, we'd access uninitialized memory and potentially falsely claim that a particular oid is not present. The most common reason to store an oid is an anonymized gitlink, but our test case doesn't have any gitlinks at all. So let's add one whose oid contains a NUL and is present at two different paths. ASan catches the memory error, but even without it we can detect the bug because the oid is not anonymized the same way for both paths. And of course the fix is to copy the correct number of bytes. We don't technically need the appended NUL from xmemdupz(), but it doesn't hurt as an extra protection against anybody treating it like a string (plus a future patch will push us more in that direction). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00
Jeff King	b8c0689bb9	t9351: derive anonymized tree checks from original repo Our tests of the anonymized repo just hard-code the expected set of objects in the root and subdirectory trees. This makes them brittle to the test setup changing (e.g., adding new paths that need tested). Let's look at the original repo to compute our expected set of objects. Note that this isn't completely perfect (e.g., we still rely on there being only one tree in the root), but it does simplify later patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00
Johannes Schindelin	489947cee5	fmt-merge-msg: stop treating `master` specially In the context of many projects renaming their primary branch names away from `master`, Git wants to stop treating the `master` branch specially. Let's start with `git fmt-merge-msg`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 17:22:35 -07:00
Derrick Stolee	7b671f8c2b	commit-graph: change test to die on parse, not load `43d3561` (commit-graph write: don't die if the existing graph is corrupt, 2019-03-25) introduced the GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD environment variable. This was created to verify that commit-graph was not loaded when writing a new non-incremental commit-graph. An upcoming change wants to load a commit-graph in some valuable cases, but we want to maintain that we don't trust the commit-graph data when writing our new file. Instead of dying on load, instead die if we ever try to parse a commit from the commit-graph. This functionally verifies the same intended behavior, but allows a more advanced feature in the next change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 17:12:08 -07:00
Carlo Marcelo Arenas Belón	c1ea625f72	commit-reach: avoid is_descendant_of() shim `d91d6fbf26` (commit-reach: create repo_is_descendant_of(), 2020-06-17) adds a repository aware version of is_descendant_of() and a backward compatibility shim that is barely used. Update all callers to directly use the new repo_is_descendant_of() function instead; making the codebase simpler and pushing more the_repository references higher up the stack. Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 16:36:53 -07:00
brian m. carlson	64472d15e9	http-push: ensure unforced pushes fail when data would be lost When we push using the DAV-based protocol, the client is the one that performs the ref updates and therefore makes the checks to see whether an unforced push should be allowed. We make this check by determining if either (a) we lack the object file for the old value of the ref or (b) the new value of the ref is not newer than the old value, and in either case, reject the push. However, the ref_newer function, which performs this latter check, has an odd behavior due to the reuse of certain object flags. Specifically, it will incorrectly return false in its first invocation and then correctly return true on a subsequent invocation. This occurs because the object flags used by http-push.c are the same as those used by commit-reach.c, which implements ref_newer, and one piece of code misinterprets the flags set by the other. Note that this does not occur in all cases. For example, if the example used in the tests is changed to use one repository instead of two and rewind the head to add a commit, the test passes and we correctly reject the push. However, the example provided does trigger this behavior, and the code has been broken in this way since at least Git 2.0.0. To solve this problem, let's move the two sets of object flags so that they don't overlap, since we're clearly using them at the same time. The new set should not conflict with other usage because other users are either builtin code (which is not compiled into git http-push) or upload-pack (which we similarly do not use here). Reported-by: Michael Ward <mward@smartsoftwareinc.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 15:40:59 -07:00
Junio C Hamano	9740ef888e	Merge branch 'es/worktree-duplicate-paths' The same worktree directory must be registered only once, but "git worktree move" allowed this invariant to be violated, which has been corrected. * es/worktree-duplicate-paths: worktree: make "move" refuse to move atop missing registered worktree worktree: generalize candidate worktree path validation worktree: prune linked worktree referencing main worktree path worktree: prune duplicate entries referencing same worktree path worktree: make high-level pruning re-usable worktree: give "should be pruned?" function more meaningful name worktree: factor out repeated string literal	2020-06-22 15:55:03 -07:00
Junio C Hamano	b8a5299594	Merge branch 'jt/redact-all-cookies' The interface to redact sensitive information in the trace output has been simplified. * jt/redact-all-cookies: http: redact all cookies, teach GIT_TRACE_REDACT=0	2020-06-22 15:55:02 -07:00
brian m. carlson	148f193d16	t/lib-git-svn: make hash size independent The record size used in the git svn storage is four bytes plus the length of the binary hash. Pass the hash length into our Perl invocation and use it to compute the size of the records. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 11:21:07 -07:00
Srinidhi Kaushik	feea6946a5	diff-files: treat "i-t-a" files as "not-in-index" The `diff-files' command and related commands which call the function `cmd_diff_files()', consider the "intent-to-add" files as a part of the index when comparing the work-tree against it. This was previously addressed in commits [1] and [2] by turning the option `--ita-invisible-in-index' (introduced in [3]) on by default. For `diff-files' (and `add -p' as a consequence) to show the i-t-a files as as new, `ita_invisible_in_index' will be enabled by default here as well. [1] `0231ae71d3` (diff: turn --ita-invisible-in-index on by default, 2018-05-26) [2] `425a28e0a4` (diff-lib: allow ita entries treated as "not yet exist in index", 2016-10-24) [3] `b42b451919` (diff: add --ita-[in]visible-in-index, 2016-10-24) Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 10:46:45 -07:00
Eric Sunshine	03f2465bb1	worktree: drop get_worktrees() unused 'flags' argument get_worktrees() accepts a 'flags' argument, however, there are no existing flags (the lone flag GWT_SORT_LINKED was recently retired) and no behavior which can be tweaked. Therefore, drop the 'flags' argument. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 10:31:15 -07:00
brian m. carlson	3e04b6e1b6	t9101: make hash independent Instead of hard-coding the object ID for our test .gitignore file, let's compute it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 09:52:02 -07:00
brian m. carlson	bbe0616cd8	t9104: make hash size independent The size of a record in the database used by git svn is four bytes plus the length of the binary hash. Instead of hard-coding 24, compute this value based on the size of the hash in use. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 09:52:02 -07:00
brian m. carlson	407527ba44	t9100: make test work with SHA-256 Compute the relevant tree objects for SHA-256 and use those when appropriate instead of using the SHA-1 ones. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 09:52:02 -07:00
brian m. carlson	606b9749c6	t9108: make test hash independent Instead of stripping off the first 41 characters of git log output, let's just strip off the first space-separated component, which will work for any size hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 09:52:02 -07:00
brian m. carlson	5aa6877540	t9168: make test hash independent Instead of stripping off the first 41 characters of git log output, let's just strip off the first space-separated component, which will work for any size hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 09:52:02 -07:00
brian m. carlson	62814dfd17	t9109: make test hash independent Instead of stripping off the first 41 characters of git log output, let's just strip off the first space-separated component, which will work for any size hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 09:52:02 -07:00
brian m. carlson	3716d50dd5	remote-testgit: adapt for object-format When using an algorithm other than SHA-1, we need the remote helper to advertise support for the object-format extension and provide information back to us so that we can properly parse refs and return data. Ensure that the test remote helper understands these extensions. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:09 -07:00
brian m. carlson	371c4079f4	t5300: pass --object-format to git index-pack git index-pack by default reads the repository to determine the object format. However, when outside of a repository, it's necessary to specify the hash algorithm in use so that the pack can be properly indexed. Add an --object-format argument when invoking git index-pack outside of a repository. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:09 -07:00
brian m. carlson	4ddd3f5063	t5704: send object-format capability with SHA-256 When we speak protocol v2 in this test, we must pass the object-format header if the algorithm is not SHA-1. Otherwise, git upload-pack fails because the hash algorithm doesn't match and not because we've failed to speak the protocol correctly. Pass the header so that our assertions test what we're really interested in. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:09 -07:00
brian m. carlson	f7c6a3bf08	t5703: use object-format serve option When we're using an algorithm other than SHA-1, we need to specify the algorithm in use so we don't get a failure with an "unknown format" message. Add a wrapper function that specifies this header if required. Skip specifying this header for SHA-1 to test that it works both with an without this header. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:09 -07:00
brian m. carlson	8fc7003540	t5702: offer an object-format capability in the test In order to make this test work with SHA-256, offer an object-format capability so that both sides use the same algorithm. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:09 -07:00
brian m. carlson	54cbbe4c6e	t/helper: initialize the repository for test-sha1-array test-sha1-array uses the_hash_algo under the hood. Since t0064 wants to use the value that is correct for the hash algorithm that we're testing, make sure the test helper initializes the repository to set the_hash_algo correctly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:08 -07:00
brian m. carlson	793731f742	t1050: pass algorithm to index-pack when outside repo When outside a repository, git index-pack is unable to guess the hash algorithm in use for a pack, since packs don't contain any information on the algorithm in use. Pass an option to index-pack to help it out in this test. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:08 -07:00
brian m. carlson	ac093d0790	remote-curl: detect algorithm for dumb HTTP by size When reading the info/refs file for a repository, we have no explicit way to detect which hash algorithm is in use because the file doesn't provide one. Detect the hash algorithm in use by the size of the first object ID. If we have an empty repository, we don't know what the hash algorithm is on the remote side, so default to whatever the local side has configured. Without doing this, we cannot clone an empty repository since we don't know its hash algorithm. Test this case appropriately, since we currently have no tests for cloning an empty repository with the dumb HTTP protocol. We anonymize the URL like elsewhere in the function in case the user has decided to include a secret in the URL. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:08 -07:00
Patrick Steinhardt	6754159767	refs: implement reference transaction hook The low-level reference transactions used to update references are currently completely opaque to the user. While certainly desirable in most usecases, there are some which might want to hook into the transaction to observe all queued reference updates as well as observing the abortion or commit of a prepared transaction. One such usecase would be to have a set of replicas of a given Git repository, where we perform Git operations on all of the repositories at once and expect the outcome to be the same in all of them. While there exist hooks already for a certain subset of Git commands that could be used to implement a voting mechanism for this, many others currently don't have any mechanism for this. The above scenario is the motivation for the new "reference-transaction" hook that reaches directly into Git's reference transaction mechanism. The hook receives as parameter the current state the transaction was moved to ("prepared", "committed" or "aborted") and gets via its standard input all queued reference updates. While the exit code gets ignored in the "committed" and "aborted" states, a non-zero exit code in the "prepared" state will cause the transaction to be aborted prematurely. Given the usecase described above, a voting mechanism can now be implemented via this hook: as soon as it gets called, it will take all of stdin and use it to cast a vote to a central service. When all replicas of the repository agree, the hook will exit with zero, otherwise it will abort the transaction by returning non-zero. The most important upside is that this will catch _all_ commands writing references at once, allowing to implement strong consistency for reference updates via a single mechanism. In order to test the impact on the case where we don't have any "reference-transaction" hook installed in the repository, this commit introduce two new performance tests for git-update-refs(1). Run against an empty repository, it produces the following results: Test origin/master HEAD -------------------------------------------------------------------- 1400.2: update-ref 2.70(2.10+0.71) 2.71(2.10+0.73) +0.4% 1400.3: update-ref --stdin 0.21(0.09+0.11) 0.21(0.07+0.14) +0.0% The performance test p1400.2 creates, updates and deletes a branch a thousand times, thus averaging runtime of git-update-refs over 3000 invocations. p1400.3 instead calls `git-update-refs --stdin` three times and queues a thousand creations, updates and deletes respectively. As expected, p1400.3 consistently shows no noticeable impact, as for each batch of updates there's a single call to access(3P) for the negative hook lookup. On the other hand, for p1400.2, one can see an impact caused by this patchset. But doing five runs of the performance tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead ranged from -1.5% to +1.1%. These inconsistent performance numbers can be explained by the overhead of spawning 3000 processes. This shows that the overhead of assembling the hook path and executing access(3P) once to check if it's there is mostly outweighed by the operating system's overhead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 10:46:13 -07:00
Paolo Bonzini	08dc26061f	t4014: do not use "slave branch" nomenclature Git branches have been qualified as topic branches, integration branches, development branches, feature branches, release branches and so on. Git has a branch that is the master for development, but it is not the master of any "slave branch": Git does not have slave branches, and has never had, except for a single testcase that claims otherwise. :) Independent of any future change to the naming of the "master" branch, removing this sole appearance of the term is a strict improvement: it avoids divisive language, and talking about "feature branch" clarifies which developer workflow the test is trying to emulate. Reported-by: Till Maas <tmaas@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 10:26:34 -07:00
Junio C Hamano	653a3514cc	Merge branch 'dl/t-readme-spell-git-correctly' Doc updates. * dl/t-readme-spell-git-correctly: t/README: avoid poor-man's small caps GIT	2020-06-17 21:54:05 -07:00

... 29 30 31 32 33 ...

18232 Commits