Commit Graph

6888 Commits

Author SHA1 Message Date
Johannes Schindelin
ff685ecff9 Merge branch 'unhidden-git'
It has been reported that core.hideDotFiles=false stopped working...
This topic branch fixes it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-08-21 17:43:14 +02:00
Johannes Schindelin
02a2d851f6 Merge branch 'status-no-lock-index'
This branch allows third-party tools to call `git status
--no-lock-index` to avoid lock contention with the interactive Git usage
of the actual human user.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-08-21 17:43:13 +02:00
Johannes Schindelin
7eed22d5d6 Merge 'release-gc-repack' into HEAD 2018-08-21 17:43:12 +02:00
Johannes Schindelin
520ddeb0d1 Merge branch 'clean-long-paths'
This addresses https://github.com/git-for-windows/git/issues/521

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-08-21 17:43:11 +02:00
Johannes Schindelin
5450c2a95b Merge 'remote-hg-prerequisites' into HEAD
These fixes were necessary for Sverre Rabbelier's remote-hg to work,
but for some magic reason they are not necessary for the current
remote-hg. Makes you wonder how that one gets away with it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-08-21 16:43:48 +02:00
Johannes Schindelin
9dbfbe6904 mingw: respect core.hidedotfiles = false in git-init again
This is a brown paper bag. When adding the tests, we actually failed
to verify that the config variable is heeded in git-init at all. And
when changing the original patch that marked the .git/ directory as
hidden after reading the config, it was lost on this developer that
the new code would use the hide_dotfiles variable before the config
was read.

The fix is obvious: read the (limited, pre-init) config *before*
creating the .git/ directory.

This fixes https://github.com/git-for-windows/git/issues/789

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-08-21 16:37:31 +02:00
Johannes Schindelin
31d5cb9971 status: carry the --no-lock-index option for backwards-compatibility
When a third-party tool periodically runs `git status` in order to keep
track of the state of the working tree, it is a bad idea to lock the
index: it might interfere with interactive commands executed by the
user, e.g. when the user wants to commit files.

Git for Windows introduced the `--no-lock-index` option a long time ago
to fix that (it made it into Git for Windows v2.9.2(3)) by simply
avoiding to write that file.

The downside is that the periodic `git status` calls will be a little
bit more wasteful because they may have to refresh the index repeatedly,
only to throw away the updates when it exits. This cannot really be
helped, though, as tools wanting to get a periodic update of the status
have no way to predict when the user may want to lock the index herself.

Sadly, a competing approach was submitted (by somebody who apparently
has less work on their plate than this maintainer) that made it into
v2.15.0 but is *different*: instead of a `git status`-only option, it is
an option that comes *before* the Git command and is called differently,
too.

Let's give previous users a chance to upgrade to newer Git for Windows
versions by handling the `--no-lock-index` option, still, though with a
big fat warning.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-08-21 16:37:30 +02:00
Johannes Schindelin
35b8b9f2ba gc/repack: release packs when needed
On Windows, files cannot be removed nor renamed if there are still
handles held by a process. To remedy that, we introduced the
close_all_packs() function.

Earlier, we made sure that the packs are released just before `git gc`
is spawned, in case that gc wants to remove no-longer needed packs.

But this developer forgot that gc itself also needs to let go of packs,
e.g. when consolidating all packs via the --aggressive option.

Likewise, `git repack -d` wants to delete obsolete packs and therefore
needs to close all pack handles, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-08-21 16:37:20 +02:00
Johannes Schindelin
b186e007f6 remove_dirs: do not swallow error when stat() failed
Without an error message when stat() failed, e.g. `git clean` would
abort without an error message, leaving the user quite puzzled.

This fixes https://github.com/git-for-windows/git/issues/521

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-08-21 16:36:38 +02:00
Sverre Rabbelier
54096c3e23 remote-helper: check helper status after import/export
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
2018-08-21 16:34:26 +02:00
Johannes Schindelin
f48536d808 fast-export: do not refer to non-existing marks
When calling `git fast-export a..a b` when a and b refer to the same
commit, nothing would be exported, and an incorrect reset line would
be printed for b ('from :0').

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
2018-08-21 16:31:26 +02:00
nalla
3ead4cf831 mingw: explicitly fflush stdout
For performance reasons `stdout` is not unbuffered by default. That leads
to problems if after printing to `stdout` a read on `stdin` is performed.

For that reason interactive commands like `git clean -i` do not function
properly anymore if the `stdout` is not flushed by `fflush(stdout)` before
trying to read from `stdin`.

In the case of `git clean -i` all reads on `stdin` were preceded by a
`fflush(stdout)` call.

Signed-off-by: nalla <nalla@hamal.uberspace.de>
2018-08-21 16:31:24 +02:00
Karsten Blees
7635d9fe20 add infrastructure for read-only file system level caches
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.

This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.

Enable read-only sections for 'git status' and preload_index.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-08-21 16:23:44 +02:00
Junio C Hamano
39e415cfd1 Merge branch 'nd/cherry-pick-quit-fix'
"git cherry-pick --quit" failed to remove CHERRY_PICK_HEAD even
though we won't be in a cherry-pick session after it returns, which
has been corrected.

* nd/cherry-pick-quit-fix:
  cherry-pick: fix --quit not deleting CHERRY_PICK_HEAD
2018-08-20 12:41:34 -07:00
Junio C Hamano
85c54ecc5f Merge branch 'sb/submodule-cleanup'
A few preliminary minor clean-ups in the area around submodules.

* sb/submodule-cleanup:
  builtin/submodule--helper: remove stray new line
  t7410: update to new style
2018-08-20 12:41:33 -07:00
Junio C Hamano
36f0f344e7 Merge branch 'jt/repack-promisor-packs'
After a partial clone, repeated fetches from promisor remote would
have accumulated many packfiles marked with .promisor bit without
getting them coalesced into fewer packfiles, hurting performance.
"git repack" now learned to repack them.

* jt/repack-promisor-packs:
  repack: repack promisor objects if -a or -A is set
  repack: refactor setup of pack-objects cmd
2018-08-20 12:40:31 -07:00
Junio C Hamano
81eab6871e Merge branch 'js/range-diff'
"git tbdiff" that lets us compare individual patches in two
iterations of a topic has been rewritten and made into a built-in
command.

* js/range-diff: (21 commits)
  range-diff: use dim/bold cues to improve dual color mode
  range-diff: make --dual-color the default mode
  range-diff: left-pad patch numbers
  completion: support `git range-diff`
  range-diff: populate the man page
  range-diff --dual-color: skip white-space warnings
  range-diff: offer to dual-color the diffs
  diff: add an internal option to dual-color diffs of diffs
  color: add the meta color GIT_COLOR_REVERSE
  range-diff: use color for the commit pairs
  range-diff: add tests
  range-diff: do not show "function names" in hunk headers
  range-diff: adjust the output of the commit pairs
  range-diff: suppress the diff headers
  range-diff: indent the diffs just like tbdiff
  range-diff: right-trim commit messages
  range-diff: also show the diff between patches
  range-diff: improve the order of the shown commits
  range-diff: first rudimentary implementation
  Introduce `range-diff` to compare iterations of a topic branch
  ...
2018-08-20 11:33:53 -07:00
Junio C Hamano
dc0f6f9e1d Merge branch 'nd/no-the-index'
The more library-ish parts of the codebase learned to work on the
in-core index-state instance that is passed in by their callers,
instead of always working on the singleton "the_index" instance.

* nd/no-the-index: (24 commits)
  blame.c: remove implicit dependency on the_index
  apply.c: remove implicit dependency on the_index
  apply.c: make init_apply_state() take a struct repository
  apply.c: pass struct apply_state to more functions
  resolve-undo.c: use the right index instead of the_index
  archive-*.c: use the right repository
  archive.c: avoid access to the_index
  grep: use the right index instead of the_index
  attr: remove index from git_attr_set_direction()
  entry.c: use the right index instead of the_index
  submodule.c: use the right index instead of the_index
  pathspec.c: use the right index instead of the_index
  unpack-trees: avoid the_index in verify_absent()
  unpack-trees: convert clear_ce_flags* to avoid the_index
  unpack-trees: don't shadow global var the_index
  unpack-trees: add a note about path invalidation
  unpack-trees: remove 'extern' on function declaration
  ls-files: correct index argument to get_convert_attr_ascii()
  preload-index.c: use the right index instead of the_index
  dir.c: remove an implicit dependency on the_index in pathspec code
  ...
2018-08-20 11:33:53 -07:00
Junio C Hamano
0c54cdaf65 Merge branch 'jk/for-each-object-iteration'
The API to iterate over all objects learned to optionally list
objects in the order they appear in packfiles, which helps locality
of access if the caller accesses these objects while as objects are
enumerated.

* jk/for-each-object-iteration:
  for_each_*_object: move declarations to object-store.h
  cat-file: use a single strbuf for all output
  cat-file: split batch "buf" into two variables
  cat-file: use oidset check-and-insert
  cat-file: support "unordered" output for --batch-all-objects
  cat-file: rename batch_{loose,packed}_object callbacks
  t1006: test cat-file --batch-all-objects with duplicates
  for_each_packed_object: support iterating in pack-order
  for_each_*_object: give more comprehensive docstrings
  for_each_*_object: take flag arguments as enum
  for_each_*_object: store flag definitions in a single location
2018-08-20 11:33:52 -07:00
Junio C Hamano
c757aa2d12 Merge branch 'js/pull-rebase-type-shorthand'
"git pull --rebase=interactive" learned "i" as a short-hand for
"interactive".

* js/pull-rebase-type-shorthand:
  pull --rebase=<type>: allow single-letter abbreviations for the type
2018-08-17 13:09:59 -07:00
Junio C Hamano
8963bb0c2d Merge branch 'rs/parse-opt-lithelp'
The parse-options machinery learned to refrain from enclosing
placeholder string inside a "<bra" and "ket>" pair automatically
without PARSE_OPT_LITERAL_ARGHELP.  Existing help text for option
arguments that are not formatted correctly have been identified and
fixed.

* rs/parse-opt-lithelp:
  parse-options: automatically infer PARSE_OPT_LITERAL_ARGHELP
  shortlog: correct option help for -w
  send-pack: specify --force-with-lease argument help explicitly
  pack-objects: specify --index-version argument help explicitly
  difftool: remove angular brackets from argument help
  add, update-index: fix --chmod argument help
  push: use PARSE_OPT_LITERAL_ARGHELP instead of unbalanced brackets
2018-08-17 13:09:56 -07:00
Stefan Beller
e6b09b184d builtin/submodule--helper: remove stray new line
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-16 10:42:56 -07:00
Nguyễn Thái Ngọc Duy
3e7dd99208 cherry-pick: fix --quit not deleting CHERRY_PICK_HEAD
--quit is supposed to be --abort but without restoring HEAD. Leaving
CHERRY_PICK_HEAD behind could make other commands mistake that
cherry-pick is still ongoing (e.g. "git commit --amend" will refuse to
work). Clean it too.

For --abort, this job of deleting CHERRY_PICK_HEAD is on "git reset"
so we don't need to do anything else. But let's add extra checks in
--abort tests to confirm.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-16 10:02:55 -07:00
Junio C Hamano
b160b6e69d Merge branch 'jt/connectivity-check-after-unshallow'
"git fetch" sometimes failed to update the remote-tracking refs,
which has been corrected.

* jt/connectivity-check-after-unshallow:
  fetch-pack: unify ref in and out param
2018-08-15 15:08:28 -07:00
Junio C Hamano
11ea82ae37 Merge branch 'rs/remote-mv-leakfix'
Leakfix.

* rs/remote-mv-leakfix:
  remote: clear string_list after use in mv()
2018-08-15 15:08:28 -07:00
Junio C Hamano
2e2c24f82a Merge branch 'nd/pack-objects-threading-doc'
Doc fix.

* nd/pack-objects-threading-doc:
  pack-objects: document about thread synchronization
2018-08-15 15:08:27 -07:00
Junio C Hamano
d6628c99fa Merge branch 'jt/tag-following-with-proto-v2-fix'
The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement.  "git
fetch $remote branch:branch" that asks tags that point into the
history leading to the "branch" automatically followed sent to
narrow prefix and broke the tag following, which has been fixed.

* jt/tag-following-with-proto-v2-fix:
  fetch: send "refs/tags/" prefix upon CLI refspecs
  t5702: test fetch with multiple refspecs at a time
2018-08-15 15:08:25 -07:00
Junio C Hamano
7d020f5a78 Merge branch 'jk/size-t'
Code clean-up to use size_t/ssize_t when they are the right type.

* jk/size-t:
  strbuf_humanise: use unsigned variables
  pass st.st_size as hint for strbuf_readlink()
  strbuf_readlink: use ssize_t
  strbuf: use size_t for length in intermediate variables
  reencode_string: use size_t for string lengths
  reencode_string: use st_add/st_mult helpers
2018-08-15 15:08:25 -07:00
Junio C Hamano
4bea8485e3 Merge branch 'nd/i18n'
Many more strings are prepared for l10n.

* nd/i18n: (23 commits)
  transport-helper.c: mark more strings for translation
  transport.c: mark more strings for translation
  sha1-file.c: mark more strings for translation
  sequencer.c: mark more strings for translation
  replace-object.c: mark more strings for translation
  refspec.c: mark more strings for translation
  refs.c: mark more strings for translation
  pkt-line.c: mark more strings for translation
  object.c: mark more strings for translation
  exec-cmd.c: mark more strings for translation
  environment.c: mark more strings for translation
  dir.c: mark more strings for translation
  convert.c: mark more strings for translation
  connect.c: mark more strings for translation
  config.c: mark more strings for translation
  commit-graph.c: mark more strings for translation
  builtin/replace.c: mark more strings for translation
  builtin/pack-objects.c: mark more strings for translation
  builtin/grep.c: mark strings for translation
  builtin/config.c: mark more strings for translation
  ...
2018-08-15 15:08:23 -07:00
Junio C Hamano
2d7a20258f Merge branch 'bw/clone-ref-prefixes'
The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement.  "git
clone" when learned to speak v2 forgot to do so, which has been
corrected.

* bw/clone-ref-prefixes:
  clone: send ref-prefixes when using protocol v2
2018-08-15 15:08:23 -07:00
Junio C Hamano
1689c22c1c Merge branch 'jk/core-use-replace-refs'
A new configuration variable core.usereplacerefs has been added,
primarily to help server installations that want to ignore the
replace mechanism altogether.

* jk/core-use-replace-refs:
  add core.usereplacerefs config option
  check_replace_refs: rename to read_replace_refs
  check_replace_refs: fix outdated comment
2018-08-15 15:08:23 -07:00
Jeff King
0889aae1cd for_each_*_object: move declarations to object-store.h
The for_each_loose_object() and for_each_packed_object()
functions are meant to be part of a unified interface: they
use the same set of for_each_object_flags, and it's not
inconceivable that we might one day add a single
for_each_object() wrapper around them.

Let's put them together in a single file, so we can avoid
awkwardness like saying "the flags for this function are
over in cache.h". Moving the loose functions to packfile.h
is silly. Moving the packed functions to cache.h works, but
makes the "cache.h is a kitchen sink" problem worse. The
best place is the recently-created object-store.h, since
these are quite obviously related to object storage.

The for_each_*_in_objdir() functions do not use the same
flags, but they are logically part of the same interface as
for_each_loose_object(), and share callback signatures. So
we'll move those, as well, as they also make sense in
object-store.h.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14 12:29:57 -07:00
Jeff King
79ed0a5e26 cat-file: use a single strbuf for all output
When we're in batch mode, we end up in batch_object_write()
for each object, which allocates its own strbuf for each
call. Instead, we can provide a single "scratch" buffer that
gets reused for each output. When running:

  git cat-file --batch-all-objects --batch-check='%(objectname)'

on git.git, my best-of-five time drops from:

  real	0m0.171s
  user	0m0.159s
  sys	0m0.012s

to:

  real	0m0.133s
  user	0m0.121s
  sys	0m0.012s

Note that we could do this just by putting the "scratch"
pointer into "struct expand_data", but I chose instead to
add an extra parameter to the callstack. That's more
verbose, but it makes it a bit more obvious what is going
on, which in turn makes it easy to see where we need to be
releasing the string in the caller (right after the loop
which uses it in each case).

Based-on-a-patch-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14 12:29:54 -07:00
Jeff King
54d2f0d945 cat-file: split batch "buf" into two variables
We use the "buf" strbuf for two things: to read incoming
lines, and as a scratch space for test-expanding the
user-provided format. Let's split this into two variables
with descriptive names, which makes their purpose and
lifetime more clear.

It will also help in a future patch when we start using the
"output" buffer for more expansions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14 12:29:00 -07:00
Jeff King
ced9fff75d cat-file: use oidset check-and-insert
We don't need to check if the oidset has our object before
we insert it; that's done as part of the insertion. We can
just rely on the return value from oidset_insert(), which
saves one hash lookup per object.

This measurable speedup is tiny and within the run-to-run
noise, but the result is simpler to read, too.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14 12:27:53 -07:00
Nguyễn Thái Ngọc Duy
ecbbc0a53b blame.c: remove implicit dependency on the_index
Side note, since we gain access to the right repository, we can stop
rely on the_repository in this code as well.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:44 -07:00
Nguyễn Thái Ngọc Duy
82ea77eca7 apply.c: make init_apply_state() take a struct repository
We're moving away from the_index in this code. "struct index_state *"
could be added to struct apply_state. But let's aim long term and put
struct repository here instead so that we could even avoid more global
states in the future. The index will be available via
apply_state->repo->index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:44 -07:00
Nguyễn Thái Ngọc Duy
b612ee202a archive.c: avoid access to the_index
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy
a4009b0b45 grep: use the right index instead of the_index
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy
c4500e251f attr: remove index from git_attr_set_direction()
Since attr checking API now take the index, there's no need to set an
index in advance with this call. Most call sites are straightforward
because they either pass the_index or NULL (which defaults back to
the_index previously). There's only one suspicious call site in
unpack-trees.c where it sets a different index.

This code in unpack-trees is about to check out entries from the
new/temporary index after merging is done in it. The attributes will
be used by entry.c code to do crlf conversion if needed. entry.c now
respects struct checkout's istate field, and this field is correctly
set in unpack-trees.c, there should be no regression from this change.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy
74cfc0ee1d entry.c: use the right index instead of the_index
checkout-index.c needs update because if checkout->istate is NULL,
ie_match_stat() will crash. Previously this is ie_match_stat(&the_index, ..)
so it will not crash, but it is not technically correct either.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy
a52b321d2e ls-files: correct index argument to get_convert_attr_ascii()
write_eolinfo() does take an istate as function argument and it should
be used instead of the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy
6d2df284e7 dir.c: remove an implicit dependency on the_index in pathspec code
Make the match_patchspec API and friends take an index_state instead
of assuming the_index in dir.c. All external call sites are converted
blindly to keep the patch simple and retain current behavior.
Individual call sites may receive further updates to use the right
index instead of the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:42 -07:00
Nguyễn Thái Ngọc Duy
7f944e264e convert.c: remove an implicit dependency on the_index
Make the convert API take an index_state instead of assuming the_index
in convert.c. All external call sites are converted blindly to keep
the patch simple and retain current behavior. Individual call sites
may receive further updates to use the right index instead of
the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:42 -07:00
Nguyễn Thái Ngọc Duy
7a400a2c02 attr: remove an implicit dependency on the_index
Make the attr API take an index_state instead of assuming the_index in
attr code. All call sites are converted blindly to keep the patch
simple and retain current behavior. Individual call sites may receive
further updates to use the right index instead of the_index.

There is one ugly temporary workaround added in attr.c that needs some
more explanation.

Commit c24f3abace (apply: file commited with CRLF should roundtrip
diff and apply - 2017-08-19) forces one convert_to_git() call to NOT
read the index at all. But what do you know, we read it anyway by
falling back to the_index. When "istate" from convert_to_git is now
propagated down to read_attr_from_array() we will hit segfault
somewhere inside read_blob_data_from_index.

The right way of dealing with this is to kill "use_index" variable and
only follow "istate" but at this stage we are not ready for that:
while most git_attr_set_direction() calls just passes the_index to be
assigned to use_index, unpack-trees passes a different one which is
used by entry.c code, which has no way to know what index to use if we
delete use_index. So this has to be done later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:42 -07:00
Nguyễn Thái Ngọc Duy
ff7fe37b05 diff.c: move read_index() code back to the caller
This code is only needed for diff-tree (since f0c6b2a2fd ([PATCH]
Optimize diff-tree -[CM] --stdin - 2005-05-27)). Let the caller do the
preparation instead and avoid read_index() in diff.c code.

read_index() should be avoided (in addition to the_index) because it
uses get_index_file() underneath to get the path $GIT_DIR/index. This
effectively pulls the_repository in and may become the only reason to
pull a 'struct repository *' in diff.c. Let's keep the dependencies as
few as possible and kick it back to diff-tree.c

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:42 -07:00
Jeff King
0750bb5b51 cat-file: support "unordered" output for --batch-all-objects
If you're going to access the contents of every object in a
packfile, it's generally much more efficient to do so in
pack order, rather than in hash order. That increases the
locality of access within the packfile, which in turn is
friendlier to the delta base cache, since the packfile puts
related deltas next to each other. By contrast, hash order
is effectively random, since the sha1 has no discernible
relationship to the content.

This patch introduces an "--unordered" option to cat-file
which iterates over packs in pack-order under the hood. You
can see the results when dumping all of the file content:

  $ time ./git cat-file --batch-all-objects --buffer --batch | wc -c
  6883195596

  real	0m44.491s
  user	0m42.902s
  sys	0m5.230s

  $ time ./git cat-file --unordered \
                        --batch-all-objects --buffer --batch | wc -c
  6883195596

  real	0m6.075s
  user	0m4.774s
  sys	0m3.548s

Same output, different order, way faster. The same speed-up
applies even if you end up accessing the object content in a
different process, like:

  git cat-file --batch-all-objects --buffer --batch-check |
  grep blob |
  git cat-file --batch='%(objectname) %(rest)' |
  wc -c

Adding "--unordered" to the first command drops the runtime
in git.git from 24s to 3.5s.

  Side note: there are actually further speedups available
  for doing it all in-process now. Since we are outputting
  the object content during the actual pack iteration, we
  know where to find the object and could skip the extra
  lookup done by oid_object_info(). This patch stops short
  of that optimization since the underlying API isn't ready
  for us to make those sorts of direct requests.

So if --unordered is so much better, why not make it the
default? Two reasons:

  1. We've promised in the documentation that --batch-all-objects
     outputs in hash order. Since cat-file is plumbing,
     people may be relying on that default, and we can't
     change it.

  2. It's actually _slower_ for some cases. We have to
     compute the pack revindex to walk in pack order. And
     our de-duplication step uses an oidset, rather than a
     sort-and-dedup, which can end up being more expensive.
     If we're just accessing the type and size of each
     object, for example, like:

       git cat-file --batch-all-objects --buffer --batch-check

     my best-of-five warm cache timings go from 900ms to
     1100ms using --unordered. Though it's possible in a
     cold-cache or under memory pressure that we could do
     better, since we'd have better locality within the
     packfile.

And one final question: why is it "--unordered" and not
"--pack-order"? The answer is again two-fold:

  1. "pack order" isn't a well-defined thing across the
     whole set of objects. We're hitting loose objects, as
     well as objects in multiple packs, and the only
     ordering we're promising is _within_ a single pack. The
     rest is apparently random.

  2. The point here is optimization. So we don't want to
     promise any particular ordering, but only to say that
     we will choose an ordering which is likely to be
     efficient for accessing the object content. That leaves
     the door open for further changes in the future without
     having to add another compatibility option.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 13:48:31 -07:00
Jeff King
b1adb38458 cat-file: rename batch_{loose,packed}_object callbacks
We're not really doing the batch-show operation in these
callbacks, but just collecting the set of objects. That
distinction will become more important in a future patch, so
let's rename them now to avoid cluttering that diff.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 13:48:30 -07:00
Johannes Schindelin
275267937b range-diff: make --dual-color the default mode
After using this command extensively for the last two months, this
developer came to the conclusion that even if the dual color mode still
leaves a lot of room for confusion about what was actually changed, the
non-dual color mode is substantially worse in that regard.

Therefore, we really want to make the dual color mode the default.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 10:44:52 -07:00
Johannes Schindelin
31cf61a080 range-diff: offer to dual-color the diffs
When showing what changed between old and new commits, we show a diff of
the patches. This diff is a diff between diffs, therefore there are
nested +/- signs, and it can be relatively hard to understand what is
going on.

With the --dual-color option, the preimage and the postimage are colored
like the diffs they are, and the *outer* +/- sign is inverted for
clarity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 10:44:51 -07:00