Commit Graph

11 Commits

Author SHA1 Message Date
SZEDER Gábor
e3106998ff line-log: show all line ranges touched by the same diff range
When line-level log is invoked with more than one disjoint line range
in the same file, and one of the commits happens to change that file
such that one diff range modifies more than one line range, then
changes to all modified line ranges should be shown, but only the
changes in the first modified line range are:

  $ git log --oneline -p
  80ca903 (HEAD -> master) Initial
  diff --git a/file b/file
  new file mode 100644
  index 0000000..00935f1
  --- /dev/null
  +++ b/file
  @@ -0,0 +1,10 @@
  +Line 1
  +Line 2
  +Line 3
  +Line 4
  +Line 5
  +Line 6
  +Line 7
  +Line 8
  +Line 9
  +Line 10
  $ git log --oneline -L1,2:file -L4,5:file -L7,8:file
  80ca903 (HEAD -> master) Initial

  diff --git a/file b/file
  --- /dev/null
  +++ b/file
  @@ -0,0 +1,2 @@
  +Line 1
  +Line 2

The line-log-specific diff printer is already clever enough to handle
the case when one line range covers multiple diff ranges, but the
possibility of one diff range touching multiple disjoint line ranges
was apparently overlooked.

Add the necessary condition to dump_diff_hacky_one() to handle this case
as well, and show all modified line ranges:

  $ git log --oneline -L1,2:file -L4,5:file -L7,8:file
  0f9a5b4 (HEAD -> master) Initial

  diff --git a/file b/file
  --- /dev/null
  +++ b/file
  @@ -0,0 +1,2 @@
  +Line 1
  +Line 2
  @@ -0,0 +4,2 @@
  +Line 4
  +Line 5
  @@ -0,0 +7,2 @@
  +Line 7
  +Line 8

This bug was already present in the initial line-log implementation
added in 2da1d1f6f (Implement line-history search (git log -L),
2013-03-28).  Interestingly, that commit already contained a canned
test case covering a similar scenario:

  "-L '/long f/',/^}/:a.c -L /main/,/^}/:a.c simple"

This test case looks for two line ranges in the same file, and both
trace back disjointly to the test repository's inital commit,
therefore changes to both line ranges should have been shown for the
initial commit, but only changes for the first line range are shown.
So this test case should have failed from the very beginning, but it
never did, because, unfortunately, the canned expected result is
incorrect, as it doesn't include changes for the second line range.

A similar test with a similarly incorrect canned expected result was
added later in 209618860c (log -L: fix overlapping input ranges,
2013-04-05).

Correct these two canned expected results to contain the changes for
the second line range for the initial commit as well.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-20 13:48:21 -07:00
SZEDER Gábor
ab60c693a2 line-log: fix assertion error
When line-level log is invoked with more than one disjoint line range
in the same file, and one of the commits happens to change that file
such that:

  - the last line of a line range R(n) immediately preceeds the first line
    modified or added by a hunk H, and
  - subtracting the number of lines added by hunk H from the start and
    end of the subsequent line range R(n+1) would result in a range
    overlapping with line range R(n),

then git aborts with an assertion error, because those overlapping
line ranges violate the invariants:

  $ git log --oneline -p
  73e4e2f (HEAD -> master) Add lines 6 7 8 9 10
  diff --git a/file b/file
  index 572d5d9..00935f1 100644
  --- a/file
  +++ b/file
  @@ -3,3 +3,8 @@ Line 2
   Line 3
   Line 4
   Line 5
  +Line 6
  +Line 7
  +Line 8
  +Line 9
  +Line 10
  66e3561 Add lines 1 2 3 4 5
  diff --git a/file b/file
  new file mode 100644
  index 0000000..572d5d9
  --- /dev/null
  +++ b/file
  @@ -0,0 +1,5 @@
  +Line 1
  +Line 2
  +Line 3
  +Line 4
  +Line 5
  $ git log --oneline -L3,5:file -L7,8:file
  git: line-log.c:73: range_set_append: Assertion `rs->nr == 0 || rs->ranges[rs->nr-1].end <= a' failed.
  Aborted (core dumped)

The line-log machinery encodes line and diff ranges internally as
[start, end) pairs, i.e. include 'start' but exclude 'end', and line
numbering starts at 0 (as opposed to the -LX,Y option, where it starts
at 1, IOW the parameter -L3,5 is represented internally as { start =
2, end = 5 }).

The reason for this assertion error and some related issues is that
there are a couple of places where 'end' is mistakenly considered to
be part of the range:

  - When a commit modifies an interesting path, the line-log machinery
    first checks which diff range (i.e. hunk) modify any line ranges.
    This is done in diff_ranges_filter_touched(), where the outer loop
    iterates over the diff ranges, and in each iteration the inner
    loop advances the line ranges supposedly until the current line
    range ends at or after the current diff range starts, and then the
    current diff and line ranges are checked for overlap.

    For HEAD in the above example the first line range [2, 5) ends
    just before the diff range [5, 10) starts, so the inner loop
    should advance, and then the second line range [6, 8) and the diff
    range should be checked for overlap.

    Unfortunately, the condition of the inner loop mistakenly
    considers 'end' as part of the line range, and, seeing the diff
    range starting at 5 and the line range ending at 5, it doesn't
    skip the first range.  Consequently, the diff range and the first
    line range are checked for overlap, and after that the outer loop
    runs out of diff ranges, and then the processing goes on in the
    false belief that this commit didn't touch any of the interesting
    line ranges.

    The line-log machinery later shifts the line ranges to account for
    any added/removed lines in the diff ranges preceeding each line
    range.  This leaves the first line range intact, but attempts to
    shift the second line range [6, 8) by 5 lines towards the
    beginning of the file, resulting in [1, 3), triggering the
    assertion error, because the two overlapping line ranges violate
    the invariants.

    Fix that loop condition in diff_ranges_filter_touched() to not
    treat 'end' as part of the line range.

  - With the above fix the assertion error is gone... but, alas, we
    now get stuck in an endless loop!

    This happens in range_set_difference(), where a couple of nested
    loops iterate over the line and diff ranges, and a condition is
    supposed to break the middle loop when the current line range ends
    before the current diff range, so processing could continue with
    the next line range.

    For HEAD in the above example the first line range [2, 5) ends
    just before the diff range [5, 10) starts, so this condition
    should trigger and break the middle loop.

    Unfortunately, just like in the case of the assertion error, this
    conditions mistakenly considers 'end' as part of the line range,
    and, seeing the line range ending at 5 and the diff range starting
    at 5, it doesn't break the loop, which will then go on and on.

    Fix this condition in range_set_difference() to not treat 'end' as
    part of the line range.

  - With the above fix the endless loop is gone... but, alas, the
    output is now wrong, as it shows both line ranges for HEAD, even
    though the first line range is not modified by that commit:

      $ git log --oneline -L3,5:file -L7,8:file
      73e4e2f (HEAD -> master) Add lines 6 7 8 9 10

      diff --git a/file b/file
      --- a/file
      +++ b/file
      @@ -3,3 +3,3 @@
       Line 3
       Line 4
       Line 5
      @@ -6,0 +7,2 @@
      +Line 7
      +Line 8
      66e3561 Add lines 1 2 3 4 5

      diff --git a/file b/file
      --- /dev/null
      +++ b/file
      @@ -0,0 +3,3 @@
      +Line 3
      +Line 4
      +Line 5

    In dump_diff_hacky_one() a couple of nested loops are responsible
    for finding and printing the modified line ranges: the big outer
    loop iterates over all line ranges, and the first inner loop skips
    over the diff ranges that end before the start of the current line
    range.  This is followed by a condition checking whether the
    current diff range starts after the end of the current line range,
    which, when fulfilled, continues and advances the outer loop to
    the next line range.

    For HEAD in the above example the first line range [2, 5) ends
    just before the diff range [5, 10), so this condition should
    trigger, and the outer loop should advance to the second line
    range.

    Unfortunately, just like in the previous cases, this condition
    mistakenly considers 'end' as part of the line range, and, seeing
    the first line range ending at 5 and the diff range starting at 5,
    it doesn't continue to advance the outher loop, but goes on to
    show the (unmodified) first line range.

    Fix this condition to not treat 'end' as part of the line range,
    just like in the previous cases.

After all this the command in the above example finally finishes and
produces the right output:

  $ git log --oneline -L3,5:file -L7,8:file
  73e4e2f (HEAD -> master) Add lines 6 7 8 9 10

  diff --git a/file b/file
  --- a/file
  +++ b/file
  @@ -6,0 +7,2 @@
  +Line 7
  +Line 8
  66e3561 Add lines 1 2 3 4 5

  diff --git a/file b/file
  --- /dev/null
  +++ b/file
  @@ -0,0 +3,3 @@
  +Line 3
  +Line 4
  +Line 5

Add a canned test similar to the above example, with the line ranges
adjusted to the test repository's history.

Reported-by: Evgeni Chasnovski <evgeni.chasnovski@gmail.com>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-20 13:48:21 -07:00
Johannes Schindelin
8f37854b18 t4*: adjust the references to the default branch name "main"
Carefully excluding t4013 and t4015, which see independent development
elsewhere at the time of writing, we use `main` as the default branch
name in t4*. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t4*.sh t4211/*.export &&
	   git checkout HEAD -- t4013\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
brian m. carlson
dfa5f53e78 t4211: add test cases for SHA-256
There are already files containing example output for SHA-1.  Add test
files providing example output for SHA-256 as well and adjust the test
to look up the appropriate ones based on the algorithm in use.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-07 11:07:30 -08:00
brian m. carlson
f743e8f5b3 t4211: move SHA-1-specific test cases into a directory
In preparation for adding SHA-256 support to this test, let's move the
SHA-1-specific expected output into a directory called "sha1".  This
will allow us to add a similar directory for SHA-256 as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-07 11:07:30 -08:00
Eric Sunshine
18d472db6f t4211: fix broken test when one -L range is subset of another
t4211 attempts to test multiple git-log -L ranges where one range is a
superset of the other, and falsely succeeds because its "expected"
output is incorrect.

Overlapping -L ranges handed to git-log are coalesced by
line-log.c:sort_and_merge_range_set() into a set of non-overlapping,
disjoint ranges. When one range is a subset of another,
sort_and_merge_range_set() should coalesce both ranges to the superset
range, but instead the coalesced range often is incorrectly truncated to
the end of the subset range. For example, ranges 2-8 and 3-4 are
coalesced incorrectly to 2-4.

One can observe this incorrect behavior with git-log -L using the test
repository created by t4211. The superset/subset ranges t4211 employs
are 4-$ and 8-12 (where $ represents end-of-file). The coalesced range
should be 4-$. Manually invoking git-log with the same ranges the test
employs, we see:

  % git log -L 4:a.c simple |
    awk '/^commit [0-9a-f]{40}/ { print substr($2,1,7) }'
  4659538
  100b61a
  39b6eb2
  a6eb826
  f04fb20
  de4c48a

  % git log -L 8,12:a.c simple | awk ...
  f04fb20
  de4c48a

  % git log -L 4:a.c -L 8,12:a.c simple | awk ...
  a6eb826
  f04fb20
  de4c48a

This last output is incorrect. 8-12 is a subset of 4-$, hence the output
of the coalesced range should be the same as the 4-$ output shown first.
In fact, the above incorrect output is the truncated bogus range 4-12:

  % git log -L 4,12:a.c simple | awk ...
  a6eb826
  f04fb20
  de4c48a

Fix the test to correctly fail in the presence of the
sort_and_merge_range_set() coalescing bug. Do so by changing the
"expected" output to the commits mentioned in the 4-$ output above.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-09 09:24:59 -07:00
Thomas Rast
d51c5274e4 log -L: test merge of parallel modify/rename
This tests a toy example of a history like

  * Merge
  | \
  |  * Modify foo
  |  |
  *  | Rename foo->bar
  | /
  * Create foo

Current log -L fails on this; we'll fix it in the next commit.

Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-12 11:34:37 -07:00
Thomas Rast
035ff3987b t4211: pass -M to 'git log -M -L...' test
Embarrassingly, the -M test did not actually invoke -M, and thus not
really test the feature.

Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-12 11:34:12 -07:00
Thomas Rast
209618860c log -L: fix overlapping input ranges
The existing code was too defensive, and would trigger the assert in
range_set_append() if the user gave overlapping ranges.

The intent was always to define overlapping ranges as just the union
of all of them, as evidenced by the call to sort_and_merge_range_set().
(Which was already used, unlike what the comment said.)

Fix by splitting out the meat of range_set_append() to a new _unsafe()
function that lacks the paranoia.  sort_and_merge_range_set will fix
up the ranges, so we don't need the checks there.

Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-05 10:39:09 -07:00
Thomas Rast
13b8f68c1f log -L: :pattern:file syntax to find by funcname
This new syntax finds a funcname matching /pattern/, and then takes from there
up to (but not including) the next funcname.  So you can say

  git log -L:main:main.c

and it will dig up the main() function and show its line-log, provided
there are no other funcnames matching 'main'.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-28 10:30:04 -07:00
Thomas Rast
12da1d1f6f Implement line-history search (git log -L)
This is a rewrite of much of Bo's work, mainly in an effort to split
it into smaller, easier to understand routines.

The algorithm is built around the struct range_set, which encodes a
series of line ranges as intervals [a,b).  This is used in two
contexts:

* A set of lines we are tracking (which will change as we dig through
  history).
* To encode diffs, as pairs of ranges.

The main routine is range_set_map_across_diff().  It processes the
diff between a commit C and some parent P.  It determines which diff
hunks are relevant to the ranges tracked in C, and computes the new
ranges for P.

The algorithm is then simply to process history in topological order
from newest to oldest, computing ranges and (partial) diffs.  At
branch points, we need to merge the ranges we are watching.  We will
find that many commits do not affect the chosen ranges, and mark them
TREESAME (in addition to those already filtered by pathspec limiting).
Another pass of history simplification then gets rid of such commits.

This is wired as an extra filtering pass in the log machinery.  This
currently only reduces code duplication, but should allow for other
simplifications and options to be used.

Finally, we hook a diff printer into the output chain.  Ideally we
would wire directly into the diff logic, to optionally use features
like word diff.  However, that will require some major reworking of
the diff chain, so we completely replace the output with our own diff
for now.

As this was a GSoC project, and has quite some history by now, many
people have helped.  In no particular order, thanks go to

  Jakub Narebski <jnareb@gmail.com>
  Jens Lehmann <Jens.Lehmann@web.de>
  Jonathan Nieder <jrnieder@gmail.com>
  Junio C Hamano <gitster@pobox.com>
  Ramsay Jones <ramsay@ramsay1.demon.co.uk>
  Will Palmer <wmpalmer@gmail.com>

Apologies to everyone I forgot.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-28 10:29:22 -07:00