Elijah Newren 557ac0350d merge-ort: begin performance work; instrument with trace2_region_* calls
Add some timing instrumentation for both merge-ort and diffcore-rename;
I used these to measure and optimize performance in both, and several
future patch series will build on these to reduce the timings of some
select testcases.

=== Setup ===

The primary testcase I used involved rebasing a random topic in the
linux kernel (consisting of 35 patches) against an older version.  I
added two variants, one where I rename a toplevel directory, and another
where I only rebase one patch instead of the whole topic.  The setup is
as follows:

  $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  $ git branch hwmon-updates fd8bdb23b91876ac1e624337bb88dc1dcc21d67e
  $ git branch hwmon-just-one fd8bdb23b91876ac1e624337bb88dc1dcc21d67e~34
  $ git branch base 4703d9119972bf586d2cca76ec6438f819ffa30e
  $ git switch -c 5.4-renames v5.4
  $ git mv drivers pilots  # Introduce over 26,000 renames
  $ git commit -m "Rename drivers/ to pilots/"
  $ git config merge.renameLimit 30000
  $ git config merge.directoryRenames true

=== Testcases ===

Now with REBASE standing for either "git rebase [--merge]" (using
merge-recursive) or "test-tool fast-rebase" (using merge-ort), the
testcases are:

Testcase #1: no-renames

  $ git checkout v5.4^0
  $ REBASE --onto HEAD base hwmon-updates

  Note: technically the name is misleading; there are some renames, but
  very few.  Rename detection only takes about half the overall time.

Testcase #2: mega-renames

  $ git checkout 5.4-renames^0
  $ REBASE --onto HEAD base hwmon-updates

Testcase #3: just-one-mega

  $ git checkout 5.4-renames^0
  $ REBASE --onto HEAD base hwmon-just-one

=== Timing results ===

Overall timings, using hyperfine (1 warmup run, 3 runs for mega-renames,
10 runs for the other two cases):

                       merge-recursive           merge-ort
    no-renames:       18.912 s ±  0.174 s    14.263 s ±  0.053 s
    mega-renames:   5964.031 s ± 10.459 s  5504.231 s ±  5.150 s
    just-one-mega:   149.583 s ±  0.751 s   158.534 s ±  0.498 s

A single re-run of each with some breakdowns:

                                    ---  no-renames  ---
                              merge-recursive   merge-ort
    overall runtime:              19.302 s        14.257 s
    inexact rename detection:      7.603 s         7.906 s
    everything else:              11.699 s         6.351 s

                                    --- mega-renames ---
                              merge-recursive   merge-ort
    overall runtime:            5950.195 s      5499.672 s
    inexact rename detection:   5746.309 s      5487.120 s
    everything else:             203.886 s        17.552 s

                                    --- just-one-mega ---
                              merge-recursive   merge-ort
    overall runtime:             151.001 s       158.582 s
    inexact rename detection:    143.448 s       157.835 s
    everything else:               7.553 s         0.747 s

=== Timing observations ===

0) Maximum speedup

The "everything else" row represents the maximum speedup we could
achieve if we were to somehow infinitely parallelize inexact rename
detection, but leave everything else alone.  The fact that this is so
much smaller than the real runtime (even in the case with virtually no
renames) makes it clear just how overwhelmingly large the time spent on
rename detection can be.

1) no-renames

1a) merge-ort is faster than merge-recursive, which is nice.  However,
this still should not be considered good enough.  Although the "merge"
backend to rebase (merge-recursive) is sometimes faster than the "apply"
backend, this is one of those cases where it is not.  In fact, even
merge-ort is slower.  The "apply" backend can complete this testcase in
    6.940 s ± 0.485 s
which is about 2x faster than merge-ort and 3x faster than
merge-recursive.  One goal of the merge-ort performance work will be to
make it faster than git-am on this (and similar) testcases.

2) mega-renames

2a) Obviously rename detection is a huge cost; it's where most the time
is spent.  We need to cut that down.  If we could somehow infinitely
parallelize it and drive its time to 0, the merge-recursive time would
drop to about 204s, and the merge-ort time would drop to about 17s.  I
think this particular stat shows I've subtly baked a couple performance
improvements into merge-ort and into fast-rebase already.

3) just-one-mega

3a) not much to say here, it just gives some flavor for how rebasing
only one patch compares to rebasing 35.

=== Goals ===

This patch is obviously just the beginning.  Here are some of my goals
that this measurement will help us achieve:

* Drive the cost of rename detection down considerably for merges
* After the above has been achieved, see if there are other slowness
  factors (which would have previously been overshadowed by rename
  detection costs) which we can then focus on and also optimize.
* Ensure our rebase testcase that requires little rename detection
  is noticeably faster with merge-ort than with apply-based rebase.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Taylor Blau <ttaylorr@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 23:30:06 -08:00
2020-12-08 14:48:16 -08:00
2020-11-09 14:06:25 -08:00
2020-09-22 12:36:28 -07:00
2020-08-13 11:02:15 -07:00
2020-12-08 15:11:21 -08:00
2020-12-08 15:11:17 -08:00
2020-12-18 15:15:18 -08:00
2020-11-21 15:14:38 -08:00
2020-09-02 14:39:25 -07:00
2020-11-18 13:32:53 -08:00
2020-08-11 18:04:11 -07:00
2020-12-17 15:06:40 -08:00
2020-08-10 10:23:57 -07:00
2021-01-06 23:33:44 -08:00
2021-01-06 23:33:44 -08:00
2020-08-27 14:04:49 -07:00
2020-07-06 22:09:13 -07:00
2020-08-24 14:54:31 -07:00
2020-11-12 09:40:06 -08:00
2020-07-30 19:18:06 -07:00
2020-12-18 15:15:18 -08:00
2020-11-02 13:17:44 -08:00
2020-11-21 15:14:38 -08:00
2020-08-27 14:04:49 -07:00
2020-07-30 19:18:06 -07:00
2020-07-28 15:02:17 -07:00
2020-09-03 12:37:04 -07:00
2020-08-10 10:23:57 -07:00
2020-11-09 14:06:25 -08:00
2020-11-11 13:18:38 -08:00
2021-01-06 23:33:44 -08:00
2020-07-30 19:18:06 -07:00
2020-11-30 13:55:54 -08:00
2020-07-30 19:18:06 -07:00
2020-06-25 12:27:47 -07:00
2020-12-14 15:01:03 -08:00
2020-07-30 19:18:06 -07:00
2020-07-28 15:02:17 -07:00
2020-12-08 15:11:18 -08:00
2020-11-21 15:14:38 -08:00
2020-11-25 15:24:52 -08:00
2020-12-08 15:11:18 -08:00
2020-08-28 14:07:09 -07:00
2020-07-30 19:18:06 -07:00
2020-07-06 22:09:13 -07:00
2020-11-21 15:14:38 -08:00
2020-12-14 10:21:36 -08:00
2020-10-27 15:09:50 -07:00
2021-01-06 23:33:44 -08:00
2020-10-27 15:09:49 -07:00
2020-12-08 15:11:20 -08:00
2020-11-21 15:14:38 -08:00
2020-07-28 15:02:17 -07:00
2020-08-11 18:04:11 -07:00
2020-09-02 14:39:25 -07:00
2020-07-30 19:18:06 -07:00
2020-07-30 19:18:06 -07:00
2020-07-30 19:18:06 -07:00
2020-07-28 15:02:17 -07:00
2020-11-02 13:17:46 -08:00
2020-10-05 14:01:52 -07:00

Build status

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them give a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks
Description
No description provided
Readme 669 MiB
Languages
C 50.5%
Shell 38.8%
Perl 4.4%
Tcl 3.2%
Python 0.8%
Other 2.1%