Johannes Schindelin a3795bf0e6 tests(mingw): avoid very slow mingw_test_cmp
When Git's test suite uses `test_cmp`, it is not actually trying to
compare binary files as the name `cmp` would suggest to users familiar
with Unix' tools, but the tests instead verify that actual output
matches the expected text.

On Unix, `cmp` works well enough for Git's purposes because only Line
Feed characters are used as line endings. However, on Windows, while
most tools accept Line Feeds as line endings, many tools produce
Carriage Return + Line Feed line endings, including some of the tools
used by the test suite (which are therefore provided via Git for Windows
SDK). Therefore, `cmp` would frequently fail merely due to different
line endings.

To accommodate for that, the `mingw_test_cmp` function was introduced
into Git's test suite to perform a line-by-line comparison that ignores
line endings. This function is a Bash function that is only used on
Windows, everywhere else `cmp` is used.

This is a double whammy because `cmp` is fast, and `mingw_test_cmp` is
slow, even more so on Windows because it is a Bash script function, and
Bash scripts are known to run particularly slowly on Windows due to
Bash's need for the POSIX emulation layer provided by the MSYS2 runtime.

The commit message of 32ed3314c1 (t5351: avoid using `test_cmp` for
binary data, 2022-07-29) provides an illuminating account of the
consequences: On Windows, the platform on which Git could really use all
the help it can get to improve its performance, the time spent on one
entire test script was reduced from half an hour to less than half a
minute merely by avoiding a single call to `mingw_test_cmp` in but a
single test case.

Learning the lesson to avoid shell scripting wherever possible, the Git
for Windows project implemented a minimal replacement for
`mingw_test_cmp` in the form of a `test-tool` subcommand that parses the
input files line by line, ignoring line endings, and compares them.
Essentially the same thing as `mingw_test_cmp`, but implemented in
C instead of Bash. This solution served the Git for Windows project
well, over years.

However, when this solution was finally upstreamed, the conclusion was
reached that a change to use `git diff --no-index` instead of
`mingw_test_cmp` was more easily reviewed and hence should be used
instead.

The reason why this approach was not even considered in Git for Windows
is that in 2007, there was already a motion on the table to use Git's
own diff machinery to perform comparisons in Git's test suite, but it
was dismissed in https://lore.kernel.org/git/xmqqbkrpo9or.fsf@gitster.g/
as undesirable because tests might potentially succeed due to bugs in
the diff machinery when they should not succeed, and those bugs could
therefore hide regressions that the tests try to prevent.

By the time Git for Windows' `mingw-test-cmp` in C was finally
contributed to the Git mailing list, reviewers agreed that the diff
machinery had matured enough and should be used instead.

When the concern was raised that the diff machinery, due to its
complexity, would perform substantially worse than the test helper
originally implemented in the Git for Windows project, a test
demonstrated that these performance differences are well lost within the
100+ minutes it takes to run Git's test suite on Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 07:18:06 +09:00
2022-10-01 19:10:41 +08:00
2022-07-10 14:43:34 -07:00
2022-10-06 09:59:17 -07:00
2022-08-03 13:36:09 -07:00
2022-07-14 15:08:29 -07:00
2022-06-10 15:04:13 -07:00
2022-07-19 12:45:31 -07:00
2022-09-15 16:09:46 -07:00
2022-03-28 10:25:52 -07:00
2021-09-23 13:44:48 -07:00
2022-10-06 20:00:33 -04:00
2022-07-19 16:40:19 -07:00
2022-04-06 15:21:59 -07:00
2022-05-02 09:50:37 -07:00
2022-09-14 12:56:39 -07:00
2022-09-14 12:56:39 -07:00
2022-09-14 12:56:39 -07:00
2022-06-03 14:30:34 -07:00
2022-08-29 14:55:11 -07:00
2022-08-22 15:08:30 -07:00
2022-09-14 12:56:39 -07:00
2021-08-24 15:32:37 -07:00
2022-09-14 12:56:39 -07:00
2022-08-05 14:13:12 -07:00
2022-08-03 13:36:09 -07:00
2022-09-02 10:02:56 -07:00
2022-04-06 09:42:12 -07:00
2022-10-06 17:42:02 -04:00
2022-08-30 14:16:49 -07:00
2022-10-06 20:00:04 -04:00
2022-03-23 14:09:29 -07:00
2022-05-02 09:50:37 -07:00

Build status

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission and Documentation/CodingGuidelines).

Those wishing to help with error message, usage and informational message string translations (localization l10) should see po/README.md (a po file is a Portable Object file that holds the translations).

To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org (not the Git list). The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them give a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks
Description
No description provided
Readme 575 MiB
Languages
C 50.4%
Shell 38.7%
Perl 4.4%
Tcl 3.1%
Python 0.8%
Other 2.4%