Commit Graph

75368 Commits

Author SHA1 Message Date
Patrick Steinhardt
6014639837 reftable/generic: drop interface
The `reftable_table` interface provides a generic infrastructure that
can abstract away whether the underlying table is a single table, or a
merged table. This abstraction can make it rather hard to reason about
the code. We didn't ever use it to implement the reftable backend, and
with the preceding patches in this patch series we in fact don't use it
at all anymore. Furthermore, it became somewhat useless with the recent
refactorings that made it possible to seek reftable iterators multiple
times, as these now provide generic access to tables for us. The
interface is thus redundant and only brings unnecessary complexity with
it.

Remove the `struct reftable_table` interface and its associated
functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
Patrick Steinhardt
89191232b8 t/helper: refactor to not use struct reftable_table
The `struct reftable_table` interface in our "reftable" test helper gets
used such that we can easily print either a single table, or a merged
stack. This generic interface is about to go away.

Prepare the code for this change by using merged tables instead. When
printing the stack we've already got one. When using a single table, we
can create a merged table from it to adapt.

This removes the last user of the generic interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
Patrick Steinhardt
1f39dd2ae5 t/helper: use hash_to_hex_algop() to print hashes
The "reftable" test helper uses a hand-crafted version to convert from a
raw hash to its hex variant. This was done because this code used to be
part of the reftable library, where we do not use most functions from
the Git core.

Now that the code is integrated into the "dump-reftable" helper though,
that limitation went away. Let's thus use `hash_to_hex_algop()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
Patrick Steinhardt
42c424d69d t/helper: inline printing of reftable records
Move printing of reftable records into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt
64a5b7a8ca t/helper: inline reftable_table_print()
Move `reftable_table_print()` into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt
ca74ef6ffb t/helper: inline reftable_stack_print_directory()
Move `reftable_stack_print_directory()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.

Note that this requires us to remove the tests for this functionality in
`reftable/stack_test.c`. The test does not really add much anyway,
because all it verifies is that we do not crash or run into an error,
and it specifically doesn't check the outputted data. Also, as the code
is now part of the test helper, it doesn't make much sense to have a
unit test for it in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt
22f519a9a0 t/helper: inline reftable_reader_print_file()
Move `reftable_reader_print_file()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt
2b06b28fd6 t/helper: inline reftable_dump_main()
The printing functionality part of `reftable/dump.c` is really only used
by our "dump-reftable" test helper. It is certainly not generic logic
that is useful to anybody outside of Git, and the format it generates is
quite specific. Still, parts of it are used in our test suite and the
output may be useful to take a peek into reftable stacks, tables and
blocks. So while it does not make sense to expose this as part of the
reftable library, it does make sense to keep it around.

Inline the `reftable_dump_main()` function into the "dump-reftable" test
helper. This clarifies that its format is subject to change and not part
of our public interface. Furthermore, this allows us to iterate on the
implementation in subsequent patches.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt
55c7ff42f9 reftable/dump: drop unused compact_stack()
The `compact_stack()` function is exposed via `reftable_dump_main()`,
which ultimately ends up being wired into "test-tool reftable". It is
never used by our tests though, and nowadays we have wired up support
for stack compaction into git-pack-refs(1).

Remove the code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
Patrick Steinhardt
f2406c81b9 reftable/generic: move generic iterator code into iterator interface
Move functions relating to the reftable iterator from "generic.c" into
"iter.c". This prepares for the removal of the former subsystem.

While at it, remove some unneeded braces to conform to our coding style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
Patrick Steinhardt
b34ce7e911 reftable/iter: drop double-checking logic
The filtering ref iterator can be used to only yield refs which are not
in a specific skip list. This iterator has an option to double-check the
results it returns, which causes us to seek the reference we are about
to yield via a separate table such that we detect whether the reference
that the first iterator has yielded actually exists.

The value of this is somewhat dubious, and I cannot think of any usecase
where this functionality should be required. Furthermore, this option is
never set in our codebase, which means that it is essentially untested.
And last but not least, the `struct reftable_table` that is used to
implement it is about to go away.

So while we could refactor the code to not use a `reftable_table`, it
very much feels like a wasted effort. Let's just drop this code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
Patrick Steinhardt
aef8602653 reftable/stack: open-code reading refs
To read a reference for the reftable stack, we first create a generic
`reftable_table` from the merged table and then read the reference via a
convenience function. We are about to remove these generic interfaces,
so let's instead open-code the logic to prepare for this removal.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
Patrick Steinhardt
b8ca235ca5 reftable/merged: stop using generic tables in the merged table
The merged table provides access to a reftable stack by merging the
contents of those tables into a virtual table. These subtables are being
tracked via `struct reftable_table`, which is a generic interface for
accessing either a single reftable or a merged reftable. So in theory,
it would be possible for the merged table to merge together other merged
tables.

This is somewhat nonsensical though: we only ever set up a merged table
over normal reftables, and there is no reason to do otherwise. This
generic interface thus makes the code way harder to follow and reason
about than really necessary. The abstraction layer may also have an
impact on performance, even though the extra set of vtable function
calls probably doesn't really matter.

Refactor the merged tables to use a `struct reftable_reader` for each of
the subtables instead, which gives us direct access to the underlying
tables. Adjust names accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
Patrick Steinhardt
6631ed3ce7 reftable/merged: rename reftable_new_merged_table()
Rename `reftable_new_merged_table()` to `reftable_merged_table_new()`
such that the name matches our coding style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
Patrick Steinhardt
987762a51a reftable/merged: expose functions to initialize iterators
We do not expose any functions via our public headers that would allow a
caller to initialize a reftable iterator from a merged table. Instead,
they are expected to go via the generic `reftable_table` interface,
which is somewhat roundabout.

Implement two new functions to initialize iterators for ref and log
records to plug this gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:45 -07:00
Junio C Hamano
3a7362eb9f The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 12:02:25 -07:00
Junio C Hamano
74f94f27a9 Merge branch 'jc/how-to-maintain-updates'
Doc updates.

* jc/how-to-maintain-updates:
  howto-maintain: mention preformatted docs
2024-08-21 12:02:25 -07:00
Junio C Hamano
eb630683c2 Merge branch 'jk/apply-patch-mode-check-fix'
Test fix.

* jk/apply-patch-mode-check-fix:
  t4129: fix racy index when calling chmod after git-add
2024-08-21 12:02:25 -07:00
Junio C Hamano
b772c9cf2e Merge branch 'ps/bundle-outside-repo-fix'
"git bundle unbundle" outside a repository triggered a BUG()
unnecessarily, which has been corrected.

* ps/bundle-outside-repo-fix:
  bundle: default to SHA1 when reading bundle headers
  builtin/bundle: have unbundle check for repo before opening its bundle
2024-08-21 12:02:24 -07:00
Junio C Hamano
fdf70da8c3 Merge branch 'jc/grammo-fixes'
Doc updates.

* jc/grammo-fixes:
  doc: grammofix in git-diff-tree
  tutorial: grammofix
2024-08-21 12:02:24 -07:00
Junio C Hamano
d97956b8bd Merge branch 'ag/git-svn-global-ignores'
"git svn" has been taught about svn:global-ignores property
recent versions of Subversion has.

* ag/git-svn-global-ignores:
  git-svn: mention `svn:global-ignores` in help+docs
  git-svn: use `svn:global-ignores` to create .gitignore
  git-svn: add public property `svn:global-ignores`
2024-08-21 12:02:23 -07:00
Patrick Steinhardt
8311e3b551 builtin/maintenance: fix loose objects task emitting pack hash
The "loose-objects" maintenance tasks executes git-pack-objects(1) to
pack all loose objects into a new packfile. This command ends up
printing the hash of the packfile to stdout though, which clutters the
output of `git maintenance run`.

Fix this issue by disabling stdout of the child process.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 11:33:22 -07:00
Patrick Steinhardt
51a0b8a2a7 t7900: exercise detaching via trace2 regions
In t7900, we exercise the `--detach` logic by checking whether the
command ended up writing anything to its output or not. This supposedly
works because we close stdin, stdout and stderr when daemonizing. But
one, it breaks on platforms where daemonize is a no-op, like Windows.
And second, that git-maintenance(1) outputs anything at all in these
tests is a bug in the first place that we'll fix in a subsequent commit.

Introduce a new trace2 region around the detach which allows us to more
explicitly check whether the detaching logic was executed. This is a
much more direct way to exercise the logic, provides a potentially
useful signal to tracing logs and also works alright on platforms which
do not have the ability to daemonize.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
[jc: dropped a stale in-code comment from a test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 11:33:02 -07:00
Chandra Pratap
772408fe75 t-reftable-block: add tests for index blocks
In the current testing setup, block operations are left unexercised
for index blocks. Add a test that exercises these operations for
index blocks.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap
1528c481d7 t-reftable-block: add tests for obj blocks
In the current testing setup, block operations are left unexercised
for obj blocks. Add a test that exercises these operations for obj
blocks.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap
5cba56173b t-reftable-block: add tests for log blocks
In the current testing setup, block operations are only exercised
for ref blocks. Add another test that exercises these operations
for log blocks as well.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap
abcddcef3d t-reftable-block: remove unnecessary variable 'j'
Currently, there are two variables for array indices, 'i' and 'j'.
The variable 'j' is used only once and can be easily replaced with
'i'. Get rid of 'j' and replace its occurence with 'i'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap
29ee6d5a20 t-reftable-block: use xstrfmt() instead of xstrdup()
Use xstrfmt() to assign a formatted string to a ref record's
refname instead of xstrdup(). This helps save the overhead of
a local 'char' buffer as well as makes the test more compact.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap
31216ee28a t-reftable-block: use block_iter_reset() instead of block_iter_close()
block_iter_reset() restores a block iterator to its state at the time
of initialization without freeing any memory while block_iter_close()
deallocates the memory for the iterator.

In the current testing setup, a block iterator is allocated and
deallocated for every iteration of a loop, which hurts performance.
Improve upon this by using block_iter_reset() at the start of each
iteration instead. This has the added benifit of testing
block_iter_reset(), which currently remains untested.

Similarly, remove reftable_record_release() for a reftable record
that is still in use.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap
c25cbcd352 t-reftable-block: use reftable_record_key() instead of strbuf_addstr()
In the current testing setup, the record key required for many block
iterator functions is manually stored in a strbuf struct and then
passed to these functions. This is not ideal when there exists a
dedicated function to encode a record's key into a strbuf, namely
reftable_record_key(). Use this function instead of manual encoding.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Chandra Pratap
e638e9c8f3 t-reftable-block: use reftable_record_equal() instead of check_str()
In the current testing setup, operations like read and write for
reftable blocks as defined by reftable/block.{c, h} are verified by
comparing only the keys of input and output reftable records. This is
not ideal because there can exist inequal reftable records with the
same key. Use the dedicated function for record comparison,
reftable_record_equal(), instead of key-based comparison.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Chandra Pratap
353672f9f8 t-reftable-block: release used block reader
Used block readers must be released using block_reader_release() to
prevent the occurence of a memory leak. Make test_block_read_write()
conform to this statement.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Chandra Pratap
6853b931bd t: harmonize t-reftable-block.c with coding guidelines
Harmonize the newly ported test unit-tests/t-reftable-block.c
with the following guidelines:
- Single line 'for' statements must omit curly braces.
- Structs must be 0-initialized with '= { 0 }' instead of '= { NULL }'.
- Array sizes and indices should preferably be of type 'size_t'and
  not 'int'.
- Return code variable should preferably be named 'ret', not 'n'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Chandra Pratap
546cc0d64e t: move reftable/block_test.c to the unit testing framework
reftable/block_test.c exercises the functions defined in
reftable/block.{c, h}. Migrate reftable/block_test.c to the unit
testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests to follow the unit-tests'
naming conventions.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Matheus Tavares
4bdd6b7bf2 rebase --exec: respect --quiet
rebase --exec doesn't obey --quiet and ends up printing messages about
the command being executed:

  git rebase HEAD~3 --quiet --exec true
  Executing: true
  Executing: true
  Executing: true

Let's fix that by omitting the "Executing" messages when using --quiet.

Furthermore, the sequencer code includes a few calls to
term_clear_line(), which prints a special character sequence to erase
the previous line displayed on stderr (even when nothing was printed
yet). For an user running the command interactively, the net effect of
calling this function with or without --quiet is the same as the
characters are invisible in the terminal. However, when redirecting the
output to a file or piping to another command, the presence of these
invisible characters is noticeable, and it may break user expectation as
--quiet is not being respected.

We could skip the term_clear_line() calls when --quiet is used, like we
are doing with the "Executing" messages, but it makes much more sense to
condition the line cleaning upon stderr being TTY, since these
characters are really only useful for TTY outputs.

The added test checks for both these two changes.

Reported-by: Lincoln Yuji <lincolnyuji@hotmail.com>
Reported-by: Rodrigo Siqueira <siqueirajordao@riseup.net>
Signed-off-by: Matheus Tavares <matheus.tavb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 08:57:51 -07:00
Junio C Hamano
80ccd8a260 Sync with 'maint' for Windows+VS build jobs used at CI 2024-08-20 14:24:57 -07:00
Junio C Hamano
870e227a67 Merge branch 'jk/midx-unused-fix'
Code clean-up in the base topic.

* jk/midx-unused-fix:
  midx: drop unused parameters from add_midx_to_chain()
2024-08-20 14:23:46 -07:00
Junio C Hamano
6a562e68a3 Merge branch 'js/ci-win-vs-build' into maint-2.46
Sync with Windows+VS build jobs used at CI.

* js/ci-win-vs-build:
  ci(win+VS): download the vcpkg artifacts using a dedicated GitHub Action
  ci: bump microsoft/setup-msbuild from v1 to v2
2024-08-20 14:23:12 -07:00
Junio C Hamano
be10ac7037 mailinfo: we parse fixed headers
The code was written as if we have a small room to add additional
headers to be parsed to the header[] array at runtime, but that is
not our intention at all.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 14:20:58 -07:00
Junio C Hamano
44db6f75cc CodingGuidelines: spaces around C operators
As we have operated with "write like how your surrounding code is
written" for too long, after a huge code drop from another project,
we'll end up being inconsistent before such an imported code is
cleaned up.  We have many uses of cast operator with a space before
its operand, mostly in the reftable code.

Spell the convention out before it spreads to other places.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 14:10:10 -07:00
Junio C Hamano
2df380c280 Merge branch 'ps/leakfixes-part-4' into ps/leakfixes-part-5
* ps/leakfixes-part-4: (22 commits)
  builtin/diff: free symmetric diff members
  diff: free state populated via options
  builtin/log: fix leak when showing converted blob contents
  userdiff: fix leaking memory for configured diff drivers
  builtin/format-patch: fix various trivial memory leaks
  diff: fix leak when parsing invalid ignore regex option
  unpack-trees: clear index when not propagating it
  sequencer: release todo list on error paths
  merge-ort: unconditionally release attributes index
  builtin/fast-export: plug leaking tag names
  builtin/fast-export: fix leaking diff options
  builtin/fast-import: plug trivial memory leaks
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/rebase: fix leaking `commit.gpgsign` value
  config: fix leaking comment character config
  submodule-config: fix leaking name entry when traversing submodules
  read-cache: fix leaking hashfile when writing index fails
  bulk-checkin: fix leaking state TODO
  object-name: fix leaking symlink paths in object context
  object-file: fix memory leak when reading corrupted headers
  ...
2024-08-20 10:15:27 -07:00
Ghanshyam Thakkar
05026637f3 t: migrate t0110-urlmatch-normalization to the new framework
helper/test-urlmatch-normalization along with
t0110-urlmatch-normalization test the `url_normalize()` function from
'urlmatch.h'. Migrate them to the unit testing framework for better
performance. And also add different test_msg()s for better debugging.

In the migration, last two of the checks from `t_url_general_escape()`
were slightly changed compared to the shell script. This involves
changing

'\'' -> '
'\!' -> !

in the urls of those checks. This is because in C strings, we don't
need to escape "'" and "!". Other than these two, all the urls were
pasted verbatim from the shell script.

Another change is the removal of a MINGW prerequisite from one of the
test. It was there because[1] on Windows, the command line is a
Unicode string, it is not possible to pass arbitrary bytes to a
program. But in unit tests we don't have this limitation.

And since we can construct strings with arbitrary bytes in C, let's
also remove the test files which contain URLs with arbitrary bytes in
the 't/t0110' directory and instead embed those URLs in the unit test
code itself.

[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 10:08:28 -07:00
Jeff King
a6bcb3ca01 t-hashmap: stop calling setup() for t_intern() test
Commit f24a9b78a9 (t-hashmap: mark unused parameters in callback
function, 2024-08-17) noted that the t_intern() does not need its
hashmap parameter, but we have to keep it to conform to the function
pointer interface of setup().

But since the only thing setup() does is create and tear down the
hashmap, we can just skip calling setup() entirely for this case, and
drop the unused parameters. This simplifies the code a bit.

Helped-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:33:18 -07:00
Avi Halachmi (:avih)
fbcdfab348 git-prompt: support custom 0-width PS1 markers
When using colors, the shell needs to identify 0-width substrings
in PS1 - such as color escape sequences - when calculating the
on-screen width of the prompt.

Until now, we used the form %F{<color>} in zsh - which it knows is
0-width, or otherwise use standard SGR esc sequences wrapped between
byte values 1 and 2 (SOH, STX) as 0-width start/end markers, which
bash/readline identify as such.

But now that more shells are supported, the standard SGR sequences
typically work, but the SOH/STX markers might not be identified.

This commit adds support for vars GIT_PS1_COLOR_{PRE,POST} which
set custom 0-width markers or disable the markers.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:19 -07:00
Avi Halachmi (:avih)
0dbe3d3f16 git-prompt: ta-da! document usage in other shells
With one big exception, git-prompt.sh should now be both almost posix
compliant, and also compatible with most (posix-ish) shells.

That exception is the use of "local" vars in functions, which happens
extensively in the current code, and is not simple to replace with
posix compliant code (but also not impossible).

Luckily, almost all shells support "local" as used by the current
code, with the notable exception of ksh93[u+m], but also the Schily
minimal posix sh (pbosh), and yash in posix mode.

See assessment below that "local" is likely the only blocker in those.

So except mainly ksh93, git-prompt.sh now works in most shells:
- bash, zsh, dash since at least 0.5.8, free/net bsd sh, busybox-ash,
  mksh, openbsd sh, pdksh(!), Schily extended Bourne sh (bosh), yash.

which is quite nice.

As an anecdote, replacing the 1st line in __git_ps1() (local exit=$?)
with these 2 makes it work in all tested shells, even without "local":

  # handles only 0/1 args for simplicity. needs +5 LOC for any $#
  __git_e=$?; local exit="$__git_e" 2>/dev/null ||
    {(eval 'local() { export "$@"; }'; __git_ps1 "$@"); return "$__git_e"; }

Explanation:

  If the shell doesn't have the command "local", define our own
  function "local" which instead does plain (global) assignents.
  Then use __git_ps1 in a subshell to not clober the caller's vars.

  This happens to work because currently there are no name conflicts
  (shadow) at the code, initial value is not assumed (i.e. always
  doing either 'local x=...'  or 'local x;...  x=...'), and assigned
  initial values are quoted (local x="$y"), preventing word split and
  glob expansion (i.e. assignment context is not assumed).

  The last two (always init, quote values) seem to be enough to use
  "local" portably if supported, and otherwise shells indeed differ.

  Uses "eval", else shells with "local" may reject it during parsing.
  We don't need "export", but it's smaller than writing our own loop.

While cute, this approach is not really sustainable because all the
vars become global, which is hard to maintain without conflicts
(but hey, it currently has no conflicts - without even trying...).

However, regardless of being an anecdote, it provides some support to
the assessment that "local" is the only blocker in those shells.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:19 -07:00
Avi Halachmi (:avih)
29bcec82a6 git-prompt: don't use shell $'...'
$'...' is new in POSIX (2024), and some shells support it in recent
versions, while others have had it for decades (bash, zsh, ksh93).

However, there are still enough shells which don't support it, and
it's cheap to use an alternative form which works in all shells,
so let's do that instead of dismissing it as "it's compliant".

It was agreed to use one form rather than $'...' where supported and
fallback otherwise.

shells where $'...' works:
- bash, zsh, ksh93, mksh, busybox-ash, dash master, free/net bsd sh.

shells where it doesn't work, but the new fallback works:
- all dash releases (up to 0.5.12), older versions of free/net bsd sh,
  openbsd sh, pdksh, all Schily Bourne sh variants, yash.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:18 -07:00
Avi Halachmi (:avih)
b732e08671 git-prompt: add some missing quotes
The issues which this commit fixes are unlikely to be broken
in real life, but the fixes improve correctness, and would prevent
bugs in some uncommon cases, such as weird IFS values.

Listing some portability guidelines here for future reference.

I'm leaving it to someone else to decide whether to include
it in the file itself, place it as a new file, or not.

---------

The command "local" is non standard, but is allowed in this file:
- Quote initialization if it can expand (local x="$y"). See below.
- Don't assume initial value after "local x". Either initialize it
  (local x=..), or set before first use (local x;.. x=..; <use $x>).
  (between shells, "local x" can unset x, or inherit it, or do x= )

Other non-standard features beyond "local" are to be avoided.

Use the standard "test" - [...] instead of non-standard [[...]] .

--------

Quotes (some portability things, but mainly general correctness):

Quotes prevent tilde-expansion of some unquoted literal tildes (~).
If the expansion is undesirable, quotes would ensure that.
  Tilds expanded: a=~user:~/ ;  echo ~user ~/dir
  not expanded:   t="~"; a=${t}user  b=\~foo~;  echo "~user" $t/dir

But the main reason for quoting is to prevent IFS field splitting
(which also coalesces IFS chars) and glob expansion in parts which
contain parameter/arithmetic expansion or command substitution.

"Simple command" (POSIX term) is assignment[s] and/or command [args].
Examples:
  foo=bar         # one assignment
  foo=$bar x=y    # two assignments
  foo bar         # command, no assignments
  x=123 foo bar   # one assignment and a command

The assignments part is not IFS-split or glob-expanded.

The command+args part does get IFS field split and glob expanded,
but only at unquoted expanded/substituted parts.

In the command+args part, expanded/substituted values must be quoted.
(the commands here are "[" and "local"):
  Good: [ "$mode" = yes ]; local s="*" x="$y" e="$?" z="$(cmd ...)"
  Bad:  [ $mode = yes ];   local s=*   x=$y   e=$?   z=$(cmd...)

The arguments to "local" do look like assignments, but they're not
the assignment part of a simple command; they're at the command part.

Still at the command part, no need to quote non-expandable values:
  Good:                 local x=   y=yes;   echo OK
  OK, but not required: local x="" y="yes"; echo "OK"
But completely empty (NULL) arguments must be quoted:
  foo ""   is not the same as:   foo

Assignments in simple commands - with or without an actual command,
don't need quoting becase there's no IFS split or glob expansion:
  Good:   s=* a=$b c=$(cmd...)${x# foo }${y-   } [cmd ...]
  It's also OK to use double quotes, but not required.

This behavior (no IFS/glob) is called "assignment context", and
"local" does not behave with assignment context in some shells,
hence we require quotes when using "local" - for compatibility.

The value between 'case' and 'in' doesn't IFS-split/glob-expand:
  Good:       case  * $foo $(cmd...)  in ... ; esac
  identical:  case "* $foo $(cmd...)" in ... ; esac

Nested quotes in command substitution are fine, often necessary:
  Good: echo "$(foo... "$x" "$(bar ...)")"

Nested quotes in substring ops are legal, and sometimes needed
to prevent interpretation as a pattern, but not the most readable:
  Legal:  foo "${x#*"$y" }"

Nested quotes in "maybe other value" subst are invalid, unnecessary:
  Good:  local x="${y- }";   foo "${z:+ $a }"
  Bad:   local x="${y-" "}"; foo "${z:+" $a "}"
Outer/inner quotes in "maybe other value" have different use cases:
  "${x-$y}"  always one quoted arg: "$x" if x is set, else "$y".
  ${x+"$x"}  one quoted arg "$x" if x is set, else no arg at all.
  Unquoted $x is similar to the second case, but it would get split
  into few arguments if it includes any of the IFS chars.

Assignments don't need the outer quotes, and the braces delimit the
value, so nested quotes can be avoided, for readability:
  a=$(foo "$x")  a=${x#*"$y" }  c=${y- };  bar "$a" "$b" "$c"

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:18 -07:00
Avi Halachmi (:avih)
fe445a1026 git-prompt: replace [[...]] with standard code
The existing [[...]] tests were either already valid as standard [...]
tests, or only required minimal retouch:

Notes:

- [[...]] doesn't do field splitting and glob expansion, so $var
  or $(cmd...) don't need quoting, but [... does need quotes.

- [[ X == Y ]] when Y is a string is same as [ X = Y ], but if Y is
  a pattern, then we need:  case X in Y)... ; esac  .

- [[ ... && ... ]] was replaced with [ ... ] && [ ... ] .

- [[ -o <zsh-option> ]] requires [[...]], so put it in "eval" and only
  eval it in zsh, so other shells would not abort on syntax error
  (posix says [[ has unspecified results, shells allowed to reject it)

- ((x++)) was changed into x=$((x+1))  (yeah, not [[...]] ...)

Shells which accepted the previous forms:
- bash, zsh, ksh93, mksh, openbsd sh, pdksh.

Shells which didn't, and now can process it:
- dash, free/net bsd sh, busybox-ash, Schily Bourne sh, yash.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:18 -07:00
Avi Halachmi (:avih)
f2e264e43f git-prompt: don't use shell arrays
Arrays only existed in the svn-upstream code, used to:
- Keep a list of svn remotes.
- Convert commit msg to array of words, extract the 2nd-to-last word.

Except bash/zsh, nearly all shells failed load on syntax errors here.

Now:
- The svn remotes are a list of newline-terminated values.
- The 2nd-to-last word is extracted using standard shell substrings.
- All shells can digest the svn-upstream code.

While using shell field splitting to extract the word is simple, and
doesn't even need non-standard code, e.g. set -- $(git log -1 ...),
it would have the same issues as the old array code: it depends on IFS
which we don't control, and it's subject to glob-expansion, e.g. if
the message happens to include * or **/* (as this commit message just
did), then the array could get huge. This was not great.

Now it uses standard shell substrings, and we know the exact delimiter
to expect, because it's the match from our grep just one line earlier.

The new word extraction code also fixes svn-upstream in zsh, because
previously it used arr[len-2], but because in zsh, unlike bash, array
subscripts are 1-based, it incorrectly extracted the 3rd-to-last word.
symptom: missing upstream status in a git-svn repo: u=, u+N-M, etc.

The breakage in zsh is surprising, because it was last touched by
  commit d0583da838 (prompt: fix show upstream with svn and zsh),
claiming to fix exactly that. However, it only mentions syntax fixes.
It's unclear if behavior was fixed too. But it was broken, now fixed.

Note LF=$'\n' and then using $LF instead of $'\n' few times.
A future commit will add fallback for shells without $'...', so this
would be the only line to touch instead of replacing every $'\n' .

Shells which could run the previous array code:
- bash

Shells which have arrays but were broken anyway:
- zsh: 1-based subscript
- ksh93: no "local" (the new code can't fix this part...)
- mksh, openbsd sh, pdksh: failed load on syntax error: "for ((...))".

More shells which Failed to load due to syntax error:
- dash, free/net bsd sh, busybox-ash, Schily Bourne shell, yash.

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:18 -07:00
Avi Halachmi (:avih)
6df4b09159 git-prompt: fix uninitialized variable
First use is in the form:  local var; ...; var=$var$whatever...

If the variable was unset (as bash and others do after "local x"),
then it would error if set -u is in effect.

Also, many shells inherit the existing value after "local var"
without init, but in this case it's unlikely to have a prior value.

Now we initialize it.

(local var= is enough, but local var="" is the custom in this file)

Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:28:17 -07:00