Enable callers to generate reachability bitmaps when performing MIDX
layer compaction by combining all existing bitmaps from the compacted
layers.
Note that because of the object/pack ordering described by the previous
commit, the pseudo-pack order for the compacted MIDX is the same as
concatenating the individual pseudo-pack orderings for each layer in the
compaction range.
As a result, the only non-test or documentation change necessary is to
treat all objects as non-preferred during compaction so as not to
disturb the object ordering.
In the future, we may want to adjust which commit(s) receive
reachability bitmaps when compacting multiple .bitmap files into one, or
even generate new bitmaps (e.g., if the references have moved
significantly since the .bitmap was generated). This commit only
implements combining all existing bitmaps in range together in order to
demonstrate and lay the groundwork for more exotic strategies.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When managing a MIDX chain with many layers, it is convenient to combine
a sequence of adjacent layers into a single layer to prevent the chain
from growing too long.
While it is conceptually possible to "compact" a sequence of MIDX layers
together by running "git multi-pack-index write --stdin-packs", there
are a few drawbacks that make this less than desirable:
- Preserving the MIDX chain is impossible, since there is no way to
write a MIDX layer that contains objects or packs found in an earlier
MIDX layer already part of the chain. So callers would have to write
an entirely new (non-incremental) MIDX containing only the compacted
layers, discarding all other objects/packs from the MIDX.
- There is (currently) no way to write a MIDX layer outside of the MIDX
chain to work around the above, such that the MIDX chain could be
reassembled substituting the compacted layers with the MIDX that was
written.
- The `--stdin-packs` command-line option does not allow us to specify
the order of packs as they appear in the MIDX. Therefore, even if
there were workarounds for the previous two challenges, any bitmaps
belonging to layers which come after the compacted layer(s) would no
longer be valid.
This commit introduces a way to compact a sequence of adjacent MIDX
layers into a single layer while preserving the MIDX chain, as well as
any bitmap(s) in layers which are newer than the compacted ones.
Implementing MIDX compaction does not require a significant number of
changes to how MIDX layers are written. The main changes are as follows:
- Instead of calling `fill_packs_from_midx()`, we call a new function
`fill_packs_from_midx_range()`, which walks backwards along the
portion of the MIDX chain which we are compacting, and adds packs one
layer a time.
In order to preserve the pseudo-pack order, the concatenated pack
order is preserved, with the exception of preferred packs which are
always added first.
- After adding entries from the set of packs in the compaction range,
`compute_sorted_entries()` must adjust the `pack_int_id`'s for all
objects added in each fanout layer to match their original
`pack_int_id`'s (as opposed to the index at which each pack appears
in `ctx.info`).
Note that we cannot reuse `midx_fanout_add_midx_fanout()` directly
here, as it unconditionally recurs through the `->base_midx`. Factor
out a `_1()` variant that operates on a single layer, reimplement
the existing function in terms of it, and use the new variant from
`midx_fanout_add_compact()`.
Since we are sorting the list of objects ourselves, the order we add
them in does not matter.
- When writing out the new 'multi-pack-index-chain' file, discard any
layers in the compaction range, replacing them with the newly written
layer, instead of keeping them and placing the new layer at the end
of the chain.
This ends up being sufficient to implement MIDX compaction in such a way
that preserves bitmaps corresponding to more recent layers in the MIDX
chain.
The tests for MIDX compaction are so far fairly spartan, since the main
interesting behavior here is ensuring that the right packs/objects are
selected from each layer, and that the pack order is preserved despite
whether or not they are sorted in lexicographic order in the original
MIDX chain.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since c39fffc1c9 (tests: start asserting that *.txt SYNOPSIS matches -h
output, 2022-10-13), the manual page for 'git multi-pack-index' has a
SYNOPSIS section which differs from 'git multi-pack-index -h'.
Correct this while also documenting additional options accepted by the
'write' sub-command.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since fcb2205b77 (midx: implement support for writing incremental MIDX
chains, 2024-08-06), the command-line options '--incremental' and
'--bitmap' were declared to be incompatible with one another when
running 'git multi-pack-index write'.
However, since 27afc272c4 (midx: implement writing incremental MIDX
bitmaps, 2025-03-20), that incompatibility no longer exists, despite the
documentation saying so. Correct this by removing the stale reference to
their incompatibility.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All multi-pack-index sub-commands (write, verify, repack, and expire)
support a '--progress' command-line option, despite not listing it as
one of the common options in `common_opts`.
As a result each sub-command declares its own `OPT_BIT()` for a
"--progress" command-line option. Centralize this within the
`common_opts` to avoid re-declaring it in each sub-command.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For better searchability, this commit adds a check to ensure that parameters
expressed in the form of `--[no-]parameter` are not used in the
documentation. In the place of such parameters, the documentation should
list two separate parameters: `--parameter` and `--no-parameter`.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify what happens when an object exists in more than one pack, but
not in the preferred pack. "git multi-pack-index repack" relies on ties
for objects that are not in the preferred pack being resolved in favor
of the newest pack that contains a copy of the object. If ties were
resolved in favor of the oldest pack as the current documentation
suggests the multi-pack index would not reference any of the objects in
the pack created by "git multi-pack-index repack".
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We presently use the ".txt" extension for our AsciiDoc files. While not
wrong, most editors do not associate this extension with AsciiDoc,
meaning that contributors don't get automatic editor functionality that
could be useful, such as syntax highlighting and prose linting.
It is much more common to use the ".adoc" extension for AsciiDoc files,
since this helps editors automatically detect files and also allows
various forges to provide rich (HTML-like) rendering. Let's do that
here, renaming all of the files and updating the includes where
relevant. Adjust the various build scripts and makefiles to use the new
extension as well.
Note that this should not result in any user-visible changes to the
documentation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>