Further update to the i18n alias support to avoid regressions.
* jh/alias-i18n-fixes:
git, help: fix memory leaks in alias listing
alias: treat empty subsection [alias ""] as plain [alias]
doc: fix list continuation in alias subsection example
"git add <submodule>" has been taught to honor
submodule.<name>.ignore that is set to "all" (and requires "git add
-f" to override it).
* cs/add-skip-submodule-ignore-all:
Documentation: update add --force option + ignore=all config
tests: fix existing tests when add an ignore=all submodule
tests: t2206-add-submodule-ignored: ignore=all and add --force tests
read-cache: submodule add need --force given ignore=all configuration
read-cache: update add_files_to_cache take param ignored_too
Allow hook commands to be defined (possibly centrally) in the
configuration files, and run multiple of them for the same hook
event.
* ar/config-hooks:
hook: add -z option to "git hook list"
hook: allow out-of-repo 'git hook' invocations
hook: allow event = "" to overwrite previous values
hook: allow disabling config hooks
hook: include hooks from the config
hook: add "git hook list" command
hook: run a list of hooks to prepare for multihook support
hook: add internal state alloc/free callbacks
The example showing the equivalence between alias.last and
alias.last.command was missing the list continuation marks (+
between the shell session block and the following prose, leaving
the paragraph detached from the list item in the rendered output.
Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git maintenance" starts using the "geometric" strategy by default.
* ps/maintenance-geometric-default:
builtin/maintenance: use "geometric" strategy by default
t7900: prepare for switch of the default strategy
t6500: explicitly use "gc" strategy
t5510: explicitly use "gc" strategy
t5400: explicitly use "gc" strategy
t34xx: don't expire reflogs where it matters
t: disable maintenance where we verify object database structure
t: fix races caused by background maintenance
"git status" learned to show comparison between the current branch
and various other branches listed on status.compareBranches
configuration.
* hn/status-compare-with-push:
status: add status.compareBranches config for multiple branch comparisons
refactor format_branch_comparison in preparation
Allow the directory in which reference backends store their data to
be specified.
* kn/ref-location:
refs: add GIT_REFERENCE_BACKEND to specify reference backend
refs: allow reference location in refstorage config
refs: receive and use the reference storage payload
refs: move out stub modification to generic layer
refs: extract out `refs_create_refdir_stubs()`
setup: don't modify repo in `create_reference_database()`
Add a new configuration variable status.compareBranches that allows
users to specify a space-separated list of branch comparisons in
git status output.
Supported values:
- @{upstream} for the current branch's upstream tracking branch
- @{push} for the current branch's push destination
Any other value is ignored and a warning is shown.
When not configured, the default behavior is equivalent to setting
`status.compareBranches = @{upstream}`, preserving backward
compatibility.
The advice messages shown are context-aware:
- "git pull" advice is shown only when comparing against @{upstream}
- "git push" advice is shown only when comparing against @{push}
- Divergence advice is shown for upstream branch comparisons
This is useful for triangular workflows where the upstream tracking
branch differs from the push destination, allowing users to see their
status relative to both branches at once.
Example configuration:
[status]
compareBranches = @{upstream} @{push}
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'extensions.refStorage' config is used to specify the reference
backend for a given repository. Both the 'files' and 'reftable' backends
utilize the $GIT_DIR as the reference folder by default in
`get_main_ref_store()`.
Since the reference backends are pluggable, this means that they could
work with out-of-tree reference directories too. Extend the 'refStorage'
config to also support taking an URI input, where users can specify the
reference backend and the location.
Add the required changes to obtain and propagate this value to the
individual backends. Add the necessary documentation and tests.
Traditionally, for linked worktrees, references were stored in the
'$GIT_DIR/worktrees/<wt_id>' path. But when using an alternate reference
storage path, it doesn't make sense to store the main worktree
references in the new path, and the linked worktree references in the
$GIT_DIR. So, let's store linked worktree references in
'$ALTERNATE_REFERENCE_DIR/worktrees/<wt_id>'. To do this, create the
necessary files and folders while also adding stubs in the $GIT_DIR path
to ensure that it is still considered a Git directory.
Ideally, we would want to pass in a `struct worktree *` to individual
backends, instead of passing the `gitdir`. This allows them to handle
worktree specific logic. Currently, that is not possible since the
worktree code is:
- Tied to using the global `the_repository` variable.
- Is not setup before the reference database during initialization of
the repository.
Add a TODO in 'refs.c' to ensure we can eventually make that change.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-gc(1) command has been introduced in the early days of Git in
30f610b7b0 (Create 'git gc' to perform common maintenance operations.,
2006-12-27) as the main repository maintenance utility. And while the
tool has of course evolved since then to cover new parts, the basic
strategy it uses has never really changed much.
It is safe to say that since 2006 the Git ecosystem has changed quite a
bit. Repositories tend to be much larger nowadays than they have been
almost 20 years ago, and large parts of the industry went crazy for
monorepos (for various wildly different definitions of "monorepo"). So
the maintenance strategy we used back then may not be the best fit
nowadays anymore.
Arguably, most of the maintenance tasks that git-gc(1) does are still
perfectly fine today: repacking references, expiring various data
structures and things like tend to not cause huge problems. But the big
exception is the way we repack objects.
git-gc(1) by default uses a split strategy: it performs incremental
repacks by default, and then whenever we have too many packs we perform
a large all-into-one repack. This all-into-one repack is what is causing
problems nowadays, as it is an operation that is quite expensive. While
it is wasteful in small- and medium-sized repositories, in large repos
it may even be prohibitively expensive.
We have eventually introduced git-maintenance(1) that was slated as a
replacement for git-gc(1). In contrast to git-gc(1), it is much more
flexible as it is structured around configurable tasks and strategies.
So while its default "gc" strategy still uses git-gc(1) under the hood,
it allows us to iterate.
A second strategy it knows about is the "incremental" strategy, which we
configure when registering a repository for scheduled maintenance. This
strategy isn't really a full replacement for git-gc(1) though, as it
doesn't know to expire unused data structures. In Git 2.52 we have thus
introduced a new "geometric" strategy that is a proper replacement for
the old git-gc(1).
In contrast to the incremental/all-into-one split used by git-gc(1), the
new "geometric" strategy maintains a geometric progression of packfiles,
which significantly reduces the number of all-into-one repacks that we
have to perform in large repositories. It is thus a much better fit for
large repositories than git-gc(1).
Note that the "geometric" strategy isn't perfect though: while we
perform way less all-into-one repacks compared to git-gc(1), we still
have to perform them eventually. But for the largest repositories out
there this may not be an option either, as client machines might not be
powerful enough to perform such a repack in the first place. These cases
would thus still be covered by the "incremental" strategy.
Switch the default strategy away from "gc" to "geometric", but retain
the "incremental" strategy configured when registering background
maintenance with `git maintenance register`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the alias configuration syntax to allow aliases using
characters outside ASCII alphanumeric (plus '-').
* jh/alias-i18n:
completion: fix zsh alias listing for subsection aliases
alias: support non-alphanumeric names via subsection syntax
alias: prepare for subsection aliases
help: use list_aliases() for alias listing
A handful of places used refs_for_each_ref_in() API incorrectly,
which has been corrected.
* ps/for-each-ref-in-fixes:
bisect: simplify string_list memory handling
bisect: fix misuse of `refs_for_each_ref_in()`
pack-bitmap: fix bug with exact ref match in "pack.preferBitmapTips"
pack-bitmap: deduplicate logic to iterate over preferred bitmap tips
Add the ability for empty events to clear previously set multivalue
variables, so the newly added "hook.*.event" behave like the other
multivalued keys.
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hooks specified via configs are always enabled, however users
might want to disable them without removing from the config,
like locally disabling a global hook.
Add a hook.<name>.enabled config which defaults to true and
can be optionally set for each configured hook.
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the hook.[hc] library to parse configs to populate the list of
hooks to run for a given event.
Multiple commands can be specified for a given hook by providing
"hook.<friendly-name>.command = <path-to-hook>" and
"hook.<friendly-name>.event = <hook-event>" lines.
Hooks will be started in config order of the "hook.<name>.event"
lines and will be run sequentially (.jobs == 1) like before.
Running the hooks in parallel will be enabled in a future patch.
The "traditional" hook from the hookdir is run last, if present.
A strmap cache is added to struct repository to avoid re-reading
the configs on each rook run. This is useful for hooks like the
ref-transaction which gets executed multiple times per process.
Examples:
$ git config --get-regexp "^hook\."
hook.bar.command=~/bar.sh
hook.bar.event=pre-commit
# Will run ~/bar.sh, then .git/hooks/pre-commit
$ git hook run pre-commit
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "pack.preferBitmapTips" configuration allows the user to specify
which references should be preferred when generating bitmaps. This
option is typically expected to be set to a reference prefix, like for
example "refs/heads/".
It's not unreasonable though for a user to configure one specific
reference as preferred. But if they do, they'll hit a `BUG()`:
$ git -c pack.preferBitmapTips=refs/heads/main repack -adb
BUG: ../refs/iterator.c:366: attempt to trim too many characters
error: pack-objects died of signal 6
The root cause for this bug is how we enumerate these references. We
call `refs_for_each_ref_in()`, which will:
- Yield all references that have a user-specified prefix.
- Trim each of these references so that the prefix is removed.
Typically, this function is called with a trailing slash, like
"refs/heads/", and in that case things work alright. But if the function
is called with the name of an existing reference then we'll try to trim
the full reference name, which would leave us with an empty name. And as
this would not really leave us with anything sensible, we call `BUG()`
instead of yielding this reference.
One could argue that this is a bug in `refs_for_each_ref_in()`. But the
question then becomes what the correct behaviour would be:
- Do we want to skip exact matches? In our case we certainly don't
want that, as the user has asked us to generate a bitmap for it.
- Do we want to yield the reference with the empty refname? That would
lead to a somewhat weird result.
Neither of these feel like viable options, so calling `BUG()` feels like
a sensible way out. The root cause ultimately is that we even try to
trim the whole refname in the first place. There are two possible ways
to fix this issue:
- We can fix the bug by using `refs_for_each_fullref_in()` instead,
which does not strip the prefix at all. Consequently, we would now
start to accept all references that start with the configured
prefix, including exact matches. So if we had "refs/heads/main", we
would both match "refs/heads/main" and "refs/heads/main-branch".
- Or we can fix the bug by appending a slash to the prefix if it
doesn't already have one. This would mean that we only match
ref hierarchies that start with this prefix.
While the first fix leaves the user with strictly _more_ configuration
options, we have already fixed a similar case in 10e8a9352b (refs.c:
stop matching non-directory prefixes in exclude patterns, 2025-03-06) by
using the second option. So for the sake of consistency, let's apply the
same fix here.
Clarify the documentation accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git alias names are limited to ASCII alphanumeric characters and
dashes because aliases are implemented as config variable names.
This prevents aliases being created in languages using characters outside that range.
Add support for arbitrary alias names by using config subsections:
[alias "förgrena"]
command = branch
The subsection name is matched as-is (case-sensitive byte comparison),
while the existing definition without a subsection (e.g.,
"[alias] co = checkout") remains case-insensitive for backward
compatibility. This uses existing config infrastructure since
subsections already support arbitrary bytes, and avoids introducing
Unicode normalization.
Also teach the help subsystem about the new syntax so that "git help
-a" properly lists subsection aliases and the autocorrect feature can
suggest them. Use utf8_strwidth() instead of strlen() for column
alignment so that non-ASCII alias names display correctly.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"auto filter" logic for large-object promisor remote.
* cc/lop-filter-auto:
fetch-pack: wire up and enable auto filter logic
promisor-remote: change promisor_remote_reply()'s signature
promisor-remote: keep advertised filters in memory
list-objects-filter-options: support 'auto' mode for --filter
doc: fetch: document `--filter=<filter-spec>` option
fetch: make filter_options local to cmd_fetch()
clone: make filter_options local to cmd_clone()
promisor-remote: allow a client to store fields
promisor-remote: refactor initialising field lists
Allow recording process ID of the process that holds the lock next
to a lockfile for diagnosis.
* pc/lockfile-pid:
lockfile: add PID file for debugging stale locks
A previous commit allowed a server to pass additional fields through
the "promisor-remote" protocol capability after the "name" and "url"
fields, specifically the "partialCloneFilter" and "token" fields.
Another previous commit, c213820c51 (promisor-remote: allow a client
to check fields, 2025-09-08), has made it possible for a client to
decide if it accepts a promisor remote advertised by a server based
on these additional fields.
Often though, it would be interesting for the client to just store in
its configuration files these additional fields passed by the server,
so that it can use them when needed.
For example if a token is necessary to access a promisor remote, that
token could be updated frequently only on the server side and then
passed to all the clients through the "promisor-remote" capability,
avoiding the need to update it on all the clients manually.
Storing the token on the client side makes sure that the token is
available when the client needs to access the promisor remotes for a
lazy fetch.
To allow this, let's introduce a new "promisor.storeFields"
configuration variable.
Note that for a partial clone filter, it's less interesting to have
it stored on the client. This is because a filter should be used
right away and we already pass a `--filter=<filter-spec>` option to
`git clone` when starting a partial clone. Storing the filter could
perhaps still be interesting for information purposes.
Like "promisor.checkFields" and "promisor.sendFields", the new
configuration variable should contain a comma or space separated list
of field names. Only the "partialCloneFilter" and "token" field names
are supported for now.
When a server advertises a promisor remote, for example "foo", along
with for example "token=XXXXX" to a client, and on the client side
"promisor.storeFields" contains "token", then the client will store
XXXXX for the "remote.foo.token" variable in its configuration file
and reload its configuration so it can immediately use this new
configuration variable.
A message is emitted on stderr to warn users when the config is
changed.
Note that even if "promisor.acceptFromServer" is set to "all", a
promisor remote has to be already configured on the client side for
some of its config to be changed. In any case no new remote is
configured and no new URL is stored.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow recording process ID of the process that holds the lock next
to a lockfile for diagnosis.
* pc/lockfile-pid:
lockfile: add PID file for debugging stale locks
There are many mentions of commands using inline-verbatim or
emphasis ('). We just mention the command themselves, not specific
invocations like `git am <opts>`. Let’s link to them instead.
There are also many such mentions which then link to the command right
afterwards. Simplify to just using a link.
Also remove “see <gitlink>” phrases where they have now already
been mentioned.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- git-add.adoc: Update the --force documentation for submodule behaviour
to be added even the given configuration ignore=all.
- gitmodules.adoc and config/submodule.adoc: The submodule config
ignore=all now need --force in order to update the index.
Signed-off-by: Claus Schneider(Eficode) <claus.schneider@eficode.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid local submodule repository directory paths overlapping with
each other by encoding submodule names before using them as path
components.
* ar/submodule-gitdir-tweak:
submodule: detect conflicts with existing gitdir configs
submodule: hash the submodule name for the gitdir path
submodule: fix case-folding gitdir filesystem collisions
submodule--helper: fix filesystem collisions by encoding gitdir paths
builtin/credential-store: move is_rfc3986_unreserved to url.[ch]
submodule--helper: add gitdir migration command
submodule: allow runtime enabling extensions.submodulePathConfig
submodule: introduce extensions.submodulePathConfig
builtin/submodule--helper: add gitdir command
submodule: always validate gitdirs inside submodule_name_to_gitdir
submodule--helper: use submodule_name_to_gitdir in add_submodule
Avoid local submodule repository directory paths overlapping with
each other by encoding submodule names before using them as path
components.
* ar/submodule-gitdir-tweak:
submodule: detect conflicts with existing gitdir configs
submodule: hash the submodule name for the gitdir path
submodule: fix case-folding gitdir filesystem collisions
submodule--helper: fix filesystem collisions by encoding gitdir paths
builtin/credential-store: move is_rfc3986_unreserved to url.[ch]
submodule--helper: add gitdir migration command
submodule: allow runtime enabling extensions.submodulePathConfig
submodule: introduce extensions.submodulePathConfig
builtin/submodule--helper: add gitdir command
submodule: always validate gitdirs inside submodule_name_to_gitdir
submodule--helper: use submodule_name_to_gitdir in add_submodule
When a lock file is held, it can be helpful to know which process owns
it, especially when debugging stale locks left behind by crashed
processes. Add an optional feature that creates a companion PID file
alongside each lock file, containing the PID of the lock holder.
For a lock file "foo.lock", the PID file is named "foo~pid.lock". The
tilde character is forbidden in refnames and allowed in Windows
filenames, which guarantees no collision with the refs namespace
(e.g., refs "foo" and "foo~pid" cannot both exist). The file contains
a single line in the format "pid <value>" followed by a newline.
The PID file is created when a lock is acquired (if enabled), and
automatically cleaned up when the lock is released (via commit or
rollback). The file is registered as a tempfile so it gets cleaned up
by signal and atexit handlers if the process terminates abnormally.
When a lock conflict occurs, the code checks for an existing PID file
and, if found, uses kill(pid, 0) to determine if the process is still
running. This allows providing context-aware error messages:
Lock is held by process 12345. Wait for it to finish, or remove
the lock file to continue.
Or for a stale lock:
Lock was held by process 12345, which is no longer running.
Remove the stale lock file to continue.
The feature is controlled via core.lockfilePid configuration (boolean).
Defaults to false. When enabled, PID files are created for all lock
operations.
Existing PID files are always read when displaying lock errors,
regardless of the core.lockfilePid setting. This ensures helpful
diagnostics even when the feature was previously enabled and later
disabled.
Signed-off-by: Paulo Casaretto <pcasaretto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Manually running
"git config submodule.<name>.gitdir .git/modules/<name>"
for each submodule can be impractical, so add a migration command to
submodule--helper to automatically create configs for all submodules
as required by extensions.submodulePathConfig.
The command calls create_default_gitdir_config() which validates the
gitdir paths before adding the configs.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new config `init.defaultSubmodulePathConfig` which allows
enabling `extensions.submodulePathConfig` for new submodules by
default (those created via git init or clone).
Important: setting init.defaultSubmodulePathConfig = true does
not globally enable `extensions.submodulePathConfig`. Existing
repositories will still have the extension disabled and will
require migration (for example via git submodule--helper command
added in the next commit).
Suggested-by: Patrick Steinhardt <ps@pks.im>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea of this extension is to abstract away the submodule gitdir
path implementation: everyone is expected to use the config and not
worry about how the path is computed internally, either in git or
other implementations.
With this extension enabled, the submodule.<name>.gitdir repo config
becomes the single source of truth for all submodule gitdir paths.
The submodule.<name>.gitdir config is added automatically for all new
submodules when this extension is enabled.
Git will throw an error if the extension is enabled and a config is
missing, advising users how to migrate. Migration is manual for now.
E.g. to add a missing config entry for an existing "foo" module:
git config submodule.foo.gitdir .git/modules/foo
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous wording:
> Path expansions are made the same way as for `core.excludesFile`.
required one to check the docs for 'core.excludesFile' and from there
the definition of the pathname variable type to understand the path
expansion behaviour of this variable. Instead, just link directly to the
pathname type.
This change is basically the same rewording as was done to
'core.excludesFile' in dca83abd (config: describe 'pathname' value
type, 2016-04-29).
Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While investigating the config options set by 'scalar' I noticed this
one wasn't documented.
Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both "git apply" and "git diff" learn a new whitespace error class,
"incomplete-line".
* jc/whitespace-incomplete-line:
attr: enable incomplete-line whitespace error for this project
diff: highlight and error out on incomplete lines
apply: check and fix incomplete lines
whitespace: allocate a few more bits and define WS_INCOMPLETE_LINE
apply: revamp the parsing of incomplete lines
diff: update the way rewrite diff handles incomplete lines
diff: call emit_callback ecbdata everywhere
diff: refactor output of incomplete line
diff: keep track of the type of the last line seen
diff: correct suppress_blank_empty hack
diff: emit_line_ws_markup() if/else style fix
whitespace: correct bit assignment comments
"git replay" (experimental) learned to perform ref updates itself
in a transaction by default, instead of emitting where each refs
should point at and leaving the actual update to another command.
* sa/replay-atomic-ref-updates:
replay: add replay.refAction config option
replay: make atomic ref updates the default behavior
replay: use die_for_incompatible_opt2() for option validation
Both "git apply" and "git diff" learn a new whitespace error class,
"incomplete-line".
* jc/whitespace-incomplete-line:
attr: enable incomplete-line whitespace error for this project
diff: highlight and error out on incomplete lines
apply: check and fix incomplete lines
whitespace: allocate a few more bits and define WS_INCOMPLETE_LINE
apply: revamp the parsing of incomplete lines
diff: update the way rewrite diff handles incomplete lines
diff: call emit_callback ecbdata everywhere
diff: refactor output of incomplete line
diff: keep track of the type of the last line seen
diff: correct suppress_blank_empty hack
diff: emit_line_ws_markup() if/else style fix
whitespace: correct bit assignment comments
- Switch the synopsis to a synopsis block which will automatically
format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Switch the synopsis to a synopsis block which will automatically
format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git replay" (experimental) learned to perform ref updates itself
in a transaction by default, instead of emitting where each refs
should point at and leaving the actual update to another command.
* sa/replay-atomic-ref-updates:
replay: add replay.refAction config option
replay: make atomic ref updates the default behavior
replay: use die_for_incompatible_opt2() for option validation
Reserve a few more bits in the diff flags word to be used for future
whitespace rules. Add WS_INCOMPLETE_LINE without implementing the
behaviour (yet).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a configuration variable to control the default behavior of git replay
for updating references. This allows users who prefer the traditional
pipeline output to set it once in their config instead of passing
--ref-action=print with every command.
The config variable uses string values that mirror the behavior modes:
* replay.refAction = update (default): atomic ref updates
* replay.refAction = print: output commands for pipeline
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Elijah Newren <newren@gmail.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git maintenance" command learns the "geometric" strategy where it
avoids doing maintenance tasks that rebuilds everything from
scratch.
* ps/maintenance-geometric:
t7900: fix a flaky test due to git-repack always regenerating MIDX
builtin/maintenance: introduce "geometric" strategy
builtin/maintenance: make "gc" strategy accessible
builtin/maintenance: extend "maintenance.strategy" to manual maintenance
builtin/maintenance: run maintenance tasks depending on type
builtin/maintenance: improve readability of strategies
builtin/maintenance: don't silently ignore invalid strategy
builtin/maintenance: make the geometric factor configurable
builtin/maintenance: introduce "geometric-repack" task
builtin/gc: make `too_many_loose_objects()` reusable without GC config
builtin/gc: remove global `repack` variable
"Symlink symref" has been added to the list of things that will
disappear at Git 3.0 boundary.
* ps/symlink-symref-deprecation:
refs/files: deprecate writing symrefs as symbolic links
We have two different repacking strategies in Git:
- The "gc" strategy uses git-gc(1).
- The "incremental" strategy uses multi-pack indices and `git
multi-pack-index repack` to merge together smaller packfiles as
determined by a specific batch size.
The former strategy is our old and trusted default, whereas the latter
has historically been used for our scheduled maintenance. But both
strategies have their shortcomings:
- The "gc" strategy performs regular all-into-one repacks. Furthermore
it is rather inflexible, as it is not easily possible for a user to
enable or disable specific subtasks.
- The "incremental" strategy is not a full replacement for the "gc"
strategy as it doesn't know to prune stale data.
So today, we don't have a strategy that is well-suited for large repos
while being a full replacement for the "gc" strategy.
Introduce a new "geometric" strategy that aims to fill this gap. This
strategy invokes all the usual cleanup tasks that git-gc(1) does like
pruning reflogs and rerere caches as well as stale worktrees. But where
it differs from both the "gc" and "incremental" strategy is that it uses
our geometric repacking infrastructure exposed by git-repack(1) to
repack packfiles. The advantage of geometric repacking is that we only
need to perform an all-into-one repack when the object count in a repo
has grown significantly.
One downside of this strategy is that pruning of unreferenced objects is
not going to happen regularly anymore. Every geometric repack knows to
soak up all loose objects regardless of their reachability, and merging
two or more packs doesn't consider reachability, either. Consequently,
the number of unreachable objects will grow over time.
This is remedied by doing an all-into-one repack instead of a geometric
repack whenever we determine that the geometric repack would end up
merging all packfiles anyway. This all-into-one repack then performs our
usual reachability checks and writes unreachable objects into a cruft
pack. As cruft packs won't ever be merged during geometric repacks we
can thus phase out these objects over time.
Of course, this still means that we retain unreachable objects for far
longer than with the "gc" strategy. But the maintenance strategy is
intended especially for large repositories, where the basic assumption
is that the set of unreachable objects will be significantly dwarfed by
the number of reachable objects.
If this assumption is ever proven to be too disadvantageous we could for
example introduce a time-based strategy: if the largest packfile has not
been touched for longer than $T, we perform an all-into-one repack. But
for now, such a mechanism is deferred into the future as it is not clear
yet whether it is needed in the first place.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>