Add a new config `hook.forceStdoutToStderr` which allows enabling
extensions.hookStdoutToStderr by default at runtime, both for new
and existing repositories.
This makes it easier for users to enable hook parallelization for
hooks like pre-push by enforcing output consistency. See previous
commit for a more in-depth explanation & alternatives considered.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a hook.<event>.jobs count config that allows users to override the
global hook.jobs setting for specific hook events.
This allows finer-grained control over parallelism on a per-event basis.
For example, to run `post-receive` hooks with up to 4 parallel jobs
while keeping other events at their global default:
[hook]
post-receive.jobs = 4
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hooks always run in sequential order due to the hardcoded jobs == 1
passed to run_process_parallel(). Remove that hardcoding to allow
users to run hooks in parallel (opt-in).
Users need to decide which hooks to run in parallel, by specifying
"parallel = true" in the config, because git cannot know if their
specific hooks are safe to run or not in parallel (for e.g. two hooks
might write to the same file or call the same program).
Some hooks are unsafe to run in parallel by design: these will marked
in the next commit using RUN_HOOKS_OPT_INIT_FORCE_SERIAL.
The hook.jobs config specifies the default number of jobs applied to all
hooks which have parallelism enabled.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hook.jobs config is a global way to set hook parallelization for
all hooks, in the sense that it is not per-event nor per-hook.
Finer-grained configs will be added in later commits which can override
it, for e.g. via a per-event type job options. Next commits will also
add to this item's documentation.
Parse hook.jobs config key in hook_config_lookup_all() and store its
value in hook_all_config_cb.jobs, then transfer it into
hook_config_cache.jobs after the config pass completes.
This is mostly plumbing and the cached value is not yet used.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the raw `struct strmap *hook_config_cache` in `struct repository`
with a `struct hook_config_cache` which wraps the strmap in a named field.
Replace the bare `char *command` util pointer stored in each string_list
item with a heap-allocated `struct hook_config_cache_entry` that carries
that command string.
This is just a refactoring with no behavior changes, to give the cache
struct room to grow so it can carry the additional hook metadata we'll
be adding in the following commits.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the hook.[hc] library to parse configs to populate the list of
hooks to run for a given event.
Multiple commands can be specified for a given hook by providing
"hook.<friendly-name>.command = <path-to-hook>" and
"hook.<friendly-name>.event = <hook-event>" lines.
Hooks will be started in config order of the "hook.<name>.event"
lines and will be run sequentially (.jobs == 1) like before.
Running the hooks in parallel will be enabled in a future patch.
The "traditional" hook from the hookdir is run last, if present.
A strmap cache is added to struct repository to avoid re-reading
the configs on each rook run. This is useful for hooks like the
ref-transaction which gets executed multiple times per process.
Examples:
$ git config --get-regexp "^hook\."
hook.bar.command=~/bar.sh
hook.bar.event=pre-commit
# Will run ~/bar.sh, then .git/hooks/pre-commit
$ git hook run pre-commit
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit introduced an ability to run multiple commands for
hook events and next commit will introduce the ability to define hooks
from configs, in addition to the "traditional" hooks from the hookdir.
Introduce a new command "git hook list" to make inspecting hooks easier
both for users and for the tests we will add.
Further commits will expand on this, e.g. by adding a -z output mode.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hooks are limited to run one command (the default from the hookdir) for
each event. This limitation makes it impossible to run multiple commands
via config files, which the next commits will add.
Implement the ability to run a list of hooks in hook.[ch]. For now, the
list contains only one entry representing the "default" hook from the
hookdir, so there is no user-visible change in this commit.
All hook commands still run sequentially like before. A separate patch
series will enable running them in parallel.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some hooks use opaque structs to keep internal state between callbacks.
Because hooks ran sequentially (jobs == 1) with one command per hook,
these internal states could be allocated on the stack for each hook run.
Next commits add the ability to run multiple commands for each hook, so
the states cannot be shared or stored on the stack anymore, especially
since down the line we will also enable parallel execution (jobs > 1).
Add alloc/free helpers for each hook, doing a "deep" alloc/init & free
of their internal opaque struct.
The alloc callback takes a context pointer, to initialize the struct at
at the time of resource acquisition.
These callbacks must always be provided together: no alloc without free
and no free without alloc, otherwise a BUG() is triggered.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow the API callers to specify the number of jobs across which
hook execution can be parallelized. It defaults to 1 and no hook
currently changes it, so all hooks run sequentially as before.
This allows us to both pave the way for parallel hook execution
(that will be a follow-up patch series building upon this) and to
finish the API conversion of builtin/receive-pack.c, keeping the
output async sideband thread ("muxer") design as Peff suggested.
When .jobs==1 nothing changes, the "copy_to_sideband" async thread
still outputs directly via sideband channel 2, keeping the current
(mostly) real-time output characteristics, avoids unnecessary poll
delays or deadlock risks.
When .jobs > 1, a more complex muxer is needed to buffer the hook
output and avoid interleaving. After working on this mux I quickly
realized I was re-implementing run-command with ungroup=0 so that
idea was dropped in favor of run-command which outputs to stderr.
In other words, run-command itself already can buffer/deinterleave
pp child outputs (ungroup=0), so we can just connect its stderr to
the sideband async task when jobs > 1.
Maybe it helps to illustrate how it works with ascii graphics:
[ Sequential (jobs = 1) ] [ Parallel (jobs > 1) ]
+--------------+ +--------+ +--------+
| Hook Process | | Hook 1 | | Hook 2 |
+--------------+ +--------+ +--------+
| | |
| stderr (inherited) | stderr pipe |
| | (captured) |
v v v
+-------------------------------------------------------------+
| Parent Process |
| |
| (direct write) [run-command (buffered)] |
| | | |
| | | writes |
| v v |
| +-------------------------------------------+ |
| | stderr (FD 2) | |
| +-------------------------------------------+ |
| | |
| | (dup2'd to pipe) |
| v |
| +-----------------------+ |
| | sideband async thread | |
| +-----------------------+ |
+-------------------------------------------------------------+
When use_sideband == 0, the sideband async thread is missing, so
this same architecture just outputs via the parent stderr stream.
See the following commits for the hook API conversions doing this,
using pre-existing sideband thread logic from `copy_to_sideband`.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hook API assumes that all hooks merge stdout to stderr.
This assumption is proven wrong by pre-push: some of its users
actually expect separate stdout and stderr streams and merging
them will cause a regression.
Therefore this adds a mechanism to allow pre-push to separate
the streams, which will be used in the next commit.
The mechanism is generic via struct run_hooks_opt just in case
there are any more surprise exceptions like this.
Reported-by: Chris Darroch <chrisd@apache.org>
Suggested-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds a callback mechanism for feeding stdin to hooks alongside
the existing path_to_stdin (which slurps a file's content to stdin).
The advantage of this new callback is that it can feed stdin without
going through the FS layer. This helps when feeding large amount of
data and uses the run-command parallel stdin callback introduced in
the preceding commit.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit f406b89552,
reversing changes made to 1627809eef.
It seems to have caused a few regressions, two of the three known
ones we have proposed solutions for. Let's give ourselves a bit
more room to maneuver during the pre-release freeze period and
restart once the 2.53 ships.
Some server-side hooks will require capturing output to send over
sideband instead of printing directly to stderr. Expose that capability.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling run_process_parallel() in run_hooks_opt(), the
ungroup option is currently hardcoded to .ungroup = 1.
This causes problems when ungrouping should be disabled, for
example when sideband-reading collated output from child hooks,
because sideband-reading and ungrouping are mutually exclusive.
Thus a new hook.h option is added to allow overriding.
The existing ungroup=1 behavior is preserved in the run_hooks()
API and the "hook run" command. We could modify these to take
an option if necessary, so I added two code comments there.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds a callback mechanism for feeding stdin to hooks alongside
the existing path_to_stdin (which slurps a file's content to stdin).
The advantage of this new callback is that it can feed stdin without
going through the FS layer. This helps when feeding large amount of
data and uses the run-command parallel stdin callback introduced in
the preceding commit.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We implicitly depend on `the_repository` in our hook subsystem because
we use `strbuf_git_path()` to compute hook paths. Remove this dependency
by accepting a `struct repository` as parameter instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some varargs functions that use NULL-terminated parameter list were
missing __attributes__ ((sentinel)) aka LAST_ARG_MUST_BE_NULL.
Add them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the invocation of the 'post-rewrite' hook run by 'git am' to
use the hook.h library. To do this we need to add a "path_to_stdin"
member to "struct run_hooks_opt".
In our API this is supported by asking for a file path, rather
than by reading stdin. Reading directly from stdin would involve caching
the entire stdin (to memory or to disk) once the hook API is made to
support "jobs" larger than 1, along with support for executing N hooks
at a time (i.e. the upcoming config-based hooks).
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a Time-of-check to time-of-use (TOCTOU) race in code added in
680ee550d7 (commit: skip discarding the index if there is no
pre-commit hook, 2017-08-14).
This obscure race condition can occur if we e.g. ran the "pre-commit"
hook and it modified the index, but hook_exists() returns false later
on (e.g., because the hook itself went away, the directory became
unreadable, etc.). Then we won't call discard_cache() when we should
have.
The race condition itself probably doesn't matter, and users would
have been unlikely to run into it in practice. This problem has been
noted on-list when 680ee550d7 was discussed[1], but had not been
fixed.
This change is mainly intended to improve the readability of the code
involved, and to make reasoning about it more straightforward. It
wasn't as obvious what we were trying to do here, but by having an
"invoked_hook" it's clearer that e.g. our discard_cache() is happening
because of the earlier hook execution.
Let's also change this for the push-to-checkout hook. Now instead of
checking if the hook exists and either doing a push to checkout or a
push to deploy we'll always attempt a push to checkout. If the hook
doesn't exist we'll fall back on push to deploy. The same behavior as
before, without the TOCTOU race. See 0855331941 (receive-pack:
support push-to-checkout hook, 2014-12-01) for the introduction of the
previous behavior.
This leaves uses of hook_exists() in two places that matter. The
"reference-transaction" check in refs.c, see 6754159767 (refs:
implement reference transaction hook, 2020-06-19), and the
"prepare-commit-msg" hook, see 66618a50f9 (sequencer: run
'prepare-commit-msg' hook, 2018-01-24).
In both of those cases we're saving ourselves CPU time by not
preparing data for the hook that we'll then do nothing with if we
don't have the hook. So using this "invoked_hook" pattern doesn't make
sense in those cases.
The "reference-transaction" and "prepare-commit-msg" hook also aren't
racy. In those cases we'll skip the hook runs if we race with a new
hook being added, whereas in the TOCTOU races being fixed here we were
incorrectly skipping the required post-hook logic.
1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the running of the 'post-checkout' hook away from run-command.h
to the new hook.h library in builtin/worktree.c. For this special case
we need a change to the hook API to teach it to run the hook from a
given directory.
We cannot skip the "absolute_path" flag and just check if "dir" is
specified as we'd then fail to find our hook in the new dir we'd
chdir() to. We currently don't have a use-case for running a hook not
in our "base" repository at a given absolute path, so let's have "dir"
imply absolute_path(find_hook(hook_name)).
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a run_hooks_l() wrapper, we'll use it in subsequent commits for
the simple cases of wanting to run a single hook under a given name
along with a list of arguments.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a run_hooks() wrapper, we'll use it in subsequent commits for the
simple cases of wanting to run a single hook under a given name,
without providing options such as "env" or "args".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.
Most of our hooks require more complex functionality than this, but
let's start with the bare minimum required to support our simplest
hooks.
In terms of implementation the usage_with_options() and "goto usage"
pattern here mirrors that of
builtin/{commit-graph,multi-pack-index}.c.
Some of the implementation here, such as a function being named
run_hooks_opt() when it's tasked with running one hook, to using the
run_processes_parallel_tr2() API to run with jobs=1 is somewhere
between a bit odd and and an overkill for the current features of this
"hook run" command and the hook.[ch] API.
This code will eventually be able to run multiple hooks declared in
config in parallel, by starting out with these names and APIs we
reduce the later churn of renaming functions, switching from the
run_command() to run_processes_parallel_tr2() API etc.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a boolean version of the find_hook() function for those callers
who are only interested in checking whether the hook exists, not what
the path to it is.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the find_hook() function from run-command.c to a new hook.c
library. This change establishes a stub library that's pretty
pointless right now, but will see much wider use with Emily Shaffer's
upcoming "configuration-based hooks" series.
Eventually all the hook related code will live in hook.[ch]. Let's
start that process by moving the simple find_hook() function over
as-is.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>