A linked worktree's .git file is a "gitfile" pointing at the
.git/worktrees/<id> directory within the repository. When `git init
--separate-git-dir=<path>` is used on an existing repository to relocate
the repository's .git/ directory to a different location, it neglects to
update the .git files of linked worktrees, thus breaking the worktrees
by making it impossible for them to locate the repository. Fix this by
teaching --separate-git-dir to repair the .git file of each linked
worktree to point at the new repository location.
Reported-by: Henré Botha <henrebotha@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The .git/worktrees/<id>/gitdir file points at the location of a linked
worktree's .git file. Its content must be of the form
/path/to/worktree/.git (from which the location of the worktree itself
can be derived by stripping the "/.git" suffix). If the gitdir file is
deleted or becomes corrupted or outdated, then Git will be unable to
find the linked worktree. An easy way for the gitdir file to become
outdated is for the user to move the worktree manually (without using
"git worktree move"). Although it is possible to manually update the
gitdir file to reflect the new linked worktree location, doing so
requires a level of knowledge about worktree internals beyond what a
user should be expected to know offhand.
Therefore, teach "git worktree repair" how to repair broken or outdated
.git/worktrees/<id>/gitdir files automatically. (For this to work, the
command must either be invoked from within the worktree whose gitdir
file requires repair, or from within the main or any linked worktree by
providing the path of the broken worktree as an argument to "git
worktree repair".)
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The .git file in a linked worktree is a "gitfile" which points back to
the .git/worktrees/<id> entry in the main worktree or bare repository.
If a worktree's .git file is deleted or becomes corrupted or outdated,
then the linked worktree won't know how to find the repository or any of
its own administrative files (such as 'index', 'HEAD', etc.). An easy
way for the .git file to become outdated is for the user to move the
main worktree or bare repository. Although it is possible to manually
update each linked worktree's .git file to reflect the new repository
location, doing so requires a level of knowledge about worktree
internals beyond what a user should be expected to know offhand.
Therefore, teach "git worktree repair" how to repair broken or outdated
worktree .git files automatically. (For this to work, the command must
be invoked from within the main worktree or bare repository, or from
within a worktree which has not become disconnected from the repository
-- such as one which was created after the repository was moved.)
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's refactor code adding a new `write_in_file()` function
that opens a file for writing a message and closes it and a
wrapper for writing mode.
This helper will be used in later steps and makes the code
simpler and easier to understand.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following 'enum bisect_error' vocabulary, return variable 'res' is
always non-positive.
Let's use '-res' instead of 'abs(res)' to make the code clearer.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cmd_bisect__helper() function, if an invalid or no
subcommand is passed there is a BUG.
BUG() out instead of returning an error.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a repository has an alternate object directory configured, callers
can traverse through each alternate's MIDX by walking the '->next'
pointer.
But, when 'prepare_multi_pack_index_one()' loads multiple MIDXs, it
places the new ones at the front of this pointer chain, not at the end.
This can be confusing for callers such as 'git repack -ad', causing test
failures like in t7700.6 with 'GIT_TEST_MULTI_PACK_INDEX=1'.
The occurs when dropping a pack known to the local MIDX with alternates
configured that have their own MIDX. Since the alternate's MIDX is
returned via 'get_multi_pack_index()', 'midx_contains_pack()' returns
true (which is correct, since it traverses through the '->next' pointer
to find the MIDX in the chain that does contain the requested object).
But, we call 'clear_midx_file()' on 'the_repository', which drops the
MIDX at the path of the first MIDX in the chain, which (in the case of
t7700.6 is the one in the alternate).
This patch addresses that by:
- placing the local MIDX first in the chain when calling
'prepare_multi_pack_index_one()', and
- introducing a new 'get_local_multi_pack_index()', which explicitly
returns the repository-local MIDX, if any.
Don't impose an additional order on the MIDX's '->next' pointer beyond
that the first item in the chain must be local if one exists so that we
avoid a quadratic insertion.
Likewise, use 'get_local_multi_pack_index()' in
'remove_redundant_pack()' to fix the formerly broken t7700.6 when run
with 'GIT_TEST_MULTI_PACK_INDEX=1'.
Finally, note that the MIDX ordering invariant is only preserved by the
insertion order in 'prepare_packed_git()', which traverses through the
ODB's '->next' pointer, meaning we visit the local object store first.
This fragility makes this an undesirable long-term solution if more
callers are added, but it is acceptable for now since this is the only
caller.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The positional arguments are specified in this order: "bad" then "good".
To avoid confusion, the options above the positional arguments
are now specified in the same order. They can still be specified in any
order since they're options, not positional arguments.
Signed-off-by: Hugo Locurcio <hugo.locurcio@hugo.pro>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* jk/leakfix:
submodule--helper: fix leak of core.worktree value
config: fix leak in git_config_get_expiry_in_days()
config: drop git_config_get_string_const()
config: fix leaks from git_config_get_string_const()
checkout: fix leak of non-existent branch names
submodule--helper: use strbuf_release() to free strbufs
clear_pattern_list(): clear embedded hashmaps
Add a new multi-valued config variable "receive.procReceiveRefs"
for `receive-pack` command, like the follows:
git config --system --add receive.procReceiveRefs refs/for
git config --system --add receive.procReceiveRefs refs/drafts
If the specific prefix strings given by the config variables match the
reference names of the commands which are sent from git client to
`receive-pack`, these commands will be executed by an external hook
(named "proc-receive"), instead of the internal `execute_commands`
function.
For example, if it is set to "refs/for", pushing to a reference such as
"refs/for/master" will not create or update reference "refs/for/master",
but may create or update a pull request directly by running the hook
"proc-receive".
Optional modifiers can be provided in the beginning of the value to
filter commands for specific actions: create (a), modify (m),
delete (d). A `!` can be included in the modifiers to negate the
reference prefix entry. E.g.:
git config --system --add receive.procReceiveRefs ad:refs/heads
git config --system --add receive.procReceiveRefs !:refs/heads
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new introduced "proc-receive" hook may handle a command for a
pseudo-reference with a zero-old as its old-oid, while the hook may
create or update a reference with different name, different new-oid,
and different old-oid (the reference may exist already with a non-zero
old-oid). Current "report-status" protocol cannot report the status for
such reference rewrite.
Add new capability "report-status-v2" and new report protocol which is
not backward compatible for report of git-push.
If a user pushes to a pseudo-reference "refs/for/master/topic", and
"receive-pack" creates two new references "refs/changes/23/123/1" and
"refs/changes/24/124/1", for client without the knowledge of
"report-status-v2", "receive-pack" will only send "ok/ng" directives in
the report, such as:
ok ref/for/master/topic
But for client which has the knowledge of "report-status-v2",
"receive-pack" will use "option" directives to report more attributes
for the reference given by the above "ok/ng" directive.
ok refs/for/master/topic
option refname refs/changes/23/123/1
option new-oid <new-oid>
ok refs/for/master/topic
option refname refs/changes/24/124/1
option new-oid <new-oid>
The client will report two new created references to the end user.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commands are fed to the "post-receive" hook, report options will
be parsed and the real old-oid, new-oid, reference name will feed to
the "post-receive" hook.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git calls an internal `execute_commands` function to handle commands
sent from client to `git-receive-pack`. Regardless of what references
the user pushes, git creates or updates the corresponding references if
the user has write-permission. A contributor who has no
write-permission, cannot push to the repository directly. So, the
contributor has to write commits to an alternate location, and sends
pull request by emails or by other ways. We call this workflow as a
distributed workflow.
It would be more convenient to work in a centralized workflow like what
Gerrit provided for some cases. For example, a read-only user who
cannot push to a branch directly can run the following `git push`
command to push commits to a pseudo reference (has a prefix "refs/for/",
not "refs/heads/") to create a code review.
git push origin \
HEAD:refs/for/<branch-name>/<session>
The `<branch-name>` in the above example can be as simple as "master",
or a more complicated branch name like "foo/bar". The `<session>` in
the above example command can be the local branch name of the client
side, such as "my/topic".
We cannot implement a centralized workflow elegantly by using
"pre-receive" + "post-receive", because Git will call the internal
function "execute_commands" to create references (even the special
pseudo reference) between these two hooks. Even though we can delete
the temporarily created pseudo reference via the "post-receive" hook,
having a temporary reference is not safe for concurrent pushes.
So, add a filter and a new handler to support this kind of workflow.
The filter will check the prefix of the reference name, and if the
command has a special reference name, the filter will turn a specific
field (`run_proc_receive`) on for the command. Commands with this filed
turned on will be executed by a new handler (a hook named
"proc-receive") instead of the internal `execute_commands` function.
We can use this "proc-receive" command to create pull requests or send
emails for code review.
Suggested by Junio, this "proc-receive" hook reads the commands,
push-options (optional), and send result using a protocol in pkt-line
format. In the following example, the letter "S" stands for
"receive-pack" and letter "H" stands for the hook.
# Version and features negotiation.
S: PKT-LINE(version=1\0push-options atomic...)
S: flush-pkt
H: PKT-LINE(version=1\0push-options...)
H: flush-pkt
# Send commands from server to the hook.
S: PKT-LINE(<old-oid> <new-oid> <ref>)
S: ... ...
S: flush-pkt
# Send push-options only if the 'push-options' feature is enabled.
S: PKT-LINE(push-option)
S: ... ...
S: flush-pkt
# Receive result from the hook.
# OK, run this command successfully.
H: PKT-LINE(ok <ref>)
# NO, I reject it.
H: PKT-LINE(ng <ref> <reason>)
# Fall through, let 'receive-pack' to execute it.
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option fall-through)
# OK, but has an alternate reference. The alternate reference name
# and other status can be given in options
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option refname <refname>)
H: PKT-LINE(option old-oid <old-oid>)
H: PKT-LINE(option new-oid <new-oid>)
H: PKT-LINE(option forced-update)
H: ... ...
H: flush-pkt
After receiving a command, the hook will execute the command, and may
create/update different reference. For example, a command for a pseudo
reference "refs/for/master/topic" may create/update different reference
such as "refs/pull/123/head". The alternate reference name and other
status are given in option lines.
The list of commands returned from "proc-receive" will replace the
relevant commands that are sent from user to "receive-pack", and
"receive-pack" will continue to run the "execute_commands" function and
other routines. Finally, the result of the execution of these commands
will be reported to end user.
The reporting function from "receive-pack" to "send-pack" will be
extended in latter commit just like what the "proc-receive" hook reports
to "receive-pack".
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'grep' check in test 4 of t7421 resulted in the failure of t7421 on
Windows due to a different error message
error: cannot spawn git: No such file or directory
instead of
fatal: exec 'rev-parse': cd to 'my-subm' failed: No such file or directory
Tighten up the check to compute 'src_abbrev' by guarding the
'verify_submodule_committish()' call using `p->status !='D'`, so that
the former isn't called in case of non-existent submodule directory,
consequently, there is no such error message on any execution
environment. The same need not be implemented for 'dst_abbrev' and is
rather redundant since the conditional 'if (S_ISGITLINK(p->mod_dst))'
already guards the 'verify_submodule_committish()' when we have a
status of 'D'.
Therefore, eliminate the 'grep' check in t7421. Instead, verify the
absence of an error message by doing a 'test_must_be_empty' on the
file containing the error.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Worktree administrative files can become corrupted or outdated due to
external factors. Although, it is often possible to recover from such
situations by hand-tweaking these files, doing so requires intimate
knowledge of worktree internals. While information necessary to make
such repairs manually can be obtained from git-worktree.txt and
gitrepository-layout.txt, we can assist users more directly by teaching
git-worktree how to repair its administrative files itself (at least to
some extent). Therefore, add a "git worktree repair" command which
attempts to correct common problems which may arise due to factors
beyond Git's control.
At this stage, the "repair" command is a mere skeleton; subsequent
commits will flesh out the functionality.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allocate a child_env strvec but never free its memory. Instead, let's
just use the strvec that our child_process struct provides, which is
cleaned up automatically when we run the command.
And while we're moving the initialization of the child_process around,
let's switch it to use the official init function (zero-initializing it
works OK, since strvec is happy enough with that, but it sets a bad
example).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 525e18c04b (midx: clear midx on repack, 2018-07-12), 'git repack'
learned to remove a multi-pack-index file if it added or removed a pack
from the object store.
This mechanism is a little over-eager, since it is only necessary to
drop a MIDX if 'git repack' removes a pack that the MIDX references.
Adding a pack outside of the MIDX does not require invalidating the
MIDX, and likewise for removing a pack the MIDX does not know about.
Teach 'git repack' to check for this by loading the MIDX, and checking
whether the to-be-removed pack is known to the MIDX. This requires a
slightly odd alternation to a test in t5319, which is explained with a
comment. A new test is added to show that the MIDX is left alone when
both packs known to it are marked as .keep, but two packs unknown to it
are removed and combined into one new pack.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The definitions of 'verify_submodule_committish()' and
'print_submodule_summary()' had wrong styling in terms of the asterisk
placement. Amend them.
Also, the warning printed in case of an unexpected file mode printed the
mode in decimal. Print it in octal for enhanced readability.
Reported-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The purpose of "git init --separate-git-dir" is to initialize a
new project with the repository separate from the working tree,
or, in the case of an existing project, to move the repository
(the .git/ directory) out of the working tree. It does not make
sense to use --separate-git-dir with a bare repository for which
there is no working tree, so disallow its use with bare
repositories.
* es/init-no-separate-git-dir-in-bare:
init: disallow --separate-git-dir with bare repository
A subsequent commit will make the quantum of work smaller, necessitating
more locking. This commit allows resolve_delta() to be called outside
the lock.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is refactoring 2 of 2 to simplify struct base_data.
Whenever we make a struct base_data, immediately calculate its delta
children. This eliminates confusion as to when the
{ref,ofs}_{first,last} fields are initialized.
Before this patch, the delta children were calculated at the last
possible moment. This allowed the members of struct base_data to be
populated in any order, superficially useful when we have the object
contents before the struct object_entry. But this makes reasoning about
the state of struct base_data more complicated, hence this patch.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is refactoring 1 of 2 to simplify struct base_data.
In index-pack, each thread maintains a doubly-linked list of the delta
chain that it is currently processing (the "base" and "child" pointers
in struct base_data). When a thread exceeds the delta base cache limit
and needs to reclaim memory, it uses the "child" pointers to traverse
the lineage, reclaiming the memory of the eldest delta bases first.
A subsequent patch will perform memory reclaiming in a different way and
will thus no longer need the "child" pointer. Because the "child"
pointer is redundant even now, remove it so that the aforementioned
subsequent patch will be clearer. In the meantime, reclaim memory in the
reverse order of the "base" pointers.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
find_{ref,ofs}_delta_{,children} take an enum object_type parameter, but
the object type is already present in the name of the function. Remove
that parameter from these functions.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pathspec given to git checkout and git restore is used with both
tree_entry_interesting (via read_tree_recursive) and match_pathspec
(via ce_path_match). The latter effectively only supports recursive
matching regardless of the value of the pathspec flag "recursive",
which is unset here.
That causes different match results for pathspecs with wildcards, and
can lead checkout and restore in no-overlay mode to remove entries
instead of modifying them. Enable recursive matching for both checkout
and restore to make matching consistent.
Setting the flag in checkout_main() technically also affects git switch,
but since that command doesn't accept pathspecs at all this has no
actual consequence.
Reported-by: Sergii Shkarnikov <sergii.shkarnikov@globallogic.com>
Initial-test-by: Sergii Shkarnikov <sergii.shkarnikov@globallogic.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b8a2486f15 (index-pack: support multithreaded delta resolving,
2012-05-06) describes an experiment that shows that setting the number
of threads for index-pack higher than 3 does not help.
I repeated that experiment using a more modern version of Git and a more
modern CPU and got different results.
Here are timings for p5302 against linux.git run on my laptop, a Core
i9-9880H with 8 cores plus hyperthreading (so online-cpus returns 16):
5302.3: index-pack 0 threads 256.28(253.41+2.79)
5302.4: index-pack 1 threads 257.03(254.03+2.91)
5302.5: index-pack 2 threads 149.39(268.34+3.06)
5302.6: index-pack 4 threads 94.96(294.10+3.23)
5302.7: index-pack 8 threads 68.12(339.26+3.89)
5302.8: index-pack 16 threads 70.90(655.03+7.21)
5302.9: index-pack default number of threads 116.91(290.05+3.21)
You can see that wall-clock times continue to improve dramatically up to
the number of cores, but bumping beyond that (into hyperthreading
territory) does not help (and in fact hurts a little).
Here's the same experiment on a machine with dual Xeon 6230's, totaling
40 cores (80 with hyperthreading):
5302.3: index-pack 0 threads 310.04(302.73+6.90)
5302.4: index-pack 1 threads 310.55(302.68+7.40)
5302.5: index-pack 2 threads 178.17(304.89+8.20)
5302.6: index-pack 5 threads 99.53(315.54+9.56)
5302.7: index-pack 10 threads 72.80(327.37+12.79)
5302.8: index-pack 20 threads 60.68(357.74+21.66)
5302.9: index-pack 40 threads 58.07(454.44+67.96)
5302.10: index-pack 80 threads 59.81(720.45+334.52)
5302.11: index-pack default number of threads 134.18(309.32+7.98)
The results are similar; things stop improving at 40 threads. Curiously,
going from 20 to 40 really doesn't help much, either (and increases CPU
time considerably). So that may represent an actual barrier to
parallelism, where we lose out due to context-switching and loss of
cache locality, but don't reap the wall-clock benefits due to contention
of our coarse-grained locks.
So what's a good default value? It's clear that the current cap of 3 is
too low; our default values are 42% and 57% slower than the best times
on each machine. The results on the 40-core machine imply that 20
threads is an actual barrier regardless of the number of cores, so we'll
take that as a maximum. We get the best results on these machines at
half of the online-cpus value. That's presumably a result of the
hyperthreading. That's common on multi-core Intel processors, but not
necessarily elsewhere. But if we take it as an assumption, we can
perform optimally on hyperthreaded machines and still do much better
than the status quo on other machines, as long as we never half below
the current value of 3.
So that's what this patch does.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pseudorefs move to a different ref storage mechanism, pseudorefs no longer
can be removed with 'rm'. Instead, suggest a "update-ref -d" command, which will
work regardless of ref storage backend.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check for existence and delete CHERRY_PICK_HEAD through ref functions.
This will help cherry-pick work with alternate ref storage backends.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few end-user facing messages have been updated to be
hash-algorithm agnostic.
* jc/object-names-are-not-sha-1:
messages: avoid SHA-1 in end-user facing messages
The previous commit introduced --ignore-date flag to rebase -i, but the
name is rather vague as it does not say whether the author date or the
committer date is ignored. Add an alias to convey the precise purpose.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rebase is implemented with two different backends - 'apply' and
'merge' each of which support a different set of options. In
particular the apply backend supports a number of options implemented
by 'git am' that are not implemented in the merge backend. This means
that the available options are different depending on which backend is
used which is confusing. This patch adds support for the --ignore-date
option to the merge backend. This option uses the current time as the
author date rather than reusing the original author date when
rewriting commits. We take care to handle the combination of
--ignore-date and --committer-date-is-author-date in the same way as
the apply backend.
Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The dir structure seemed to have a number of leaks and problems around
it. First I noticed that parent_hashmap and recursive_hashmap were
being leaked (though Peff noticed and submitted fixes before me). Then
I noticed in the previous commit that clear_directory() was only taking
responsibility for a subset of fields within dir_struct, despite the
fact that entries[] and ignored[] we allocated internally to dir.c.
That, of course, resulted in many callers either leaking or haphazardly
trying to free these arrays and their contents.
Digging further, I found that despite the pretty clear documentation
near the top of dir.h that folks were supposed to call clear_directory()
when the user no longer needed the dir_struct, there were four callers
that didn't bother doing that at all. However, two of them clearly
thought about leaks since they had an UNLEAK(dir) directive, which to me
suggests that the method to free the data was too unclear. I suspect
the non-obviousness of the API and its holes led folks to avoid it,
which then snowballed into further problems with the entries[],
ignored[], parent_hashmap, and recursive_hashmap problems.
Rename clear_directory() to dir_clear() to be more in line with other
data structures in git, and introduce a dir_init() to handle the
suggested memsetting of dir_struct to all zeroes. I hope that a name
like "dir_clear()" is more clear, and that the presence of dir_init()
will provide a hint to those looking at the code that they need to look
for either a dir_clear() or a dir_free() and lead them to find
dir_clear().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The calling convention for the dir API is supposed to end with a call to
clear_directory() to free up no longer needed memory. However,
clear_directory() didn't free dir->entries or dir->ignored. I believe
this was an oversight, but a number of callers noticed memory leaks and
started free'ing these. Unfortunately, they did so somewhat haphazardly
(sometimes freeing the entries in the arrays, and sometimes only
free'ing the arrays themselves). This suggests the callers weren't
trying to make sure any possible memory used might be free'd, but just
the memory they noticed their usecase definitely had allocated.
Fix this mess by moving all the duplicated free'ing logic into
clear_directory(). End by resetting dir to a pristine state so it could
be reused if desired.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that Git has switched to using a subprocess to lazy-fetch missing
objects, remove the no_dependents code as it is no longer used.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "fetch", get_ref_map() iterates over all refs to populate
"existing_refs" in order to populate peer_ref->old_oid in the returned
refmap, even if the refmap has no peer_ref set - which is the case when
only literal hashes (i.e. no refs by name) are fetched.
Iterating over refs causes the targets of those refs to be checked for
existence. Avoiding this is especially important when we use "git fetch"
to perform lazy fetches in a partial clone because a target of such a
ref may need to be itself lazy-fetched (and otherwise causing an
infinite loop).
Therefore, avoid populating "existing_refs" until necessary. With this
patch, because Git lazy-fetches objects by literal hashes (to be done in
a subsequent commit), it will then be able to guarantee avoiding reading
targets of refs.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "fetch", there are two parameters submodule_fetch_jobs_config and
recurse_submodules that can be set in a variety of ways: through
.gitmodules, through .git/config, and through the command line.
Currently "fetch" handles this by first reading .gitmodules, then
reading .git/config (allowing it to overwrite existing values), then
reading the command line (allowing it to overwrite existing values).
Notice that we can avoid reading .gitmodules if .git/config and/or the
command line already provides us with what we need. In addition, if
recurse_submodules is found to be "no", we do not need the value of
submodule_fetch_jobs_config.
Avoiding reading .gitmodules is especially important when we use "git
fetch" to perform lazy fetches in a partial clone because the
.gitmodules file itself might need to be lazy fetched (and otherwise
causing an infinite loop).
In light of all this, avoid reading .gitmodules until necessary. When
reading it, we may only need one of the two parameters it provides, so
teach fetch_config_from_gitmodules() to support NULL arguments. With
this patch, users (including Git itself when invoking "git fetch" to
lazy-fetch) will be able to guarantee avoiding reading .gitmodules by
passing --recurse-submodules=no.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a subsequent patch, partial clones will be taught to fetch missing
objects using a "git fetch" subprocess. Because the number of objects
fetched may be too numerous to fit on the command line, teach "fetch" to
accept refspecs passed through stdin.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you run fetch but record the result in remote-tracking branches,
and either if you do nothing with the fetched refs (e.g. you are
merely mirroring) or if you always work from the remote-tracking
refs (e.g. you fetch and then merge origin/branchname separately),
you can get away with having no FETCH_HEAD at all.
Teach "git fetch" a command line option "--[no-]write-fetch-head".
The default is to write FETCH_HEAD, and the option is primarily
meant to be used with the "--no-" prefix to override this default,
because there is no matching fetch.writeFetchHEAD configuration
variable to flip the default to off (in which case, the positive
form may become necessary to defeat it).
Note that under "--dry-run" mode, FETCH_HEAD is never written;
otherwise you'd see list of objects in the file that you do not
actually have. Passing `--write-fetch-head` does not force `git
fetch` to write the file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log --first-parent -p" showed patches only for single-parent
commits on the first-parent chain; the "--first-parent" option has
been made to imply "-m". Use "--no-diff-merges" to restore the
previous behaviour to omit patches for merge commits.
* jk/log-fp-implies-m:
doc/git-log: clarify handling of merge commit diffs
doc/git-log: move "-t" into diff-options list
doc/git-log: drop "-r" diff option
doc/git-log: move "Diff Formatting" from rev-list-options
log: enable "-m" automatically with "--first-parent"
revision: add "--no-diff-merges" option to counteract "-m"
log: drop "--cc implies -m" logic
"git bisect" learns the "--first-parent" option to find the first
breakage along the first-parent chain.
* al/bisect-first-parent:
bisect: combine args passed to find_bisection()
bisect: introduce first-parent flag
cmd_bisect__helper: defer parsing no-checkout flag
rev-list: allow bisect and first-parent flags
t6030: modernize "git bisect run" tests
In the ensure_core_worktree() function, we load the core.worktree value
of the submodule repository using repo_config_get_string(). This
function copies the string, but we never free it, leaking the memory.
We can instead use the "tmp" version of that function to avoid the
allocation at all. We don't have to worry about lifetime issues, since
we never even look at the value (we just want to know if it's set).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rebase is implemented with two different backends - 'apply' and
'merge' each of which support a different set of options. In
particular the apply backend supports a number of options implemented
by 'git am' that are not implemented in the merge backend. This means
that the available options are different depending on which backend is
used which is confusing. This patch adds support for the
--committer-date-is-author-date option to the merge backend. This
option uses the author date of the commit that is being rewritten as
the committer date when the new commit is created.
Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The implementation of --committer-date-is-author-date exports
GIT_COMMITTER_DATE to override the default committer date but does not
reset GIT_COMMITTER_DATE in the environment after creating the commit
so it is set in the environment of any hooks that get run. We're about
to add the same functionality to the sequencer and do not want to have
GIT_COMMITTER_DATE set when running hooks or exec commands so lets
update commit_tree_extended() to take an explicit committer so we
override the default date without setting GIT_COMMITTER_DATE in the
environment.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of functions that used struct refspec_item did not zero out the
structure memory. This can result in unexpected behavior, especially if
additional parameters are ever added to refspec_item in the future. Use
memset to ensure that unset structure members are zero.
It may make sense to convert most of these uses of struct refspec_item
to use either struct initializers or refspec_item_init_or_die. However,
other similar code uses memset. Converting all of these uses has been
left as a future exercise.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two functions to get a single config string:
- git_config_get_string()
- git_config_get_string_const()
One might naively think that the first one allocates a new string and
the second one just points us to the internal configset storage. But
in fact they both allocate a new copy; the second one exists only to
avoid having to cast when using it with a const global which we never
intend to free.
The documentation for the function explains that clearly, but it seems
I'm not alone in being surprised by this. Of 17 calls to the function,
13 of them leak the resulting value.
We could obviously fix these by adding the appropriate free(). But it
would be simpler still if we actually had a non-allocating way to get
the string. There's git_config_get_value() but that doesn't quite do
what we want. If the config key is present but is a boolean with no
value (e.g., "[foo]bar" in the file), then we'll get NULL (whereas the
string versions will print an error and die).
So let's introduce a new variant, git_config_get_string_tmp(), that
behaves as these callers expect. We need a new name because we have new
semantics but the same function signature (so even if we converted the
four remaining callers, topics in flight might be surprised). The "tmp"
is because this value should only be held onto for a short time. In
practice it's rare for us to clear and refresh the configset,
invalidating the pointer, but hopefully the "tmp" makes callers think
about the lifetime. In each of the converted cases here the value only
needs to last within the local function or its immediate caller.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We unconditionally write a branch name into a newly allocated buffer in
new_branch_info->path, via setup_branch_path(). We then check to see if
the branch exists; if not, we set that field to NULL, leaking the
memory. We should take care to free() it when doing so.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>