Instead of hardcoded 'origin', use the configured default remote
when fetching from submodules.
Ejected from 'seen' due to test flakiness.
cf. <2e62dc94-b821-4815-8dd2-f806580d2027@ramsayjones.plus.com>
* ng/submodule-default-remote:
submodule: fetch missing objects from default remote
The core.attributesfile is intended to be set per repository, but
was kept track of by a single global variable in-core, which has
been corrected by moving it to a per-repository data structure.
* ob/core-attributesfile-in-repository:
environment: move "branch.autoSetupMerge" into `struct repo_config_values`
environment: stop using core.sparseCheckout globally
environment: stop storing `core.attributesFile` globally
When be76c21282 (fetch: ensure submodule objects fetched, 2018-12-06)
added support for fetching a missing submodule object by id, it
hardcoded the remote name as "origin" and deferred anything more
complicated for a later patch. Implement the NEEDSWORK item to remove
the hardcoded assumption by adding and using a submodule helper subcmd
'get-default-remote'. Fixing this lets 'git fetch --recurse-submodules'
succeed when the fetched commit(s) in the superproject trigger a
submodule fetch, and that submodule's default remote name is not
"origin".
Add non-"origin" remote tests to t5526-fetch-submodules.sh and
t5572-pull-submodule.sh demonstrating this works as expected and add
dedicated tests for get-default-remote.
Signed-off-by: Nasser Grainawi <nasser.grainawi@oss.qualcomm.com>
Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The config value `branch.autoSetupMerge` is parsed in
`git_default_branch_config()` and stored in the global variable
`git_branch_track`. This global variable can be overwritten
by another repository when multiple Git repos run in the same process.
Move this value into `struct repo_config_values` in the_repository to
retain current behaviours and move towards libifying Git.
Since the variable is no longer a global variable, it has been renamed to
`branch_track` in the struct `repo_config_values`.
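A minimal sketch of the idea, under stated assumptions: the type names,
enum values and struct layout below are hypothetical stand-ins and not
Git's actual definitions. It only illustrates that the value lives in a
per-repository struct, so setting it for one repository cannot clobber
another one in the same process.

```c
#include <stddef.h>

/* Hypothetical stand-in for the tracking modes; names are illustrative. */
enum branch_track_sketch {
	TRACK_UNSPECIFIED = -1,
	TRACK_NEVER = 0,
	TRACK_REMOTE,
	TRACK_ALWAYS
};

/* Simplified per-repository configuration values. */
struct repo_config_values_sketch {
	enum branch_track_sketch branch_track;
};

/* Simplified repository that owns its own configuration values. */
struct repository_sketch {
	struct repo_config_values_sketch config_values;
};

/* Storing the parsed value touches only the repository passed in. */
static void set_branch_track(struct repository_sketch *repo,
			     enum branch_track_sketch value)
{
	repo->config_values.branch_track = value;
}
```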
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid local submodule repository directory paths overlapping with
each other by encoding submodule names before using them as path
components.
* ar/submodule-gitdir-tweak:
submodule: detect conflicts with existing gitdir configs
submodule: hash the submodule name for the gitdir path
submodule: fix case-folding gitdir filesystem collisions
submodule--helper: fix filesystem collisions by encoding gitdir paths
builtin/credential-store: move is_rfc3986_unreserved to url.[ch]
submodule--helper: add gitdir migration command
submodule: allow runtime enabling extensions.submodulePathConfig
submodule: introduce extensions.submodulePathConfig
builtin/submodule--helper: add gitdir command
submodule: always validate gitdirs inside submodule_name_to_gitdir
submodule--helper: use submodule_name_to_gitdir in add_submodule
If none of the previous plain-text / encoding / derivation steps work
and case 2.4 is reached, then try a hash of the submodule name to see
if that can be a valid gitdir before giving up and throwing an error.
This is a "last resort" type of measure to avoid conflicts since it
loses the human readability of the gitdir path. This logic will be
reached in rare cases, as can be seen in the test we added.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new check when extensions.submodulePathConfig is enabled, to
detect and prevent case-folding filesystem collisions. When this
new check is triggered, a stricter case-folding-aware URI encoding
is used to percent-encode uppercase characters.
By using this check/retry mechanism the uppercase encoding is
only applied when necessary, so case-sensitive filesystems are
not affected.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix nested filesystem collisions by url-encoding gitdir paths stored
in submodule.%s.gitdir, when extensions.submodulePathConfig is enabled.
Credit goes to Junio and Patrick for coming up with this design: the
encoding is only applied when necessary, to newly added submodules.
Existing modules don't need the encoding because git already errors
out when detecting nested gitdirs before this patch.
This commit adds the basic url-encoding and some tests. Next commits
extend the encode -> validate -> retry loop to fix more conflicts.
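The encoding step can be sketched roughly as follows. This is an
illustration only, not the code in this series: encode_name() and its
encode_upper flag are hypothetical names, and the unreserved set is the
one from RFC 3986. With encode_upper set, uppercase ASCII letters are
percent-encoded as well, which mirrors the stricter variant used for
case-folding filesystems.

```c
#include <stddef.h>
#include <stdio.h>

/* RFC 3986 "unreserved" characters: ALPHA / DIGIT / "-" / "." / "_" / "~". */
static int is_unreserved(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
	       (c >= '0' && c <= '9') ||
	       c == '-' || c == '.' || c == '_' || c == '~';
}

/*
 * Percent-encode "name" into "out" (capacity "outlen", including the NUL).
 * When "encode_upper" is non-zero, uppercase ASCII letters are encoded too.
 * Returns 0 on success, -1 if "out" is too small.
 */
static int encode_name(const char *name, int encode_upper,
		       char *out, size_t outlen)
{
	size_t pos = 0;

	for (; *name; name++) {
		unsigned char c = (unsigned char)*name;
		int keep = is_unreserved(c) &&
			   !(encode_upper && c >= 'A' && c <= 'Z');

		if (keep) {
			if (pos + 1 >= outlen)
				return -1;
			out[pos++] = (char)c;
		} else {
			/* "%XX" needs three bytes plus room for the NUL. */
			if (pos + 3 >= outlen)
				return -1;
			snprintf(out + pos, 4, "%%%02X", (unsigned)c);
			pos += 3;
		}
	}
	if (pos >= outlen)
		return -1;
	out[pos] = '\0';
	return 0;
}
```

With this sketch a name such as "foo/bar" becomes "foo%2Fbar", so it can
no longer nest inside the gitdir of a submodule named "foo".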
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Manually running
"git config submodule.<name>.gitdir .git/modules/<name>"
for each submodule can be impractical, so add a migration command to
submodule--helper to automatically create configs for all submodules
as required by extensions.submodulePathConfig.
The command calls create_default_gitdir_config() which validates the
gitdir paths before adding the configs.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea of this extension is to abstract away the submodule gitdir
path implementation: everyone is expected to use the config and not
worry about how the path is computed internally, either in git or
other implementations.
With this extension enabled, the submodule.<name>.gitdir repo config
becomes the single source of truth for all submodule gitdir paths.
The submodule.<name>.gitdir config is added automatically for all new
submodules when this extension is enabled.
Git will throw an error if the extension is enabled and a config is
missing, advising users how to migrate. Migration is manual for now.
E.g. to add a missing config entry for an existing "foo" module:
git config submodule.foo.gitdir .git/modules/foo
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This exposes the gitdir name computed by submodule_name_to_gitdir()
internally, to make it easier for users and tests to interact with it.
Next commit will add a gitdir configuration, so this helper can also be
used to easily query that config or validate any gitdir path the user
sets (submodule_name_to_gitdir() has run the validation logic since
the previous commit).
Based-on-patch-by: Brandon Williams <bwilliams.eng@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the ad-hoc validation checks sprinkled across the source tree,
after calling submodule_name_to_gitdir() into the function proper,
which now always validates the gitdir before returning it.
This simplifies the API and helps to:
1. Avoid redundant validation calls after submodule_name_to_gitdir().
2. Avoid the risk of callers forgetting to validate.
3. Ensure gitdir paths provided by users via configs are always valid
(config gitdir paths are added in a subsequent commit).
The validation function can still be called as many times as needed
outside submodule_name_to_gitdir(), for example we keep two calls
which are still required, to avoid parallel clone races by re-running
the validation in builtin/submodule--helper.c.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While testing submodule gitdir path encoding, I noticed submodule--helper
is still using a hardcoded modules gitdir path leading to test failures.
Call the submodule_name_to_gitdir() helper instead, which was invented
exactly for this purpose and is already used by all the other locations
which work on gitdirs.
Also narrow the scope of the submod_gitdir_path variable which is not
used anymore in the updated "else" branch.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git submodule add" to add a submodule under <name> segfaulted,
when a submodule.<name>.something is already in .gitmodules file
without defining where its submodule.<name>.path is, which has been
corrected.
* jc/submodule-add:
submodule add: sanity check existing .gitmodules
"git submodule add" tries to find if a submodule with the same name
already exists at a different path, by looking up an entry in the
.gitmodules file. If the entry in the file is incomplete, e.g.,
when the submodule.<name>.something variable is defined but there is
no definition of submodule.<name>.path variable, it accesses the
missing .path member of the submodule structure and triggers a
segfault.
A brief audit was done to make sure that the code does not assume
members other than those that are absolutely certain to exist: a
submodule obtained by submodule_from_name() should have .name
member, while a submodule obtained by submodule_from_path() should
also have .path as well as .name member, and we cannot assume
anything else. Luckily, the module_add() codepath was the only
problematic one. It is fairly recent code that comes from 1fa06ced
(submodule: prevent overwriting .gitmodules on path reuse,
2025-07-24).
A helper used by update_submodule() seems to assume that its call to
submodule_from_path() always yields a submodule object without a
failure, which seems to rely on the caller making sure it is the
case. Leave an assert() with a NEEDSWORK comment there for future
developers to make sure the assumption actually holds.
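The kind of guard involved can be sketched like this. The struct and
function names are hypothetical simplifications, and treating an
incomplete entry as "not in use elsewhere" is an assumption made for the
sketch, not necessarily what module_add() does; the point is only that
the .path member is checked before it is dereferenced.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified submodule entry. The path may be NULL when
 * the .gitmodules entry is incomplete.
 */
struct submodule_sketch {
	const char *name;
	const char *path;
};

/*
 * Return 1 when "existing" is a complete entry that lives at a path
 * other than "new_path", 0 otherwise.
 */
static int name_used_at_other_path(const struct submodule_sketch *existing,
				   const char *new_path)
{
	if (!existing || !existing->path)
		return 0; /* nothing to compare against; do not dereference */
	return strcmp(existing->path, new_path) != 0;
}
```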
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `each_ref_fn` callback function type is used across our code base
for several different functions that iterate through references. There's
a bunch of callbacks implementing this type, which makes any changes to
the callback signature extremely noisy. An example of the required churn
is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a
single argument required us to change 48 files.
It was already proposed back then [1] that we might want to introduce a
wrapper structure to alleviate the pain going forward. While this of
course requires the same kind of global refactoring as just introducing
a new parameter, it at least allows us to more easily change the callback type
afterwards by just extending the wrapper structure.
One counterargument to this refactoring is that it makes the structure
more opaque. While it is obvious which callsites need to be fixed up
when we change the function type, it's not obvious anymore once we use
a structure. That being said, we only have a handful of sites that
actually need to populate this wrapper structure: our ref backends,
"refs/iterator.c" as well as very few sites that invoke the iterator
callback functions directly.
Introduce this wrapper structure so that we can adapt the iterator
interfaces more readily.
[1]: <ZmarVcF5JjsZx0dl@tanuki>
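To illustrate the shape of such a wrapper, here is a small sketch. All
names and fields are hypothetical and chosen for the example; they are
not the structure this patch introduces. Callbacks receive one pointer
to the wrapper, so adding a field later does not change their signature.

```c
#include <stddef.h>

/* Hypothetical wrapper; field names are illustrative only. */
struct ref_info_sketch {
	const char *name;
	const char *referent; /* NULL unless the ref is symbolic */
	int flags;
};

/* Callback type: one wrapper pointer plus the caller's payload. */
typedef int each_ref_sketch_fn(const struct ref_info_sketch *ref,
			       void *cb_data);

/* Example callback that only cares about one field of the wrapper. */
static int count_symrefs(const struct ref_info_sketch *ref, void *cb_data)
{
	int *count = cb_data;

	if (ref->referent)
		(*count)++;
	return 0;
}

/* Toy iterator: stops early when a callback returns non-zero. */
static int for_each_ref_sketch(const struct ref_info_sketch *refs, size_t nr,
			       each_ref_sketch_fn *fn, void *cb_data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = fn(&refs[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}
```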
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The setup_revisions() function was designed to take the argc/argv pair
from the operating system. But we sometimes construct our own argv using
a strvec and pass that in. There are a few gotchas that callers need to
deal with here:
1. You should always pass the free_removed_argv_elements option via
setup_revision_opt. Otherwise, entries may be leaked if
setup_revisions() re-shuffles options.
2. After setup_revisions() returns, the strvec state is odd. We get a
reduced argc from setup_revisions() telling us how many unknown
options were left in place. Entries after that in argv may be
retained, or may be NULL (depending on how the reshuffling
happened). But the strvec's "nr" field still represents the
original value, and some of the entries it thinks it is still
storing may be NULL. Callers must be careful with how they access
it.
Some callers deal with (1), but not all. In practice they are OK because
they do not pass any options that would cause setup_revisions() to
re-shuffle (namely unknown options which may be relayed from the user,
and the use of the "--" separator). But it's probably a good idea to
consistently pass this option anyway to future-proof ourselves against
the details of setup_revisions() changing.
No callers address (2), though I don't think there are any visible bugs.
Most of them simply call strvec_clear() and never otherwise look at the
result. And in fact, if they naively set foo.nr to the argc returned by
setup_revisions(), that would cause leaks! Because setup_revisions()
does not free consumed options[1], we have to leave the "nr" field of
the strvec at its original value to find and free them during
strvec_clear().
So I don't think there are any bugs to fix here, but we can make things
safer and simpler for callers. Let's introduce a helper function that
sets the free_removed_argv_elements automatically and shrinks the strvec
to represent the retained options afterwards (taking care to free the
now-obsolete entries).
We'll start by converting all of the call-sites which use the
free_removed_argv_elements option. There should be no behavior change
for them, except that their "shrunken" entries are cleaned up
immediately, rather than waiting for a strvec_clear() call.
[1] Arguably setup_revisions() should be doing this step for us if we
told it to free removed options, but there are many existing callers
which will be broken if it did. Introducing this helper is a
possible first step towards that.
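The shrinking step can be sketched in isolation as below. This uses a
hypothetical, minimal stand-in for strvec and does not model
setup_revisions() itself; it only shows that the entries past the new
count are freed before "nr" is lowered, so nothing is lost track of.

```c
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for a NULL-terminated string vector. */
struct strvec_sketch {
	char **v;
	size_t nr;
};

/* Append a copy of "s", keeping the array NULL-terminated. */
static void strvec_sketch_push(struct strvec_sketch *sv, const char *s)
{
	size_t len = strlen(s) + 1;
	char **grown = realloc(sv->v, (sv->nr + 2) * sizeof(*grown));
	char *copy = malloc(len);

	if (!grown || !copy)
		abort();
	memcpy(copy, s, len);
	sv->v = grown;
	sv->v[sv->nr++] = copy;
	sv->v[sv->nr] = NULL;
}

/*
 * Keep the first "new_nr" entries and free the rest. Entries in the
 * tail may already be NULL; free(NULL) is a no-op.
 */
static void strvec_sketch_shrink(struct strvec_sketch *sv, size_t new_nr)
{
	if (new_nr >= sv->nr)
		return;
	for (size_t i = new_nr; i < sv->nr; i++)
		free(sv->v[i]);
	sv->nr = new_nr;
	sv->v[new_nr] = NULL;
}
```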
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The case where a new submodule takes a path where there used to be a
completely different subproject is now dealt with a bit better than
before.
* kj/renamed-submodule:
fixup! submodule: skip redundant active entries when pattern covers path
fixup! submodule: prevent overwriting .gitmodules on path reuse
submodule: skip redundant active entries when pattern covers path
submodule: prevent overwriting .gitmodules on path reuse
configure_added_submodule always writes an explicit
submodule.<name>.active entry, even when the new
path is already matched by submodule.active
patterns. This leads to unnecessary and cluttered configuration.
Change the logic to centralize the wildmatch-based pattern lookup
in configure_added_submodule. Wrap the active-entry write in a conditional
that only fires when that lookup reports no existing pattern covers the
submodule’s path.
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adding a submodule at a path that previously hosted
another submodule (e.g., 'child') reuses the submodule
name derived from the path. If the original submodule
was only moved (e.g., to 'child_old') and not renamed,
this silently overwrites its configuration in .gitmodules.
This behavior loses user configuration and causes
confusion when the original submodule is expected
to remain intact. It assumes that the path-derived
name is always safe to reuse, even though the name
might still be in use elsewhere in the repository.
Teach module_add() to check if the computed submodule
name already exists in the repository's submodule config,
and if so, refuse the operation unless the user explicitly
renames the submodule or uses the --force option,
which will automatically generate a unique name by
appending a number (e.g., child1).
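The name generation under --force could look roughly like the sketch
below. The function names are hypothetical, and starting the suffix at 1
is an assumption inferred from the "child1" example above rather than
taken from the code.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if "name" is already present in "names". */
static int name_in_use(const char *const *names, size_t nr, const char *name)
{
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(names[i], name))
			return 1;
	return 0;
}

/*
 * Write a name not present in "names" to "out": "base" itself if it is
 * free, otherwise base1, base2, and so on.
 * Returns 0 on success, -1 if a candidate did not fit in "out".
 */
static int unique_name(const char *const *names, size_t nr,
		       const char *base, char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "%s", base);

	if (n < 0 || (size_t)n >= outlen)
		return -1;
	for (unsigned suffix = 1; name_in_use(names, nr, out); suffix++) {
		n = snprintf(out, outlen, "%s%u", base, suffix);
		if (n < 0 || (size_t)n >= outlen)
			return -1;
	}
	return 0;
}
```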
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 036876a106 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those functions have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove
`git_config_set_in_file_gently()`. All callsites are adjusted so that
they use `repo_config_set_in_file_gently(the_repository, ...)` instead.
While some callsites might already have a repository available, this
mechanical conversion is the exact same as the current situation and
thus cannot cause any regression. Those sites should eventually be
cleaned up in a later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 036876a106 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those functions have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_set_gently()`.
All callsites are adjusted so that they use
`repo_config_set_gently(the_repository, ...)` instead. While some
callsites might already have a repository available, this mechanical
conversion is the exact same as the current situation and thus cannot
cause any regression. Those sites should eventually be cleaned up in a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 036876a106 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those functions have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_set_in_file()`.
All callsites are adjusted so that they use
`repo_config_set_in_file(the_repository, ...)` instead. While some
callsites might already have a repository available, this mechanical
conversion is the exact same as the current situation and thus cannot
cause any regression. Those sites should eventually be cleaned up in a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 036876a106 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those functions have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_string()`.
All callsites are adjusted so that they use
`repo_config_get_string(the_repository, ...)` instead. While some
callsites might already have a repository available, this mechanical
conversion is the exact same as the current situation and thus cannot
cause any regression. Those sites should eventually be cleaned up in a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 036876a106 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those functions have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get()`. All
callsites are adjusted so that they use `repo_config_get(the_repository,
...)` instead. While some callsites might already have a repository
available, this mechanical conversion is the exact same as the current
situation and thus cannot cause any regression. Those sites should
eventually be cleaned up in a later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 036876a106 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those functions have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config()`. All callsites
are adjusted so that they use `repo_config(the_repository, ...)`
instead. While some callsites might already have a repository available,
this mechanical conversion is the exact same as the current situation
and thus cannot cause any regression. Those sites should eventually be
cleaned up in a later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up around object access API.
* ps/object-store:
odb: rename `read_object_with_reference()`
odb: rename `pretend_object_file()`
odb: rename `has_object()`
odb: rename `repo_read_object_file()`
odb: rename `oid_object_info()`
odb: trivial refactorings to get rid of `the_repository`
odb: get rid of `the_repository` when handling submodule sources
odb: get rid of `the_repository` when handling the primary source
odb: get rid of `the_repository` in `for_each()` functions
odb: get rid of `the_repository` when handling alternates
odb: get rid of `the_repository` in `odb_mkstemp()`
odb: get rid of `the_repository` in `assert_oid_type()`
odb: get rid of `the_repository` in `find_odb()`
odb: introduce parent pointers
object-store: rename files to "odb.{c,h}"
object-store: rename `object_directory` to `odb_source`
object-store: rename `raw_object_store` to `object_database`
Updating submodules from the upstream did not work well when
submodule's HEAD is detached, which has been improved.
* jk/submodule-remote-lookup-cleanup:
submodule: look up remotes by URL first
submodule: move get_default_remote_submodule()
submodule--helper: improve logic for fallback remote name
remote: remove the_repository from some functions
dir: move starts_with_dot(_dot)_slash to dir.h
remote: fix tear down of struct remote
remote: remove branch->merge_name and fix branch_release()
There are a couple of iterator-style functions that execute a callback
for each instance of a given set, all of which currently depend on
`the_repository`. Refactor them to instead take an object database as
parameter so that we can get rid of this dependency.
Rename the functions accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commits we have renamed the structures contained in
"object-store.h" to `struct object_database` and `struct odb_source`.
As such, the code files "object-store.{c,h}" are confusingly named now.
Rename them to "odb.{c,h}" accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `object_directory` structure is used as an access point for a single
object directory like ".git/objects". While the structure isn't yet
fully self-contained, the intent is for it to eventually contain all
information required to access objects in one specific location.
While the name "object directory" is a good fit for now, this will
change over time as we continue with the agenda to make pluggable object
databases a thing. Eventually, objects may not be accessed via any kind
of directory at all anymore, but they could instead be backed by any
kind of durable storage mechanism. While it seems quite far-fetched for
now, it is thinkable that eventually this might even be some form of a
database, for example.
As such, the current name of this structure will become worse over time
as we evolve into the direction of pluggable ODBs. Immediate next steps
will start to carve out proper self-contained object directories, which
requires us to pass in these object directories as parameters. Based on
our modern naming schema this means that those functions should then be
named after their subsystem, which means that we would start to bake the
current name into the codebase more and more.
Let's preempt this by renaming the structure. There have been a couple
alternatives that were discussed:
- `odb_backend` was discarded because it led to the association that
one object database has a single backend, but the model is that one
alternate has one backend. Furthermore, "backend" is more about the
actual backing implementation and less about the high-level concept.
- `odb_alternate` was discarded because it is a bit of a stretch to
also call the main object directory an "alternate".
Instead, pick `odb_source` as the new name. It makes it sufficiently
clear that there can be multiple sources and does not cause confusion
when mixed with the already-existing "alternate" terminology.
In the future, this change allows us to easily introduce for example a
`odb_files_source` and other format-specific implementations.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git maintenance" lacked the care "git gc" had to avoid holding
onto the repository lock for too long during packing refs, which
has been remedied.
* ps/maintenance-ref-lock:
builtin/maintenance: fix locking race when handling "gc" task
builtin/gc: avoid global state in `gc_before_repack()`
usage: allow dying without writing an error message
builtin/maintenance: fix locking race with refs and reflogs tasks
builtin/maintenance: split into foreground and background tasks
builtin/maintenance: fix typedef for function pointers
builtin/maintenance: extract function to run tasks
builtin/maintenance: stop modifying global array of tasks
builtin/maintenance: mark "--task=" and "--schedule=" as incompatible
builtin/maintenance: centralize configuration of explicit tasks
builtin/gc: drop redundant local variable
builtin/gc: use designated field initializers for maintenance tasks
The get_default_remote_submodule() function performs a lookup to find
the appropriate remote to use within a submodule. The function first
checks to see if it can find the remote for the current branch. If this
fails, it then checks to see if there is exactly one remote. It will use
this, before finally falling back to "origin" as the default.
If a user happens to rename their default remote from origin, either
manually or by setting something like clone.defaultRemoteName, this
fallback will not work.
In such cases, the submodule logic will try to use a non-existent
remote. This usually manifests as a failure to trigger the submodule
update.
The parent project already knows and stores the submodule URL in either
.gitmodules or its .git/config.
Add a new repo_remote_from_url() helper which will iterate over all the
remotes in a repository and return the first remote which has a matching
URL.
Refactor get_default_remote_submodule to find the submodule and get its
URL. If a valid URL exists, first try to obtain a remote using the new
repo_remote_from_url(). Fall back to the repo_default_remote()
otherwise.
The fallback logic is kept in case for some reason the user has manually
changed the URL within the submodule. Additionally, we still try to use
a remote rather than directly passing the URL in the
fetch_in_submodule() logic. This ensures that an update will properly
update the remote refs within the submodule as expected, rather than
just fetching into FETCH_HEAD.
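The resulting lookup order can be sketched as follows. The types and
function names are hypothetical simplifications and not the helpers
added here; the "branch_remote" parameter stands in for whatever remote
the current branch is configured with, which is NULL on a detached HEAD.

```c
#include <stddef.h>
#include <string.h>

/* Simplified remote: a name and a single URL. */
struct remote_sketch {
	const char *name;
	const char *url;
};

/* Return the first remote whose URL matches, or NULL if none does. */
static const struct remote_sketch *
remote_from_url(const struct remote_sketch *remotes, size_t nr,
		const char *url)
{
	for (size_t i = 0; i < nr; i++)
		if (remotes[i].url && !strcmp(remotes[i].url, url))
			return &remotes[i];
	return NULL;
}

/*
 * Pick a remote name: match by submodule URL first, then the branch's
 * remote, then the only remote if there is exactly one, then "origin".
 */
static const char *default_remote_name(const struct remote_sketch *remotes,
				       size_t nr, const char *submodule_url,
				       const char *branch_remote)
{
	if (submodule_url) {
		const struct remote_sketch *r =
			remote_from_url(remotes, nr, submodule_url);
		if (r)
			return r->name;
	}
	if (branch_remote)
		return branch_remote;
	if (nr == 1)
		return remotes[0].name;
	return "origin";
}
```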
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future refactor of get_default_remote_submodule() is going to depend on
resolve_relative_url(). That function depends on get_default_remote().
Move get_default_remote_submodule() after resolve_relative_url() first
to make the additional functionality easier to review.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repo_get_default_remote() function in submodule--helper currently
tries to figure out the proper remote name to use for a submodule based
on a few factors.
First, it tries to find the remote for the currently checked out branch.
This works if the submodule is configured to checkout to a branch
instead of a detached HEAD state.
In the detached HEAD state, the code falls back to using "origin", on
the assumption that this is the default remote name. Some users may
change this, such as by setting clone.defaultRemoteName, or by changing
the remote name manually within the submodule repository.
As a first step to improving this situation, refactor to reuse the logic
from remotes_remote_for_branch(). This function uses the remote from the
branch if it has one. If it doesn't then it checks to see if there is
exactly one remote. It uses this remote first before attempting to fall
back to "origin".
To allow using this helper function, introduce a repo_default_remote()
helper to remote.c which takes a repository structure. This helper will
load the remote configuration and get the "HEAD" branch. Then it will
call remotes_remote_for_branch to find the default remote.
Replace calls of repo_get_default_remote() with the calls to this new
function. To maintain consistency with the existing callers, continue
copying the returned string with xstrdup.
This isn't a perfect solution for users who change remote names, but it
should help in cases where the remote name is changed but users haven't
added any additional remotes.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both submodule--helper.c and submodule-config.c have an implementation
of starts_with_dot_slash and starts_with_dot_dot_slash. The dir.h header
has starts_with_dot(_dot)_slash_native, which sets PATH_MATCH_NATIVE.
Move the helpers to dir.h as static inlines. I thought about renaming
them to postfix with _platform but that felt too long and ugly. On the
other hand it might be slightly confusing with _native.
This simplifies a submodule refactor which wants to use the helpers
earlier in the submodule--helper.c file.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A memory leak on an error code path has been plugged.
* ly/submodule-update-failure-leakfix:
builtin/submodule--helper: fix leak when remote_submodule_branch() failed
In builtin/submodule--helper.c:update_submodule(), the variable
remote_name is allocated in get_default_remote_submodule() but
may be leaked if remote_submodule_branch() fails. Although it is
unlikely that remote_submodule_branch() would fail after successfully
obtaining a remote ref name from get_default_remote_submodule(),
it is still possible. To prevent a potential memory leak, add a
call to free(remote_name) at the early exit point.
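The shape of the fix can be illustrated with a small sketch. It does not
reproduce update_submodule(): the helpers below are hypothetical
stand-ins, the allocation counter exists only to make the leak
observable, and treating a NULL return as failure is an assumption of
the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts live allocations so the sketch can show that nothing leaks. */
static int live_allocations_sketch;

/* Hypothetical stand-in for the helper that allocates the remote name. */
static char *get_remote_name_sketch(void)
{
	static const char name[] = "origin";
	char *copy = malloc(sizeof(name));

	if (!copy)
		abort();
	memcpy(copy, name, sizeof(name));
	live_allocations_sketch++;
	return copy;
}

static void free_remote_name_sketch(char *name)
{
	free(name);
	live_allocations_sketch--;
}

/* Hypothetical stand-in for the branch lookup; NULL signals failure. */
static const char *lookup_branch_sketch(int should_fail)
{
	return should_fail ? NULL : "main";
}

static int update_sketch(int should_fail)
{
	char *remote_name = get_remote_name_sketch();
	const char *branch = lookup_branch_sketch(should_fail);

	if (!branch) {
		/* The early exit must release what was already allocated. */
		free_remote_name_sketch(remote_name);
		return -1;
	}
	assert(strlen(remote_name) > 0 && strlen(branch) > 0);
	free_remote_name_sketch(remote_name);
	return 0;
}
```

Without the free on the early-exit branch, the counter would stay at one
after a failed lookup.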
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes code wants to die in a situation where it already has written
an error message. To use the same error code as `die()` we have to use
`exit(128)`, which is easy to get wrong and leaves magic numbers all
over our codebase.
Teach `die_message_builtin()` to not print any error when passed a
`NULL` pointer as error string. Like this, such users can now call
`die(NULL)` to achieve the same result without any hardcoded error
codes.
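The idea can be sketched as follows. This is not the signature of
`die_message_builtin()`: the function below is a hypothetical
simplification that takes an output stream and returns the exit code
instead of terminating, so that the behaviour is easy to observe.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in: print "fatal: <err>" unless err is NULL, and
 * always report the exit code 128 that `die()` uses.
 */
static int die_message_sketch(FILE *out, const char *err)
{
	assert(out);
	if (err)
		fprintf(out, "fatal: %s\n", err);
	return 128;
}
```

A caller that has already printed its own error would pass NULL and
exit with the returned code, rather than spelling out `exit(128)`.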
Adapt a couple of builtins to use this new pattern to demonstrate that
there is a need for such a helper.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "object-store-ll.h" header has been introduced to keep transitive
header dependencies and compile times at bay. Now that we have created
a new "object-store.c" file though we can easily move the last remaining
additional bit of "object-store.h", the `odb_path_map`, out of the
header.
Do so. As the "object-store.h" header is now equivalent to its low-level
alternative we drop the latter and inline it into the former.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `safe_create_leading_directories()` function and its relatives are
located in "object-file.c", which is not a good fit as they provide
generic functionality not related to objects at all. Move them into
"path.c", which already hosts `safe_create_dir()` and its relative
`safe_create_dir_in_gitdir()`.
"path.c" is free of `the_repository`, but the moved functions depend on
`the_repository` to read the "core.sharedRepository" config. Adapt the
function signature to accept a repository as argument to fix the issue
and adjust callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `null_oid()` function returns the object ID that only consists of
zeroes. Naturally, this ID also depends on the hash algorithm used, as
the number of zeroes is different between SHA1 and SHA256. Consequently,
the function returns the hash-algorithm-specific null object ID.
This is currently done by depending on `the_hash_algo`, which implicitly
makes us depend on `the_repository`. Refactor the function to instead
pass in the hash algorithm for which we want to retrieve the null object
ID. Adapt callsites accordingly by passing in `the_repository`, thus
bubbling up the dependency on that global variable by one layer.
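A rough sketch of the refactored shape, under the assumption that each
hash algorithm carries its own null object ID. The types and names
ending in _sketch are hypothetical and much simpler than the real
`struct git_hash_algo` and `struct object_id`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RAWSZ_SKETCH 32

struct object_id_sketch {
	unsigned char hash[MAX_RAWSZ_SKETCH];
};

struct hash_algo_sketch {
	const char *name;
	size_t rawsz;                     /* 20 bytes for SHA-1, 32 for SHA-256 */
	struct object_id_sketch null_oid; /* all-zero ID for this algorithm */
};

static const struct hash_algo_sketch sha1_sketch = { "sha1", 20, { { 0 } } };
static const struct hash_algo_sketch sha256_sketch = { "sha256", 32, { { 0 } } };

/* The caller names the algorithm instead of relying on global state. */
static const struct object_id_sketch *
null_oid_sketch(const struct hash_algo_sketch *algo)
{
	assert(algo);
	return &algo->null_oid;
}

/* Compare only as many bytes as the given algorithm actually uses. */
static int is_null_oid_sketch(const struct object_id_sketch *oid,
			      const struct hash_algo_sketch *algo)
{
	return !memcmp(oid->hash, null_oid_sketch(algo)->hash, algo->rawsz);
}
```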
There are a couple of trivial exceptions for subsystems that already got
rid of `the_repository`. These subsystems instead use the repository
that is available via the calling context:
- "builtin/grep.c"
- "grep.c"
- "refs/debug.c"
There are also two non-trivial exceptions:
- "diff-no-index.c": Here we know that we may not have a repository
initialized at all, so we cannot rely on `the_repository`. Instead,
we adapt `diff_no_index()` to get a `struct git_hash_algo` as
parameter. The only caller is located in "builtin/diff.c", where we
know to call `repo_set_hash_algo()` in case we're running outside of
a Git repository. Consequently, it is fine to continue passing
`the_repository->hash_algo` even in this case.
- "builtin/ls-files.c": There is an in-flight patch series that drops
`USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic
conflict because we use `null_oid()` in `show_submodule()`. The
value is passed to `repo_submodule_init()`, which may use the object
ID to resolve a tree-ish in the superproject from which we want to
read the submodule config. As such, the object ID should refer to an
object in the superproject, and consequently we need to use its hash
algorithm.
This means that we could in theory just not bother about this edge
case at all and just use `the_repository` in "diff-no-index.c". But
doing so would feel misdesigned.
Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in
"hash.c".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As explained in an earlier commit, we're refactoring path-related
functions to provide a consistent interface for computing paths into the
commondir, gitdir and worktree. Refactor the "submodule" family of
functions accordingly.
Note that in contrast to the other `repo_*_path()` families, we have to
pass in the repository as a non-constant pointer. This is because we end
up calling `repo_read_gitmodules()` deep down in the callstack, which
may end up modifying the repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `submodule_to_gitdir()` function implicitly uses `the_repository` to
resolve submodule paths. Refactor the function to instead accept a repo
as parameter to remove the dependency on global state.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a bunch of loops which iterate up to an unsigned boundary using
a signed index, which generates warnings because we compare a signed and
unsigned value in the loop condition. Address these sites for trivial
cases and enable `-Wsign-compare` warnings for these code units.
This patch only adapts those code units where we can drop the
`DISABLE_SIGN_COMPARE_WARNINGS` macro in the same step.
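A trivial case of the kind addressed here looks roughly like the sketch
below. The function is hypothetical and not taken from the Git tree; it
only shows the index type being changed to match the unsigned bound.

```c
#include <assert.h>
#include <stddef.h>

/* Count the non-zero entries in an array of nr values. */
static size_t count_nonzero_sketch(const int *values, size_t nr)
{
	size_t count = 0;

	assert(values || nr == 0);
	/*
	 * Previously: for (int i = 0; i < nr; i++)
	 * which compares a signed index against an unsigned bound and
	 * triggers -Wsign-compare. Using size_t for the index avoids it.
	 */
	for (size_t i = 0; i < nr; i++)
		if (values[i])
			count++;
	return count;
}
```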
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 9b1cb5070f (builtin: add a repository parameter for builtin
functions, 2024-09-13) the repository was passed down to all builtin
commands. This allowed the repository to be passed down to lower layers
without depending on the global `the_repository` variable.
Continue this work by also passing down the repository parameter from
the command to sub-commands. This will help pass down the repository to
other subsystems and clean up usage of global variables like
'the_repository' and 'the_hash_algo'.
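The pattern can be sketched as below. It is a simplified illustration
with hypothetical types and names; the real builtin and sub-command
signatures carry additional parameters and are dispatched through the
option parser rather than a hand-written loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal stand-in for `struct repository`. */
struct repository_sketch {
	const char *gitdir;
};

typedef int (*subcommand_fn_sketch)(int argc, const char **argv,
				    struct repository_sketch *repo);

struct subcommand_sketch {
	const char *name;
	subcommand_fn_sketch fn;
};

/* A sub-command receives the repository instead of reading a global. */
static int cmd_list_sketch(int argc, const char **argv,
			   struct repository_sketch *repo)
{
	(void)argc;
	(void)argv;
	assert(repo);
	return repo->gitdir ? 0 : 1;
}

/* The command forwards its own repository parameter to the sub-command. */
static int cmd_builtin_sketch(int argc, const char **argv,
			      struct repository_sketch *repo)
{
	static const struct subcommand_sketch subcommands[] = {
		{ "list", cmd_list_sketch },
	};

	assert(argv);
	if (argc < 1)
		return -1;
	for (size_t i = 0; i < sizeof(subcommands) / sizeof(subcommands[0]); i++)
		if (!strcmp(argv[0], subcommands[i].name))
			return subcommands[i].fn(argc - 1, argv + 1, repo);
	return -1; /* unknown sub-command */
}
```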
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>