fsmonitor: use pthread_cond_timedwait for cookie wait

The cookie wait in with_lock__wait_for_cookie() uses an infinite
pthread_cond_wait() loop.  The existing comment notes the desire
to switch to pthread_cond_timedwait(), but the routine was not
available in git thread-utils.

On certain container or overlay filesystems, inotify watches may
succeed but events are never delivered.  In this case the daemon
would hang indefinitely waiting for the cookie event, which in
turn causes the client to hang.

Replace the infinite wait with a one-second timeout using
pthread_cond_timedwait().  If the timeout fires, report an
error and let the client proceed with a trivial (full-scan)
response rather than blocking forever.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Paul Tarjan
2026-02-26 00:27:17 +00:00
committed by Junio C Hamano
parent 945f158bc7
commit 25351303cd

View File

@@ -197,20 +197,31 @@ static enum fsmonitor_cookie_item_result with_lock__wait_for_cookie(
unlink(cookie_pathname.buf);
/*
* Technically, this is an infinite wait (well, unless another
* thread sends us an abort). I'd like to change this to
* use `pthread_cond_timedwait()` and return an error/timeout
* and let the caller do the trivial response thing, but we
* don't have that routine in our thread-utils.
*
* After extensive beta testing I'm not really worried about
* this. Also note that the above open() and unlink() calls
* will cause at least two FS events on that path, so the odds
* of getting stuck are pretty slim.
* Wait for the listener thread to observe the cookie file.
* Time out after a short interval so that the client
* does not hang forever if the filesystem does not deliver
* events (e.g., on certain container/overlay filesystems
* where inotify watches succeed but events never arrive).
*/
while (cookie->result == FCIR_INIT)
pthread_cond_wait(&state->cookies_cond,
&state->main_lock);
{
struct timeval now;
struct timespec ts;
int err = 0;
gettimeofday(&now, NULL);
ts.tv_sec = now.tv_sec + 1;
ts.tv_nsec = now.tv_usec * 1000;
while (cookie->result == FCIR_INIT && !err)
err = pthread_cond_timedwait(&state->cookies_cond,
&state->main_lock,
&ts);
if (err == ETIMEDOUT && cookie->result == FCIR_INIT) {
trace_printf_key(&trace_fsmonitor,
"cookie_wait timed out");
cookie->result = FCIR_ERROR;
}
}
done:
hashmap_remove(&state->cookies, &cookie->entry, NULL);