bloom: replace struct bloom_key * with struct bloom_keyvec

Previously, we stored bloom keys in a flat array and marked a commit
as TREESAME if any key reported "definitely not changed".

To support multiple pathspec items, we now require that for each
pathspec item, there exists a bloom key reporting "definitely not
changed".
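
The condition above can be sketched in isolation as a "for every item,
there exists a key" check. This is only an illustration, not the code
in this commit: `struct item_verdicts` and both functions are
hypothetical names, and the filter verdicts are assumed to be
precomputed booleans rather than real bloom filter lookups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: the filter verdict for each key of one pathspec item. */
struct item_verdicts {
	/* true: "maybe changed"; false: "definitely not changed" */
	const bool *maybe_present;
	size_t nr;
};

/* An item may have changed only if no key of that item is ruled out. */
static bool item_maybe_changed(const struct item_verdicts *item)
{
	for (size_t i = 0; i < item->nr; i++)
		if (!item->maybe_present[i])
			return false;
	return true;
}

/*
 * The commit can be treated as unchanged only if every item has at
 * least one key reporting "definitely not changed".
 */
static bool commit_definitely_unchanged(const struct item_verdicts *items,
					size_t nr)
{
	assert(nr > 0); /* with no items there is nothing to decide */
	for (size_t i = 0; i < nr; i++)
		if (item_maybe_changed(&items[i]))
			return false;
	return true;
}
```

A single flat array of verdicts cannot express this, because it loses
the information about which key belongs to which item.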

This "for every" condition makes a flat array insufficient, so we
introduce a new structure to group keys by a single pathspec item.
`struct bloom_keyvec` is introduced to replace `struct bloom_key *`
and `bloom_keys_nr`. And because we want to support multiple pathspec
items, we add `struct bloom_keyvec **bloom_keyvecs` and
`bloom_keyvecs_nr` fields to `struct rev_info` to represent an array
of bloom_keyvecs. This commit still optimizes only one pathspec item,
thus bloom_keyvecs_nr can only be 0 or 1.

New bloom_keyvec_* functions are added to create and destroy a keyvec.
bloom_filter_contains_vec() is added to check whether all keys in a
keyvec are contained in a bloom filter.
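
As a rough illustration of the shape of such an API, here is a
self-contained toy version. Everything prefixed with `toy_` is a
hypothetical stand-in and not the code added by this commit: the filter
is assumed to be a single 64-bit word, and a key is assumed to be two
precomputed bit positions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-ins, not git's types. */
struct toy_filter { uint64_t bits; };
struct toy_key { unsigned pos[2]; };

/* Hypothetical keyvec: all keys derived from one pathspec item. */
struct toy_keyvec {
	size_t count;
	struct toy_key key[]; /* C99 flexible array member */
};

static struct toy_keyvec *toy_keyvec_new(size_t count)
{
	struct toy_keyvec *vec;

	assert(count > 0);
	vec = calloc(1, sizeof(*vec) + count * sizeof(vec->key[0]));
	if (vec)
		vec->count = count;
	return vec;
}

static void toy_keyvec_free(struct toy_keyvec *vec)
{
	free(vec);
}

static void toy_filter_add(struct toy_filter *f, const struct toy_key *k)
{
	for (size_t i = 0; i < 2; i++)
		f->bits |= UINT64_C(1) << (k->pos[i] % 64);
}

/* Returns 1 for "maybe present", 0 for "definitely absent". */
static int toy_filter_contains(const struct toy_filter *f,
			       const struct toy_key *k)
{
	for (size_t i = 0; i < 2; i++)
		if (!(f->bits & (UINT64_C(1) << (k->pos[i] % 64))))
			return 0;
	return 1;
}

/* Returns 1 only if every key in vec is "maybe present". */
static int toy_filter_contains_vec(const struct toy_filter *f,
				   const struct toy_keyvec *vec)
{
	int ret = 1;

	/* Stop at the first key that is definitely absent. */
	for (size_t i = 0; ret && i < vec->count; i++)
		ret = toy_filter_contains(f, &vec->key[i]);
	return ret;
}
```

The flexible array member keeps the count and the keys in one
allocation, so destroying a keyvec in this sketch is a single free().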

Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lidong Yan
2025-07-12 17:35:15 +08:00
committed by Junio C Hamano
parent b187353ed2
commit 90d5518a7d
4 changed files with 132 additions and 49 deletions

@@ -62,7 +62,7 @@ struct repository;
struct rev_info;
struct string_list;
struct saved_parents;
-struct bloom_key;
+struct bloom_keyvec;
struct bloom_filter_settings;
struct option;
struct parse_opt_ctx_t;
@@ -360,8 +360,8 @@ struct rev_info {
/* Commit graph bloom filter fields */
/* The bloom filter key(s) for the pathspec */
-struct bloom_key *bloom_keys;
-int bloom_keys_nr;
+struct bloom_keyvec **bloom_keyvecs;
+int bloom_keyvecs_nr;
/*
* The bloom filter settings used to generate the key.