-
Notifications
You must be signed in to change notification settings - Fork 26.4k
pack-bitmap: fix memory leak if load_bitmap
failed
#1962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,6 +31,7 @@ struct stored_bitmap { | |
struct object_id oid; | ||
struct ewah_bitmap *root; | ||
struct stored_bitmap *xor; | ||
size_t map_pos; | ||
int flags; | ||
}; | ||
|
||
|
@@ -314,13 +315,14 @@ static struct stored_bitmap *store_bitmap(struct bitmap_index *index, | |
struct ewah_bitmap *root, | ||
const struct object_id *oid, | ||
struct stored_bitmap *xor_with, | ||
int flags) | ||
int flags, size_t map_pos) | ||
{ | ||
struct stored_bitmap *stored; | ||
khiter_t hash_pos; | ||
int ret; | ||
|
||
stored = xmalloc(sizeof(struct stored_bitmap)); | ||
stored->map_pos = map_pos; | ||
stored->root = root; | ||
stored->xor = xor_with; | ||
stored->flags = flags; | ||
|
@@ -376,10 +378,12 @@ static int load_bitmap_entries_v1(struct bitmap_index *index) | |
struct stored_bitmap *xor_bitmap = NULL; | ||
uint32_t commit_idx_pos; | ||
struct object_id oid; | ||
size_t entry_map_pos; | ||
|
||
if (index->map_size - index->map_pos < 6) | ||
return error(_("corrupt ewah bitmap: truncated header for entry %d"), i); | ||
|
||
entry_map_pos = index->map_pos; | ||
commit_idx_pos = read_be32(index->map, &index->map_pos); | ||
xor_offset = read_u8(index->map, &index->map_pos); | ||
flags = read_u8(index->map, &index->map_pos); | ||
|
@@ -402,8 +406,9 @@ static int load_bitmap_entries_v1(struct bitmap_index *index) | |
if (!bitmap) | ||
return -1; | ||
|
||
recent_bitmaps[i % MAX_XOR_OFFSET] = store_bitmap( | ||
index, bitmap, &oid, xor_bitmap, flags); | ||
recent_bitmaps[i % MAX_XOR_OFFSET] = | ||
store_bitmap(index, bitmap, &oid, xor_bitmap, flags, | ||
entry_map_pos); | ||
} | ||
|
||
return 0; | ||
|
@@ -630,41 +635,28 @@ static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git, | |
bitmap_git->ext_index.positions = kh_init_oid_pos(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): "Taylor Blau via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Taylor Blau <me@ttaylorr.com>
>
> After going through the "failed" label, load_bitmap() will return -1,
> and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
> will then call free_bitmap_index().
> ...
> The solution is to remove the error handling code in load_bitmap(), because
> its caller will always call free_bitmap_index() in case of an error.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
As this is Lidong relaying <aCOFqYdnPp1Lne4Y@nand.local> that Taylor
sent to the list, shouldn't Lidong's sign-off be after Taylor's? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Taylor Blau wrote (reply to this): On Thu, May 29, 2025 at 08:33:29AM -0700, Junio C Hamano wrote:
> "Taylor Blau via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Taylor Blau <me@ttaylorr.com>
> >
> > After going through the "failed" label, load_bitmap() will return -1,
> > and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
> > will then call free_bitmap_index().
> > ...
> > The solution is to remove the error handling code in load_bitmap(), because
> > its caller will always call free_bitmap_index() in case of an error.
> >
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
> > ---
>
> As this is Lidong relaying <aCOFqYdnPp1Lne4Y@nand.local> that Taylor
> sent to the list, shouldn't Lidong's sign-off be after Taylor's?
I've always assumed the answer here was "yes", but I don't know that our
documentation suggests the same.
In c11c3b5681 (Documentation/SubmittingPatches: What's Acked-by and
Tested-by?, 2008-02-03) you added:
Notice that you can place your own Signed-off-by: line when
forwarding somebody else's patch [...]. Indeed you are encouraged
to do so. [...]
and that text survives into the current version of SubmittingPatches.
So I think that while our documentation encourages people to add their
own S-o-b to others' patches sent on their behalf, it doesn't
explicitly require it.
Thanks,
Taylor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): Taylor Blau <me@ttaylorr.com> writes:
> In c11c3b5681 (Documentation/SubmittingPatches: What's Acked-by and
> Tested-by?, 2008-02-03) you added:
>
> Notice that you can place your own Signed-off-by: line when
> forwarding somebody else's patch [...]. Indeed you are encouraged
> to do so. [...]
>
> and that text survives into the current version of SubmittingPatches.
> So I think that while our documentation encourages people to add their
> own S-o-b to others' patches sent on their behalf, it doesn't
> explicitly require it.
It would not hurt that much if I pretended that I ignored what
Lidong forwarded and picked up your patch directly from the list,
only because what was forwarded was public. If it were a privately
shared patch, sign-off is required, so we'd need to tighten the
language in the SubmittingPatches document, I think.
Thanks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, lidongyan wrote (reply to this): Got it, I will add my sign-off.
> 2025年5月29日 23:33,Junio C Hamano <gitster@pobox.com> 写道:
>
> "Taylor Blau via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Taylor Blau <me@ttaylorr.com>
>>
>> After going through the "failed" label, load_bitmap() will return -1,
>> and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
>> will then call free_bitmap_index().
>> ...
>> The solution is to remove the error handling code in load_bitmap(), because
>> its caller will always call free_bitmap_index() in case of an error.
>>
>> Signed-off-by: Taylor Blau <me@ttaylorr.com>
>> ---
>
> As this is Lidong relaying <aCOFqYdnPp1Lne4Y@nand.local> that Taylor
> sent to the list, shouldn't Lidong's sign-off be after Taylor's?
>
|
||
|
||
if (load_reverse_index(r, bitmap_git)) | ||
goto failed; | ||
return -1; | ||
|
||
if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) || | ||
!(bitmap_git->trees = read_bitmap_1(bitmap_git)) || | ||
!(bitmap_git->blobs = read_bitmap_1(bitmap_git)) || | ||
!(bitmap_git->tags = read_bitmap_1(bitmap_git))) | ||
goto failed; | ||
return -1; | ||
|
||
if (!bitmap_git->table_lookup && load_bitmap_entries_v1(bitmap_git) < 0) | ||
goto failed; | ||
return -1; | ||
|
||
if (bitmap_git->base) { | ||
if (!bitmap_is_midx(bitmap_git)) | ||
BUG("non-MIDX bitmap has non-NULL base bitmap index"); | ||
if (load_bitmap(r, bitmap_git->base, 1) < 0) | ||
goto failed; | ||
return -1; | ||
} | ||
|
||
if (!recursing) | ||
load_all_type_bitmaps(bitmap_git); | ||
|
||
return 0; | ||
|
||
failed: | ||
munmap(bitmap_git->map, bitmap_git->map_size); | ||
bitmap_git->map = NULL; | ||
bitmap_git->map_size = 0; | ||
|
||
kh_destroy_oid_map(bitmap_git->bitmaps); | ||
bitmap_git->bitmaps = NULL; | ||
|
||
kh_destroy_oid_pos(bitmap_git->ext_index.positions); | ||
bitmap_git->ext_index.positions = NULL; | ||
|
||
return -1; | ||
} | ||
|
||
static int open_pack_bitmap(struct repository *r, | ||
|
@@ -882,6 +874,7 @@ static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_ | |
int xor_flags; | ||
khiter_t hash_pos; | ||
struct bitmap_lookup_table_xor_item *xor_item; | ||
size_t entry_map_pos; | ||
|
||
if (is_corrupt) | ||
return NULL; | ||
|
@@ -941,14 +934,16 @@ static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_ | |
goto corrupt; | ||
} | ||
|
||
entry_map_pos = bitmap_git->map_pos; | ||
bitmap_git->map_pos += sizeof(uint32_t) + sizeof(uint8_t); | ||
xor_flags = read_u8(bitmap_git->map, &bitmap_git->map_pos); | ||
bitmap = read_bitmap_1(bitmap_git); | ||
|
||
if (!bitmap) | ||
goto corrupt; | ||
|
||
xor_bitmap = store_bitmap(bitmap_git, bitmap, &xor_item->oid, xor_bitmap, xor_flags); | ||
xor_bitmap = store_bitmap(bitmap_git, bitmap, &xor_item->oid, | ||
xor_bitmap, xor_flags, entry_map_pos); | ||
xor_items_nr--; | ||
} | ||
|
||
|
@@ -982,14 +977,16 @@ static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_ | |
* Instead, we can skip ahead and immediately read the flags and | ||
* ewah bitmap. | ||
*/ | ||
entry_map_pos = bitmap_git->map_pos; | ||
bitmap_git->map_pos += sizeof(uint32_t) + sizeof(uint8_t); | ||
flags = read_u8(bitmap_git->map, &bitmap_git->map_pos); | ||
bitmap = read_bitmap_1(bitmap_git); | ||
|
||
if (!bitmap) | ||
goto corrupt; | ||
|
||
return store_bitmap(bitmap_git, bitmap, oid, xor_bitmap, flags); | ||
return store_bitmap(bitmap_git, bitmap, oid, xor_bitmap, flags, | ||
entry_map_pos); | ||
|
||
corrupt: | ||
free(xor_items); | ||
|
@@ -2852,8 +2849,9 @@ int test_bitmap_commits(struct repository *r) | |
die(_("failed to load bitmap indexes")); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Taylor Blau wrote (reply to this): On Tue, Jun 03, 2025 at 03:14:03AM +0000, Lidong Yan via GitGitGadget wrote:
> From: Lidong Yan <502024330056@smail.nju.edu.cn>
>
> In pack-bitmap.c:test_bitmap_commits(), it comments
>
> /*
> * As this function is only used to print bitmap selected
> * commits, we don't have to read the commit table.
> */
>
There is no need to include the original comment here, since it is clear
from the patch below what you're referring to.
I don't think this alone is worth rerolling the series, but others may
feel differently.
> This suggests that we can avoid reading the commit table altogether.
> However, this comment is misleading. The reason we load bitmap entries here
> is because test_bitmap_commits() needs to print the commit IDs from the
> bitmap, and we must read the bitmap entries to obtain those commit IDs.
> So reword this comment.
>
> Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
> ---
> pack-bitmap.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index fd19c2255163..e514c9da239b 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -2839,8 +2839,9 @@ int test_bitmap_commits(struct repository *r)
> die(_("failed to load bitmap indexes"));
>
> /*
> - * As this function is only used to print bitmap selected
> - * commits, we don't have to read the commit table.
> + * Since this function needs to print bitmap selected
The phrase "bitmap selected commits" is a little awkward. I might have
written either "the bitmapped commits", or "the set of commits which
have bitmaps".
> + * commits, bypass the commit lookup table (if one exists)
> + * by forcing the bitmap to eagerly load its entries.
> */
> if (bitmap_git->table_lookup) {
> if (load_bitmap_entries_v1(bitmap_git) < 0)
> --
> gitgitgadget
>
Thanks,
Taylor |
||
|
||
/* | ||
* As this function is only used to print bitmap selected | ||
* commits, we don't have to read the commit table. | ||
* Since this function needs to print the bitmapped | ||
* commits, bypass the commit lookup table (if one exists) | ||
* by forcing the bitmap to eagerly load its entries. | ||
*/ | ||
if (bitmap_git->table_lookup) { | ||
if (load_bitmap_entries_v1(bitmap_git) < 0) | ||
|
@@ -2869,6 +2867,48 @@ int test_bitmap_commits(struct repository *r) | |
return 0; | ||
} | ||
|
||
int test_bitmap_commits_with_offset(struct repository *r) | ||
{ | ||
struct object_id oid; | ||
struct stored_bitmap *stored; | ||
struct bitmap_index *bitmap_git; | ||
size_t commit_idx_pos_map_pos, xor_offset_map_pos, flag_map_pos, | ||
ewah_bitmap_map_pos; | ||
|
||
bitmap_git = prepare_bitmap_git(r); | ||
if (!bitmap_git) | ||
die(_("failed to load bitmap indexes")); | ||
|
||
/* | ||
* Since this function needs to know the position of each individual | ||
* bitmap, bypass the commit lookup table (if one exists) by forcing | ||
* the bitmap to eagerly load its entries. | ||
*/ | ||
if (bitmap_git->table_lookup) { | ||
if (load_bitmap_entries_v1(bitmap_git) < 0) | ||
die(_("failed to load bitmap indexes")); | ||
} | ||
|
||
kh_foreach (bitmap_git->bitmaps, oid, stored, { | ||
commit_idx_pos_map_pos = stored->map_pos; | ||
xor_offset_map_pos = stored->map_pos + sizeof(uint32_t); | ||
flag_map_pos = xor_offset_map_pos + sizeof(uint8_t); | ||
ewah_bitmap_map_pos = flag_map_pos + sizeof(uint8_t); | ||
|
||
printf_ln("%s %"PRIuMAX" %"PRIuMAX" %"PRIuMAX" %"PRIuMAX, | ||
oid_to_hex(&oid), | ||
(uintmax_t)commit_idx_pos_map_pos, | ||
(uintmax_t)xor_offset_map_pos, | ||
(uintmax_t)flag_map_pos, | ||
(uintmax_t)ewah_bitmap_map_pos); | ||
}) | ||
; | ||
|
||
free_bitmap_index(bitmap_git); | ||
|
||
return 0; | ||
} | ||
|
||
int test_bitmap_hashes(struct repository *r) | ||
{ | ||
struct bitmap_index *bitmap_git = prepare_bitmap_git(r); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Taylor Blau wrote (reply to this):
On Tue, May 20, 2025 at 09:23:09AM +0000, Taylor Blau via GitGitGadget wrote: > Signed-off-by: Taylor Blau <me@ttaylorr.com> This commit forges my Signed-off-by, but I am happy with the result here. I do think the series is structured a little awkwardly as a result of adding this patch to it. That this and the previous patch have the subject "pack-bitmap: fix memory leak if `load_bitmap_entries_v1` failed" make the series not quite as clear as it could be. I think there are a couple of things going on: - This patch is a bug fix that could be applied independently of the first one. The rationale there would be that we shouldn't be leaking the EWAH bitmaps in 'b->bitmaps', but we are as a result of NULL'ing the pointer in the "failed" label. That patch can stand alone. - The first patch (yours) is no longer fixing a leak, at least after this patch. But it does delay reading the bitmap until we have validated its XOR offset for sanity, which is a good thing mostly from a performance perspective. I would probably swap the two patches around so that yours applies on top of mine, and then rewords the patch message in yours to reflect that it is no longer fixing a leak. That all said, if you feel strongly that the structure is fine/better as-is, I'd be more than happy to discuss it further. Thanks, Taylor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, lidongyan wrote (reply to this):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):