+
Skip to content

pack-bitmap: fix memory leak if load_bitmap failed #1962

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
88 changes: 64 additions & 24 deletions pack-bitmap.c
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ struct stored_bitmap {
struct object_id oid;
struct ewah_bitmap *root;
struct stored_bitmap *xor;
size_t map_pos;
int flags;
};

Expand Down Expand Up @@ -314,13 +315,14 @@ static struct stored_bitmap *store_bitmap(struct bitmap_index *index,
struct ewah_bitmap *root,
const struct object_id *oid,
struct stored_bitmap *xor_with,
int flags)
int flags, size_t map_pos)
{
struct stored_bitmap *stored;
khiter_t hash_pos;
int ret;

stored = xmalloc(sizeof(struct stored_bitmap));
stored->map_pos = map_pos;
stored->root = root;
stored->xor = xor_with;
stored->flags = flags;
Expand Down Expand Up @@ -376,10 +378,12 @@ static int load_bitmap_entries_v1(struct bitmap_index *index)
struct stored_bitmap *xor_bitmap = NULL;
uint32_t commit_idx_pos;
struct object_id oid;
size_t entry_map_pos;

if (index->map_size - index->map_pos < 6)
return error(_("corrupt ewah bitmap: truncated header for entry %d"), i);

entry_map_pos = index->map_pos;
commit_idx_pos = read_be32(index->map, &index->map_pos);
xor_offset = read_u8(index->map, &index->map_pos);
flags = read_u8(index->map, &index->map_pos);
Expand All @@ -402,8 +406,9 @@ static int load_bitmap_entries_v1(struct bitmap_index *index)
if (!bitmap)
return -1;

recent_bitmaps[i % MAX_XOR_OFFSET] = store_bitmap(
index, bitmap, &oid, xor_bitmap, flags);
recent_bitmaps[i % MAX_XOR_OFFSET] =
store_bitmap(index, bitmap, &oid, xor_bitmap, flags,
entry_map_pos);
}

return 0;
Expand Down Expand Up @@ -630,41 +635,28 @@ static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git,
bitmap_git->ext_index.positions = kh_init_oid_pos();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Taylor Blau wrote (reply to this):

On Tue, May 20, 2025 at 09:23:09AM +0000, Taylor Blau via GitGitGadget wrote:
> Signed-off-by: Taylor Blau <me@ttaylorr.com>

This commit forges my Signed-off-by, but I am happy with the result
here.

I do think the series is structured a little awkwardly as a result of
adding this patch to it. That this and the previous patch have the
subject "pack-bitmap: fix memory leak if `load_bitmap_entries_v1`
failed" make the series not quite as clear as it could be.

I think there are a couple of things going on:

  - This patch is a bug fix that could be applied independently of the
    first one. The rationale there would be that we shouldn't be leaking
    the EWAH bitmaps in 'b->bitmaps', but we are as a result of NULL'ing
    the pointer in the "failed" label. That patch can stand alone.

  - The first patch (yours) is no longer fixing a leak, at least after
    this patch. But it does delay reading the bitmap until we have
    validated its XOR offset for sanity, which is a good thing mostly
    from a performance perspective.

I would probably swap the two patches around so that yours applies on
top of mine, and then rewords the patch message in yours to reflect that
it is no longer fixing a leak.

That all said, if you feel strongly that the structure is fine/better
as-is, I'd be more than happy to discuss it further.

Thanks,
Taylor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, lidongyan wrote (reply to this):

2025年5月22日 07:54,Taylor Blau <me@ttaylorr.com> 写道:
> 
> On Tue, May 20, 2025 at 09:23:09AM +0000, Taylor Blau via GitGitGadget wrote:
>> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> 
> This commit forges my Signed-off-by, but I am happy with the result
> here.
> 
> I do think the series is structured a little awkwardly as a result of
> adding this patch to it. That this and the previous patch have the
> subject "pack-bitmap: fix memory leak if `load_bitmap_entries_v1`
> failed" make the series not quite as clear as it could be.
> 

Agreed. I’ve definitely learned a lot about how to write commit messages
 and cover letters through this process

> I think there are a couple of things going on:
> 
>  - This patch is a bug fix that could be applied independently of the
>    first one. The rationale there would be that we shouldn't be leaking
>    the EWAH bitmaps in 'b->bitmaps', but we are as a result of NULL'ing
>    the pointer in the "failed" label. That patch can stand alone.
> 
>  - The first patch (yours) is no longer fixing a leak, at least after
>    this patch. But it does delay reading the bitmap until we have
>    validated its XOR offset for sanity, which is a good thing mostly
>    from a performance perspective.
> 
> I would probably swap the two patches around so that yours applies on
> top of mine, and then rewords the patch message in yours to reflect that
> it is no longer fixing a leak.
> 
> That all said, if you feel strongly that the structure is fine/better
> as-is, I'd be more than happy to discuss it further.
> 
> Thanks,
> Taylor
> 

I think I can do this in third version, and I have to submit patch v3 after
we decide if patch v2 3/3 in this series should live or not. 

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Taylor Blau <me@ttaylorr.com> writes:

> On Tue, May 20, 2025 at 09:23:09AM +0000, Taylor Blau via GitGitGadget wrote:
>> Signed-off-by: Taylor Blau <me@ttaylorr.com>
>
> This commit forges my Signed-off-by, but I am happy with the result
> here.
>
> I do think the series is structured a little awkwardly as a result of
> adding this patch to it. That this and the previous patch have the
> subject "pack-bitmap: fix memory leak if `load_bitmap_entries_v1`
> failed" make the series not quite as clear as it could be.
>
> I think there are a couple of things going on:
>
>   - This patch is a bug fix that could be applied independently of the
>     first one. The rationale there would be that we shouldn't be leaking
>     the EWAH bitmaps in 'b->bitmaps', but we are as a result of NULL'ing
>     the pointer in the "failed" label. That patch can stand alone.
>
>   - The first patch (yours) is no longer fixing a leak, at least after
>     this patch. But it does delay reading the bitmap until we have
>     validated its XOR offset for sanity, which is a good thing mostly
>     from a performance perspective.
>
> I would probably swap the two patches around so that yours applies on
> top of mine, and then rewords the patch message in yours to reflect that
> it is no longer fixing a leak.

Sounds like a plausible structure.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Taylor Blau via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Taylor Blau <me@ttaylorr.com>
>
> After going through the "failed" label, load_bitmap() will return -1,
> and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
> will then call free_bitmap_index().
> ...
> The solution is to remove the error handling code in load_bitmap(), because
> its caller will always call free_bitmap_index() in case of an error.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---

As this is Lidong relaying <aCOFqYdnPp1Lne4Y@nand.local> that Taylor
sent to the list, shouldn't Lidong's sign-off be after Taylor's?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Taylor Blau wrote (reply to this):

On Thu, May 29, 2025 at 08:33:29AM -0700, Junio C Hamano wrote:
> "Taylor Blau via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Taylor Blau <me@ttaylorr.com>
> >
> > After going through the "failed" label, load_bitmap() will return -1,
> > and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
> > will then call free_bitmap_index().
> > ...
> > The solution is to remove the error handling code in load_bitmap(), because
> > its caller will always call free_bitmap_index() in case of an error.
> >
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
> > ---
>
> As this is Lidong relaying <aCOFqYdnPp1Lne4Y@nand.local> that Taylor
> sent to the list, shouldn't Lidong's sign-off be after Taylor's?

I've always assumed the answer here was "yes", but I don't know that our
documentation suggests the same.

In c11c3b5681 (Documentation/SubmittingPatches: What's Acked-by and
Tested-by?, 2008-02-03) you added:

    Notice that you can place your own Signed-off-by: line when
    forwarding somebody else's patch [...]. Indeed you are encouraged
    to do so.  [...]

and that text survives into the current version of SubmittingPatches.
So I think that while our documentation encourages people to add their
own S-o-b to others' patches sent on their behalf, it doesn't
explicitly require it.

Thanks,
Taylor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Taylor Blau <me@ttaylorr.com> writes:

> In c11c3b5681 (Documentation/SubmittingPatches: What's Acked-by and
> Tested-by?, 2008-02-03) you added:
>
>     Notice that you can place your own Signed-off-by: line when
>     forwarding somebody else's patch [...]. Indeed you are encouraged
>     to do so.  [...]
>
> and that text survives into the current version of SubmittingPatches.
> So I think that while our documentation encourages people to add their
> own S-o-b to others' patches sent on their behalf, it doesn't
> explicitly require it.

It would not hurt that much if I pretended that I ignored what
Lidong forwarded and picked up your patch directly from the list,
only because what was forwarded was public.  If it were a privately
shared patch, sign-off is required, so we'd need to tighten the
language in the SubmittingPatches document, I think.

Thanks.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, lidongyan wrote (reply to this):

Got it, I will add my sign-off.

> 2025年5月29日 23:33,Junio C Hamano <gitster@pobox.com> 写道:
> 
> "Taylor Blau via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Taylor Blau <me@ttaylorr.com>
>> 
>> After going through the "failed" label, load_bitmap() will return -1,
>> and its caller (either prepare_bitmap_walk() or prepare_bitmap_git())
>> will then call free_bitmap_index().
>> ...
>> The solution is to remove the error handling code in load_bitmap(), because
>> its caller will always call free_bitmap_index() in case of an error.
>> 
>> Signed-off-by: Taylor Blau <me@ttaylorr.com>
>> ---
> 
> As this is Lidong relaying <aCOFqYdnPp1Lne4Y@nand.local> that Taylor
> sent to the list, shouldn't Lidong's sign-off be after Taylor's?
> 


if (load_reverse_index(r, bitmap_git))
goto failed;
return -1;

if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) ||
!(bitmap_git->trees = read_bitmap_1(bitmap_git)) ||
!(bitmap_git->blobs = read_bitmap_1(bitmap_git)) ||
!(bitmap_git->tags = read_bitmap_1(bitmap_git)))
goto failed;
return -1;

if (!bitmap_git->table_lookup && load_bitmap_entries_v1(bitmap_git) < 0)
goto failed;
return -1;

if (bitmap_git->base) {
if (!bitmap_is_midx(bitmap_git))
BUG("non-MIDX bitmap has non-NULL base bitmap index");
if (load_bitmap(r, bitmap_git->base, 1) < 0)
goto failed;
return -1;
}

if (!recursing)
load_all_type_bitmaps(bitmap_git);

return 0;

failed:
munmap(bitmap_git->map, bitmap_git->map_size);
bitmap_git->map = NULL;
bitmap_git->map_size = 0;

kh_destroy_oid_map(bitmap_git->bitmaps);
bitmap_git->bitmaps = NULL;

kh_destroy_oid_pos(bitmap_git->ext_index.positions);
bitmap_git->ext_index.positions = NULL;

return -1;
}

static int open_pack_bitmap(struct repository *r,
Expand Down Expand Up @@ -882,6 +874,7 @@ static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_
int xor_flags;
khiter_t hash_pos;
struct bitmap_lookup_table_xor_item *xor_item;
size_t entry_map_pos;

if (is_corrupt)
return NULL;
Expand Down Expand Up @@ -941,14 +934,16 @@ static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_
goto corrupt;
}

entry_map_pos = bitmap_git->map_pos;
bitmap_git->map_pos += sizeof(uint32_t) + sizeof(uint8_t);
xor_flags = read_u8(bitmap_git->map, &bitmap_git->map_pos);
bitmap = read_bitmap_1(bitmap_git);

if (!bitmap)
goto corrupt;

xor_bitmap = store_bitmap(bitmap_git, bitmap, &xor_item->oid, xor_bitmap, xor_flags);
xor_bitmap = store_bitmap(bitmap_git, bitmap, &xor_item->oid,
xor_bitmap, xor_flags, entry_map_pos);
xor_items_nr--;
}

Expand Down Expand Up @@ -982,14 +977,16 @@ static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_
* Instead, we can skip ahead and immediately read the flags and
* ewah bitmap.
*/
entry_map_pos = bitmap_git->map_pos;
bitmap_git->map_pos += sizeof(uint32_t) + sizeof(uint8_t);
flags = read_u8(bitmap_git->map, &bitmap_git->map_pos);
bitmap = read_bitmap_1(bitmap_git);

if (!bitmap)
goto corrupt;

return store_bitmap(bitmap_git, bitmap, oid, xor_bitmap, flags);
return store_bitmap(bitmap_git, bitmap, oid, xor_bitmap, flags,
entry_map_pos);

corrupt:
free(xor_items);
Expand Down Expand Up @@ -2852,8 +2849,9 @@ int test_bitmap_commits(struct repository *r)
die(_("failed to load bitmap indexes"));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Taylor Blau wrote (reply to this):

On Tue, Jun 03, 2025 at 03:14:03AM +0000, Lidong Yan via GitGitGadget wrote:
> From: Lidong Yan <502024330056@smail.nju.edu.cn>
>
> In pack-bitmap.c:test_bitmap_commits(), it comments
>
>     /*
>      * As this function is only used to print bitmap selected
>      * commits, we don't have to read the commit table.
>      */
>

There is no need to include the original comment here, since it is clear
from the patch below what you're referring to.

I don't think this alone is worth rerolling the series, but others may
feel differently.

> This suggests that we can avoid reading the commit table altogether.
> However, this comment is misleading. The reason we load bitmap entries here
> is because test_bitmap_commits() needs to print the commit IDs from the
> bitmap, and we must read the bitmap entries to obtain those commit IDs.
> So reword this comment.
>
> Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
> ---
>  pack-bitmap.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index fd19c2255163..e514c9da239b 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -2839,8 +2839,9 @@ int test_bitmap_commits(struct repository *r)
>  		die(_("failed to load bitmap indexes"));
>
>  	/*
> -	 * As this function is only used to print bitmap selected
> -	 * commits, we don't have to read the commit table.
> +	 * Since this function needs to print bitmap selected

The phrase "bitmap selected commits" is a little awkward. I might have
written either "the bitmapped commits", or "the set of commits which
have bitmaps".

> +	 * commits, bypass the commit lookup table (if one exists)
> +	 * by forcing the bitmap to eagerly load its entries.
>  	 */
>  	if (bitmap_git->table_lookup) {
>  		if (load_bitmap_entries_v1(bitmap_git) < 0)
> --
> gitgitgadget
>
Thanks,
Taylor


/*
* As this function is only used to print bitmap selected
* commits, we don't have to read the commit table.
* Since this function needs to print the bitmapped
* commits, bypass the commit lookup table (if one exists)
* by forcing the bitmap to eagerly load its entries.
*/
if (bitmap_git->table_lookup) {
if (load_bitmap_entries_v1(bitmap_git) < 0)
Expand All @@ -2869,6 +2867,48 @@ int test_bitmap_commits(struct repository *r)
return 0;
}

int test_bitmap_commits_with_offset(struct repository *r)
{
struct object_id oid;
struct stored_bitmap *stored;
struct bitmap_index *bitmap_git;
size_t commit_idx_pos_map_pos, xor_offset_map_pos, flag_map_pos,
ewah_bitmap_map_pos;

bitmap_git = prepare_bitmap_git(r);
if (!bitmap_git)
die(_("failed to load bitmap indexes"));

/*
* Since this function needs to know the position of each individual
* bitmap, bypass the commit lookup table (if one exists) by forcing
* the bitmap to eagerly load its entries.
*/
if (bitmap_git->table_lookup) {
if (load_bitmap_entries_v1(bitmap_git) < 0)
die(_("failed to load bitmap indexes"));
}

kh_foreach (bitmap_git->bitmaps, oid, stored, {
commit_idx_pos_map_pos = stored->map_pos;
xor_offset_map_pos = stored->map_pos + sizeof(uint32_t);
flag_map_pos = xor_offset_map_pos + sizeof(uint8_t);
ewah_bitmap_map_pos = flag_map_pos + sizeof(uint8_t);

printf_ln("%s %"PRIuMAX" %"PRIuMAX" %"PRIuMAX" %"PRIuMAX,
oid_to_hex(&oid),
(uintmax_t)commit_idx_pos_map_pos,
(uintmax_t)xor_offset_map_pos,
(uintmax_t)flag_map_pos,
(uintmax_t)ewah_bitmap_map_pos);
})
;

free_bitmap_index(bitmap_git);

return 0;
}

int test_bitmap_hashes(struct repository *r)
{
struct bitmap_index *bitmap_git = prepare_bitmap_git(r);
Expand Down
1 change: 1 addition & 0 deletions pack-bitmap.h
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,7 @@ void traverse_bitmap_commit_list(struct bitmap_index *,
show_reachable_fn show_reachable);
void test_bitmap_walk(struct rev_info *revs);
int test_bitmap_commits(struct repository *r);
int test_bitmap_commits_with_offset(struct repository *r);
int test_bitmap_hashes(struct repository *r);
int test_bitmap_pseudo_merges(struct repository *r);
int test_bitmap_pseudo_merge_commits(struct repository *r, uint32_t n);
Expand Down
8 changes: 8 additions & 0 deletions t/helper/test-bitmap.c
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,11 @@ static int bitmap_list_commits(void)
return test_bitmap_commits(the_repository);
}

static int bitmap_list_commits_with_offset(void)
{
return test_bitmap_commits_with_offset(the_repository);
}

static int bitmap_dump_hashes(void)
{
return test_bitmap_hashes(the_repository);
Expand All @@ -36,6 +41,8 @@ int cmd__bitmap(int argc, const char **argv)

if (argc == 2 && !strcmp(argv[1], "list-commits"))
return bitmap_list_commits();
if (argc == 2 && !strcmp(argv[1], "list-commits-with-offset"))
return bitmap_list_commits_with_offset();
if (argc == 2 && !strcmp(argv[1], "dump-hashes"))
return bitmap_dump_hashes();
if (argc == 2 && !strcmp(argv[1], "dump-pseudo-merges"))
Expand All @@ -46,6 +53,7 @@ int cmd__bitmap(int argc, const char **argv)
return bitmap_dump_pseudo_merge_objects(atoi(argv[2]));

usage("\ttest-tool bitmap list-commits\n"
"\ttest-tool bitmap list-commits-with-offset\n"
"\ttest-tool bitmap dump-hashes\n"
"\ttest-tool bitmap dump-pseudo-merges\n"
"\ttest-tool bitmap dump-pseudo-merge-commits <n>\n"
Expand Down
30 changes: 30 additions & 0 deletions t/t5310-pack-bitmaps.sh
Original file line number Diff line number Diff line change
Expand Up @@ -495,6 +495,36 @@ test_bitmap_cases () {
grep "ignoring extra bitmap" trace2.txt
)
'

test_expect_success 'load corrupt bitmap' '
rm -fr repo &&
git init repo &&
test_when_finished "rm -fr repo" &&
(
cd repo &&
git config pack.writeBitmapLookupTable '"$writeLookupTable"' &&

test_commit base &&

git repack -adb &&
bitmap="$(ls .git/objects/pack/pack-*.bitmap)" &&
chmod +w $bitmap &&

test-tool bitmap list-commits-with-offset >offsets &&
xor_off=$(head -n1 offsets | awk "{print \$3}") &&
printf '\161' |
dd of=$bitmap count=1 bs=1 conv=notrunc seek=$xor_off &&

git rev-list --objects --no-object-names HEAD >expect.raw &&
git rev-list --objects --use-bitmap-index --no-object-names HEAD \
>actual.raw &&

sort expect.raw >expect &&
sort actual.raw >actual &&

test_cmp expect actual
)
'
}

test_bitmap_cases
Expand Down
Loading
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载