-
Notifications
You must be signed in to change notification settings - Fork 26.4k
repository: prevent memory leak when releasing ref stores #1758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: next
Are you sure you want to change the base?
Conversation
Typofix. * ks/unit-test-comment-typofix: unit-tests/test-lib: fix typo in check_pointer_eq() description
A 'P' command to "git add -p" that passes the patch hunk to the pager has been added. * rj/add-p-pager: add-patch: render hunks through the pager pager: introduce wait_for_pager pager: do not close fd 2 unnecessarily add-patch: test for 'p' command
Test script linter has been updated to catch an attempt to use one-shot export construct "VAR=VAL func" for shell functions (which does not work for some shells) better. * es/shell-check-updates: check-non-portable-shell: improve `VAR=val shell-func` detection check-non-portable-shell: suggest alternative for `VAR=val shell-func` check-non-portable-shell: loosen one-shot assignment error message t4034: fix use of one-shot variable assignment with shell function t3430: drop unnecessary one-shot "VAR=val shell-func" invocation
"git notes add -m '' --allow-empty" and friends that take prepared data to create notes should not invoke an editor, but it started doing so since Git 2.42, which has been corrected. * dd/notes-empty-no-edit-by-default: notes: do not trigger editor when adding an empty note
"git grep -W" omits blank lines that follow the found function at the end of the file, just like it omits blank lines before the next function. * rs/grep-omit-blank-lines-after-function-at-eof: grep: -W: skip trailing empty lines at EOF, too
Some project conventions have been added to CodingGuidelines. * ps/doc-more-c-coding-guidelines: Documentation: consistently use spaces inside initializers Documentation: document idiomatic function names Documentation: document naming schema for structs and their functions Documentation: clarify indentation style for C preprocessor directives clang-format: fix indentation width for preprocessor directives
An expensive operation to prepare tracing was done in re-encoding code path even when the tracing was not requested, which has been corrected. * dh/encoding-trace-optim: convert: return early when not tracing
Perforce tests have been updated. * ps/p4-tests-updates: t98xx: mark Perforce tests as memory-leak free ci: update Perforce version to r23.2 t98xx: fix Perforce tests with p4d r23 and newer
`ref_store_release` does not free the ref_store allocated in `ref_store_init`. Signed-off-by: Sven Strickroth <email@cs-ware.de>
/submit |
Submitted as pull.1758.git.git.1722855364436.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
After another check this does not seem to be necessary. The ref stores are already free'd in |
On the Git mailing list, Sven Strickroth wrote (reply to this): Am 05.08.2024 um 12:56 schrieb Sven Strickroth via GitGitGadget:
> - strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e)
> + strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e) {
> ref_store_release(e->value);
> + free(e->value);
> + }
> strmap_clear(&repo->submodule_ref_stores, 1);
After further checking this does not seem to be necessary. The ref stores are already free'd in strmap_clear.
--
Best regards,
Sven Strickroth
PGP key id F5A9D4C4 @ any key-server |
On the Git mailing list, Junio C Hamano wrote (reply to this): "Sven Strickroth via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Sven Strickroth <email@cs-ware.de>
>
> `ref_store_release` does not free the ref_store allocated in
> `ref_store_init`.
>
> Signed-off-by: Sven Strickroth <email@cs-ware.de>
> ---
This may certainly plug the two leaking callers, but stepping back a
bit and looking at other existing calls to ref_store_release(), I
wonder if many existing and more importantly future callers benefit
if ref_store_release() did the freeing of the surrounding shell, as
we can see in these existing calls:
refs.c:2851: ref_store_release(new_refs);
refs.c-2852- FREE_AND_NULL(new_refs);
refs.c:2890: ref_store_release(old_refs);
refs.c-2891- FREE_AND_NULL(old_refs);
refs.c:2904: ref_store_release(new_refs);
refs.c-2905- free(new_refs);
If we change the type of ref_store_release() to take a pointer to a
pointer to ref_store, so that the above callers can just become
ref_store_release(&new_refs);
to release the resources and new_refs variable cleared, the
callsites in this patch can do the same.
However, I am fuzzy on the existing uses in the backend
implementation. For example:
static void files_ref_store_release(struct ref_store *ref_store)
{
struct files_ref_store *refs = files_downcast(ref_store, 0, "release");
free_ref_cache(refs->loose);
free(refs->gitcommondir);
ref_store_release(refs->packed_ref_store);
}
The packed-ref-store is "released" here, as part of "releasing" the
files-ref-store that uses it as a fallback backend. The caller of
files_ref_store_release() is refs.c:ref_store_release()
void ref_store_release(struct ref_store *ref_store)
{
ref_store->be->release(ref_store);
free(ref_store->gitdir);
}
So if you have a files based ref store, when you are done you'd be
calling ref_store_release() on it, releasing the resources held by
the files_ref_store structure, but I do not know who frees the
packed_ref_store allocated by files_ref_store_init(). Perhaps it is
already leaking? If that is the case then an API update like I
suggested above would make even more sense to make it less likely
for such a leak to be added to the system in the future, I suspect.
I dunno.
Thanks.
> repository: prevent memory leak when releasing ref stores
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1758%2Fcsware%2Frepository-memory-leak-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1758/csware/repository-memory-leak-v1
> Pull-Request: https://github.com/git/git/pull/1758
>
> repository.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/repository.c b/repository.c
> index 9825a308993..46f1eadfe95 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -366,12 +366,16 @@ void repo_clear(struct repository *repo)
> FREE_AND_NULL(repo->remote_state);
> }
>
> - strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e)
> + strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e) {
> ref_store_release(e->value);
> + free(e->value);
> + }
> strmap_clear(&repo->submodule_ref_stores, 1);
>
> - strmap_for_each_entry(&repo->worktree_ref_stores, &iter, e)
> + strmap_for_each_entry(&repo->worktree_ref_stores, &iter, e) {
> ref_store_release(e->value);
> + free(e->value);
> + }
> strmap_clear(&repo->worktree_ref_stores, 1);
>
> repo_clear_path_cache(&repo->cached_paths);
>
> base-commit: e559c4bf1a306cf5814418d318cc0fea070da3c7 |
On the Git mailing list, Junio C Hamano wrote (reply to this): Junio C Hamano <gitster@pobox.com> writes:
> However, I am fuzzy on the existing uses in the backend
> implementation. For example:
>
> static void files_ref_store_release(struct ref_store *ref_store)
> {
> struct files_ref_store *refs = files_downcast(ref_store, 0, "release");
> free_ref_cache(refs->loose);
> free(refs->gitcommondir);
> ref_store_release(refs->packed_ref_store);
> }
>
> The packed-ref-store is "released" here, as part of "releasing" the
> files-ref-store that uses it as a fallback backend. The caller of
> files_ref_store_release() is refs.c:ref_store_release()
>
> void ref_store_release(struct ref_store *ref_store)
> {
> ref_store->be->release(ref_store);
> free(ref_store->gitdir);
> }
>
> So if you have a files based ref store, when you are done you'd be
> calling ref_store_release() on it, releasing the resources held by
> the files_ref_store structure, but I do not know who frees the
> packed_ref_store allocated by files_ref_store_init(). Perhaps it is
> already leaking? If that is the case then an API update like I
> suggested above would make even more sense to make it less likely
> for such a leak to be added to the system in the future, I suspect.
Ahh, that was the leak that you plugged in a separate patch.
So it does point us in the other direction to redefine _release with
a different behaviour that releases the resource held by the
structure, and frees the structure itself.
Something along the following line (caution: totally untested) that
allows your two patches to become empty, and also allows a few
callers to lose their existing explicit free()s immediately after
they call _release(), perhaps?
If this were to become a real patch, I think debug backend should
learn to use the same _downcast() to become more like the real ones
before it happens in a preliminary clean-up patch.
refs.h | 5 +++--
refs.c | 19 ++++++++-----------
refs/refs-internal.h | 2 +-
refs/files-backend.c | 6 +++---
refs/packed-backend.c | 5 +++--
refs/reftable-backend.c | 6 +++---
refs/debug.c | 6 +++---
7 files changed, 24 insertions(+), 25 deletions(-)
diff --git c/refs.h w/refs.h
index b3e39bc257..e4f092f6ac 100644
--- c/refs.h
+++ w/refs.h
@@ -119,9 +119,10 @@ int is_branch(const char *refname);
int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *err);
/*
- * Release all memory and resources associated with the ref store.
+ * Release all memory and resources associated with the ref store, including
+ * the ref_store itself.
*/
-void ref_store_release(struct ref_store *ref_store);
+void ref_store_release(struct ref_store **ref_store);
/*
* Remove the ref store from disk. This deletes all associated data.
diff --git c/refs.c w/refs.c
index 915aeb4d1d..cb76a5d4bd 100644
--- c/refs.c
+++ w/refs.c
@@ -1936,10 +1936,11 @@ static struct ref_store *ref_store_init(struct repository *repo,
return refs;
}
-void ref_store_release(struct ref_store *ref_store)
+void ref_store_release(struct ref_store **ref_store)
{
- ref_store->be->release(ref_store);
- free(ref_store->gitdir);
+ (*ref_store)->be->release(ref_store);
+ free((*ref_store)->gitdir);
+ FREE_AND_NULL(*ref_store);
}
struct ref_store *get_main_ref_store(struct repository *r)
@@ -2848,8 +2849,7 @@ int repo_migrate_ref_storage_format(struct repository *repo,
* be closed. This is required for platforms like Cygwin, where
* renaming an open file results in EPERM.
*/
- ref_store_release(new_refs);
- FREE_AND_NULL(new_refs);
+ ref_store_release(&new_refs);
/*
* Until now we were in the non-destructive phase, where we only
@@ -2887,8 +2887,7 @@ int repo_migrate_ref_storage_format(struct repository *repo,
* make sure to lazily re-initialize the repository's ref store with
* the new format.
*/
- ref_store_release(old_refs);
- FREE_AND_NULL(old_refs);
+ ref_store_release(&old_refs);
repo->refs_private = NULL;
ret = 0;
@@ -2900,10 +2899,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
new_gitdir.buf);
}
- if (new_refs) {
- ref_store_release(new_refs);
- free(new_refs);
- }
+ if (new_refs)
+ ref_store_release(&new_refs);
ref_transaction_free(transaction);
strbuf_release(&new_gitdir);
return ret;
diff --git c/refs/refs-internal.h w/refs/refs-internal.h
index fa975d69aa..2ba2372acb 100644
--- c/refs/refs-internal.h
+++ w/refs/refs-internal.h
@@ -511,7 +511,7 @@ typedef struct ref_store *ref_store_init_fn(struct repository *repo,
/*
* Release all memory and resources associated with the ref store.
*/
-typedef void ref_store_release_fn(struct ref_store *refs);
+typedef void ref_store_release_fn(struct ref_store **refs);
typedef int ref_store_create_on_disk_fn(struct ref_store *refs,
int flags,
diff --git c/refs/files-backend.c w/refs/files-backend.c
index aa52d9be7c..8ebb1681ac 100644
--- c/refs/files-backend.c
+++ w/refs/files-backend.c
@@ -151,12 +151,12 @@ static struct files_ref_store *files_downcast(struct ref_store *ref_store,
return refs;
}
-static void files_ref_store_release(struct ref_store *ref_store)
+static void files_ref_store_release(struct ref_store **ref_store)
{
- struct files_ref_store *refs = files_downcast(ref_store, 0, "release");
+ struct files_ref_store *refs = files_downcast(*ref_store, 0, "release");
free_ref_cache(refs->loose);
free(refs->gitcommondir);
- ref_store_release(refs->packed_ref_store);
+ ref_store_release(&refs->packed_ref_store);
}
static void files_reflog_path(struct files_ref_store *refs,
diff --git c/refs/packed-backend.c w/refs/packed-backend.c
index a0666407cd..8321a4cc17 100644
--- c/refs/packed-backend.c
+++ w/refs/packed-backend.c
@@ -260,13 +260,14 @@ static void clear_snapshot(struct packed_ref_store *refs)
}
}
-static void packed_ref_store_release(struct ref_store *ref_store)
+static void packed_ref_store_release(struct ref_store **ref_store)
{
- struct packed_ref_store *refs = packed_downcast(ref_store, 0, "release");
+ struct packed_ref_store *refs = packed_downcast(*ref_store, 0, "release");
clear_snapshot(refs);
rollback_lock_file(&refs->lock);
delete_tempfile(&refs->tempfile);
free(refs->path);
+ FREE_AND_NULL(*ref_store);
}
static NORETURN void die_unterminated_line(const char *path,
diff --git c/refs/reftable-backend.c w/refs/reftable-backend.c
index fbe74c239d..0af010acfb 100644
--- c/refs/reftable-backend.c
+++ w/refs/reftable-backend.c
@@ -337,9 +337,9 @@ static struct ref_store *reftable_be_init(struct repository *repo,
return &refs->base;
}
-static void reftable_be_release(struct ref_store *ref_store)
+static void reftable_be_release(struct ref_store **ref_store)
{
- struct reftable_ref_store *refs = reftable_be_downcast(ref_store, 0, "release");
+ struct reftable_ref_store *refs = reftable_be_downcast(*ref_store, 0, "release");
struct strmap_entry *entry;
struct hashmap_iter iter;
@@ -400,7 +400,7 @@ static int reftable_be_remove_on_disk(struct ref_store *ref_store,
* required so that the "tables.list" file is not open anymore, which
* would otherwise make it impossible to remove the file on Windows.
*/
- reftable_be_release(ref_store);
+ reftable_be_release(&ref_store);
strbuf_addf(&sb, "%s/reftable", refs->base.gitdir);
if (remove_dir_recursively(&sb, 0) < 0) {
diff --git c/refs/debug.c w/refs/debug.c
index 547d9245b9..be6230045c 100644
--- c/refs/debug.c
+++ w/refs/debug.c
@@ -33,10 +33,10 @@ struct ref_store *maybe_debug_wrap_ref_store(const char *gitdir, struct ref_stor
return (struct ref_store *)res;
}
-static void debug_release(struct ref_store *refs)
+static void debug_release(struct ref_store **refs)
{
- struct debug_ref_store *drefs = (struct debug_ref_store *)refs;
- drefs->refs->be->release(drefs->refs);
+ struct debug_ref_store *drefs = *(struct debug_ref_store **)refs;
+ drefs->refs->be->release(&drefs->refs);
trace_printf_key(&trace_refs, "release\n");
}
|
On the Git mailing list, Junio C Hamano wrote (reply to this): Sven Strickroth <email@cs-ware.de> writes:
> Am 05.08.2024 um 12:56 schrieb Sven Strickroth via GitGitGadget:
>> - strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e)
>> + strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e) {
>> ref_store_release(e->value);
>> + free(e->value);
>> + }
>> strmap_clear(&repo->submodule_ref_stores, 1);
>
> After further checking this does not seem to be necessary. The ref
> stores are already free'd in strmap_clear.
Is it "not necessary" or "actively harmful"? It sounds like the
latter? |
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Mon, Aug 05, 2024 at 10:24:10AM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > However, I am fuzzy on the existing uses in the backend
> > implementation. For example:
> >
> > static void files_ref_store_release(struct ref_store *ref_store)
> > {
> > struct files_ref_store *refs = files_downcast(ref_store, 0, "release");
> > free_ref_cache(refs->loose);
> > free(refs->gitcommondir);
> > ref_store_release(refs->packed_ref_store);
> > }
> >
> > The packed-ref-store is "released" here, as part of "releasing" the
> > files-ref-store that uses it as a fallback backend. The caller of
> > files_ref_store_release() is refs.c:ref_store_release()
> >
> > void ref_store_release(struct ref_store *ref_store)
> > {
> > ref_store->be->release(ref_store);
> > free(ref_store->gitdir);
> > }
> >
> > So if you have a files based ref store, when you are done you'd be
> > calling ref_store_release() on it, releasing the resources held by
> > the files_ref_store structure, but I do not know who frees the
> > packed_ref_store allocated by files_ref_store_init(). Perhaps it is
> > already leaking? If that is the case then an API update like I
> > suggested above would make even more sense to make it less likely
> > for such a leak to be added to the system in the future, I suspect.
>
> Ahh, that was the leak that you plugged in a separate patch.
>
> So it does point us in the other direction to redefine _release with
> a different behaviour that releases the resource held by the
> structure, and frees the structure itself.
>
> Something along the following line (caution: totally untested) that
> allows your two patches to become empty, and also allows a few
> callers to lose their existing explicit free()s immediately after
> they call _release(), perhaps?
I don't really know whether it's worth the churn, but if somebody wants
to pull through with this I'm game :) But: if we are going to do this,
we should rename the function to be called `ref_store_free()` instead of
`ref_store_release()` according to our recent coding style update :)
> If this were to become a real patch, I think debug backend should
> learn to use the same _downcast() to become more like the real ones
> before it happens in a preliminary clean-up patch.
That certainly wouldn't hurt, yeah.
Patrick |
On the Git mailing list, Junio C Hamano wrote (reply to this): Patrick Steinhardt <ps@pks.im> writes:
>> Something along the following line (caution: totally untested) that
>> allows your two patches to become empty, and also allows a few
>> callers to lose their existing explicit free()s immediately after
>> they call _release(), perhaps?
>
> I don't really know whether it's worth the churn, but if somebody wants
> to pull through with this I'm game :) But: if we are going to do this,
> we should rename the function to be called `ref_store_free()` instead of
> `ref_store_release()` according to our recent coding style update :)
Yes, we had "what is release, clear, and free?" discussion recently.
>> If this were to become a real patch, I think debug backend should
>> learn to use the same _downcast() to become more like the real ones
>> before it happens in a preliminary clean-up patch.
>
> That certainly wouldn't hurt, yeah.
I am not short of (other) things to do, and expect that I will not
touching this for a while, but in case somebody finds #leftoverbits
here, I'll leave a note here.
There are "hidden" freeing that we have to adjust, if we were to
follow through this approach. For example, those free()'s added in
the patch in the message that started the thread are introduing
double free---after the strmap_for_each_entry() loop, a
strmap_clear() callis done with free_values=1. If we freed inside
ref_store_release(), we'd need to adjust the call to strmap_clear()
to pass free_values=0 to compensate. |
CC: Patrick Steinhardt ps@pks.im