[pull] master from gitster:master #193

pull · 2025-07-16T02:47:18Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.3)

After going through the "failed" label, load_bitmap() will return -1, and its caller (either prepare_bitmap_walk() or prepare_bitmap_git()) will then call free_bitmap_index(). That function would have done:

    struct stored_bitmap *sb;
    kh_foreach_value(b->bitmaps, sb, {
      ewah_pool_free(sb->root);
      free(sb);
    });

but it won't, since load_bitmap() already called kh_destroy_oid_map() and NULL'd the "bitmaps" pointer from within its "failed" label.

Thus if you got part of the way through loading bitmap entries and then failed, you would leak all of the previous entries that you were able to load successfully.

The solution is to remove the error handling code in load_bitmap(), because its caller will always call free_bitmap_index() in case of an error.

Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
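The ownership rule described above can be illustrated with a minimal sketch. Everything here is a toy stand-in: `toy_index`, `toy_load()`, `toy_free_index()` and `toy_prepare()` are hypothetical names, not the real pack-bitmap.c types or functions. The point is only that the loader does no cleanup of its own on failure, so the caller's single teardown function can still reach every entry that was loaded before the failure.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins; these are NOT the real pack-bitmap.c structures. */
struct toy_entry { int unused; };

struct toy_index {
	struct toy_entry **entries; /* plays the role of the "bitmaps" map */
	size_t nr;
};

static int live_entries; /* demo-only counter of entries not yet freed */

/* Load `want` entries, simulating corruption at index `fail_at`. */
static int toy_load(struct toy_index *idx, size_t want, size_t fail_at)
{
	idx->entries = calloc(want ? want : 1, sizeof(*idx->entries));
	if (!idx->entries)
		return -1;
	for (size_t i = 0; i < want; i++) {
		struct toy_entry *e;

		if (i == fail_at)
			return -1; /* no cleanup here: the caller owns teardown */
		e = malloc(sizeof(*e));
		if (!e)
			return -1;
		idx->entries[idx->nr++] = e;
		live_entries++;
	}
	return 0;
}

/* Single teardown path: frees every entry, then the container. */
static void toy_free_index(struct toy_index *idx)
{
	for (size_t i = 0; i < idx->nr; i++) {
		free(idx->entries[i]);
		live_entries--;
	}
	free(idx->entries);
	idx->entries = NULL;
	idx->nr = 0;
}

/* Caller: always runs the teardown, whether loading succeeded or not. */
static int toy_prepare(size_t want, size_t fail_at)
{
	struct toy_index idx = { 0 };
	int ret = toy_load(&idx, want, fail_at);

	toy_free_index(&idx);
	return ret;
}
```

If `toy_load()` instead freed only the container and cleared `idx->entries` on failure, `toy_free_index()` would have nothing left to iterate over and the partially loaded entries would be lost, which is the shape of the leak the commit message describes.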

The comment in pack-bitmap.c:test_bitmap_commits() suggests that we can avoid reading the commit table altogether. However, this comment is misleading. We load bitmap entries here because test_bitmap_commits() needs to print the commit IDs from the bitmap, and we must read the bitmap entries to obtain those commit IDs. So reword this comment. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>

t5310 lacks a test to ensure git works correctly when commit bitmap data is corrupted. So add a test helper in pack-bitmap.c that lists the position of each commit bitmap in the bitmap file, and a `load corrupt bitmap` test case in t/t5310 that corrupts a commit bitmap before loading it. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>

The `raw_object_store` structure is the central entry point for reading and writing objects in a repository. The main purpose of this structure is to manage object directories and provide an interface to access and write objects in those object directories. Right now, many of the functions associated with the raw object store implicitly rely on `the_repository` to get access to its `objects` pointer, which is the `raw_object_store`. As we want to generally get rid of using `the_repository` across our codebase, we will have to convert this implicit dependency on this global variable into an explicit parameter. This conversion can be done by simply passing in an explicit pointer to a repository and then using its `->objects` pointer. But there is a second effort underway, which is to make the object subsystem more self-contained so that we can eventually have pluggable object backends. As such, passing in a repository wouldn't make a ton of sense, and the goal is to convert the object store interfaces such that we always pass in a reference to the `raw_object_store` instead. This will expose the `raw_object_store` type to a lot more callers though, which surfaces that this type is named somewhat awkwardly. The "raw_" prefix makes readers wonder whether there is a non-raw variant of the object store, but there isn't. Furthermore, we nowadays want to name functions in a way that they can be clearly attributed to a specific subsystem, but calling them e.g. `raw_object_store_has_object()` is just too unwieldy, even when dropping the "raw_" prefix. Instead, rename the structure to `object_database`. This term is already used a lot throughout our codebase, and it cannot easily be mistaken for "object directories", either. Furthermore, its acronym ODB is already well-known and works well as part of a function's name, like for example `odb_has_object()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
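To make the "implicit global versus explicit parameter" distinction concrete, here is a small sketch. All types and functions are hypothetical toys (`toy_object_database`, `toy_repository`, `the_toy_repository`, `toy_has_object()`, `toy_odb_has_object()`); object IDs are reduced to plain integers and nothing here reflects the real signatures in Git.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; the real structures carry far more state. */
struct toy_object_database {
	const unsigned *ids; /* object IDs reduced to plain integers */
	size_t nr;
};

struct toy_repository {
	struct toy_object_database *objects;
};

/* Stand-in for the global that the refactoring moves away from. */
static struct toy_repository *the_toy_repository;

/* Explicit form: the database to query is a parameter. */
static int toy_odb_has_object(const struct toy_object_database *odb,
			      unsigned id)
{
	for (size_t i = 0; i < odb->nr; i++)
		if (odb->ids[i] == id)
			return 1;
	return 0;
}

/* Implicit form: silently bound to whatever the global points at. */
static int toy_has_object(unsigned id)
{
	return toy_odb_has_object(the_toy_repository->objects, id);
}
```

With the explicit form, a caller can query two independent databases side by side, whereas the implicit form can only ever see the one reachable through the global.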

The `object_directory` structure is used as an access point for a single object directory like ".git/objects". While the structure isn't yet fully self-contained, the intent is for it to eventually contain all information required to access objects in one specific location.

While the name "object directory" is a good fit for now, this will change over time as we continue with the agenda to make pluggable object databases a thing. Eventually, objects may not be accessed via any kind of directory at all anymore, but they could instead be backed by any kind of durable storage mechanism. While it seems quite far-fetched for now, it is thinkable that eventually this might even be some form of a database, for example.

As such, the current name of this structure will become worse over time as we evolve into the direction of pluggable ODBs. Immediate next steps will start to carve out proper self-contained object directories, which requires us to pass in these object directories as parameters. Based on our modern naming schema this means that those functions should then be named after their subsystem, which means that we would start to bake the current name into the codebase more and more.

Let's preempt this by renaming the structure. There have been a couple of alternatives that were discussed:

- `odb_backend` was discarded because it led to the association that one object database has a single backend, but the model is that one alternate has one backend. Furthermore, "backend" is more about the actual backing implementation and less about the high-level concept.

- `odb_alternate` was discarded because it is a bit of a stretch to also call the main object directory an "alternate".

Instead, pick `odb_source` as the new name. It makes it sufficiently clear that there can be multiple sources and does not cause confusion when mixed with the already-existing "alternate" terminology.

In the future, this change allows us to easily introduce for example an `odb_files_source` and other format-specific implementations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_source`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In subsequent commits we'll get rid of our use of `the_repository` in "odb.c" in favor of explicitly passing in a `struct object_database` or a `struct odb_source`. In some cases though we'll need access to the repository, for example to read a config value from it, but we don't have a way to access the repository owning a specific object database. Introduce parent pointers for `struct object_database` to its owning repository as well as for `struct odb_source` to its owning object database, which will allow us to adapt those use cases. Note that this change requires us to pass through the object database to `link_alt_odb_entry()` so that we can set up the parent pointers for any source there. The callchain is adapted to pass through the object database accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
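The parent-pointer arrangement can be sketched roughly as follows. The structures and helpers (`toy_repository`, `toy_object_database`, `toy_odb_source`, `toy_repo_init()`, `toy_odb_link_source()`) are hypothetical simplifications, and `compression_level` is a made-up stand-in for "some config value owned by the repository".

```c
#include <assert.h>
#include <stddef.h>

struct toy_repository;
struct toy_object_database;

struct toy_odb_source {
	struct toy_object_database *odb; /* parent pointer: owning database */
	struct toy_odb_source *next;
	const char *path;
};

struct toy_object_database {
	struct toy_repository *repo; /* parent pointer: owning repository */
	struct toy_odb_source *sources;
};

struct toy_repository {
	struct toy_object_database objects;
	int compression_level; /* made-up config value for the demo */
};

static void toy_repo_init(struct toy_repository *repo, int level)
{
	repo->compression_level = level;
	repo->objects.repo = repo;
	repo->objects.sources = NULL;
}

/* Linking a source is the place where its parent pointer gets set. */
static void toy_odb_link_source(struct toy_object_database *odb,
				struct toy_odb_source *source,
				const char *path)
{
	source->odb = odb;
	source->path = path;
	source->next = odb->sources;
	odb->sources = source;
}

/* Code that only holds a source can still reach repository-level state. */
static int toy_source_compression_level(const struct toy_odb_source *source)
{
	return source->odb->repo->compression_level;
}
```

This also shows why the linking function needs the database passed in: without it there would be nothing to store in the new source's parent pointer.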

Get rid of our dependency on `the_repository` in `find_odb()` by passing in the object database in which we want to search for the source and adjusting all callers. Rename the function to `odb_find_source()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Get rid of our dependency on `the_repository` in `assert_oid_type()` by passing in the object database as a parameter and adjusting all callers. Rename the function to `odb_assert_oid_type()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Get rid of our dependency on `the_repository` in `odb_mkstemp()` by passing in the object database as a parameter and adjusting all callers. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

The functions to manage alternates all depend on `the_repository`. Refactor them to accept an object database as a parameter and adjust all callers. The functions are renamed accordingly. Note that right now the situation is still somewhat weird because we end up using the object store path provided by the object store's repository anyway. Consequently, we could have passed in a pointer to the repository instead of a pointer to the object store. This will be addressed in subsequent commits though, where we will start to use the path owned by the object store itself. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

There are a couple of iterator-style functions that execute a callback for each instance of a given set, all of which currently depend on `the_repository`. Refactor them to instead take an object database as parameter so that we can get rid of this dependency. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
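An iterator-style function that takes the object database explicitly might look like the sketch below. The names (`toy_odb`, `toy_source`, `toy_source_fn`, `toy_odb_for_each_source()`) are hypothetical, and the convention that a non-zero callback return stops the walk is an assumption made for the example, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types: a database with a linked list of sources. */
struct toy_source {
	const char *path;
	struct toy_source *next;
};

struct toy_odb {
	struct toy_source *sources;
};

typedef int (*toy_source_fn)(struct toy_source *source, void *cb_data);

/*
 * Call `fn` for every source of `odb`. A non-zero return value from the
 * callback stops the walk and is passed back (assumed convention).
 */
static int toy_odb_for_each_source(struct toy_odb *odb, toy_source_fn fn,
				   void *cb_data)
{
	for (struct toy_source *s = odb->sources; s; s = s->next) {
		int ret = fn(s, cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count the sources it is shown. */
static int toy_count_source(struct toy_source *source, void *cb_data)
{
	(void)source;
	(*(size_t *)cb_data)++;
	return 0;
}
```

Because the database is a parameter, the same iterator works for any database instance rather than only for the one reachable through a global.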

The functions `set_temporary_primary_odb()` and `restore_primary_odb()` are responsible for managing a temporary primary source for the database. Both of these functions implicitly rely on `the_repository`. Refactor them to instead take an explicit object database parameter as argument and adjust callers. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
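A set/restore pair with an explicit database parameter could be sketched like this. It assumes, purely for illustration, that the primary source is simply the head of the database's source list; `toy_odb_set_temporary_primary()` and `toy_odb_restore_primary()` are hypothetical names and do not mirror the real function signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; the head of the list is treated as primary. */
struct toy_source {
	const char *path;
	struct toy_source *next;
};

struct toy_odb {
	struct toy_source *sources;
};

/*
 * Put `tmp` in front of the existing sources so it becomes the primary.
 * Returns the previous primary so the caller can restore it later.
 */
static struct toy_source *toy_odb_set_temporary_primary(struct toy_odb *odb,
							 struct toy_source *tmp)
{
	struct toy_source *old = odb->sources;

	tmp->next = old;
	odb->sources = tmp;
	return old;
}

/* Undo the swap by reinstating the source saved by the setter. */
static void toy_odb_restore_primary(struct toy_odb *odb,
				    struct toy_source *old)
{
	odb->sources = old;
}
```

The caller keeps the returned pointer for the lifetime of the temporary source and hands it back when done, so no hidden global is needed to remember the previous state.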

The "--recurse-submodules" flag for git-grep(1) allows users to grep for a string across submodule boundaries. To make this work we add each submodule's object sources to our own object database so that the objects can be accessed directly. The infrastructure for this depends on a global string list of submodule paths. The caller is expected to call `add_submodule_odb_by_path()` for each source and the object database will then eventually register all submodule sources via `do_oid_object_info_extended()` in case it isn't able to look up a specific object. This reliance on global state is of course suboptimal with regard to our libification efforts. Refactor the logic so that the list of submodule sources is instead tracked in the object database itself. This allows us to lose the condition of `r == the_repository` before registering submodule sources as we only ever add submodule sources to `the_repository` anyway. As such, behaviour before and after this refactoring should always be the same. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

All of the external functions provided by the object database subsystem don't depend on `the_repository` anymore, but some internal functions still do. Refactor those cases by plumbing through the repository that owns the object database. This change allows us to get rid of the `USE_THE_REPOSITORY_VARIABLE` preprocessor define. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Rename `oid_object_info()` to `odb_read_object_info()`, as well as its `_extended()` variant, to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Rename `repo_read_object_file()` to `odb_read_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
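The compatibility-wrapper approach used in the three renames above can be sketched as follows. The types and both function names (`toy_odb_read_object_info()` for the new spelling, `toy_oid_object_info()` for the old one) are hypothetical toys; the real functions take different arguments.

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_BLOB = 1, TOY_TREE = 2 };

struct toy_object {
	unsigned id; /* object ID reduced to an integer */
	enum toy_type type;
	unsigned long size;
};

struct toy_odb {
	const struct toy_object *objects;
	size_t nr;
};

struct toy_repo {
	struct toy_odb *objects;
};

/* New-style name: subsystem prefix, takes the object database. */
static int toy_odb_read_object_info(struct toy_odb *odb, unsigned id,
				    unsigned long *sizep)
{
	for (size_t i = 0; i < odb->nr; i++) {
		if (odb->objects[i].id != id)
			continue;
		if (sizep)
			*sizep = odb->objects[i].size;
		return odb->objects[i].type;
	}
	return -1; /* not found */
}

/* Old-style name kept as a thin wrapper so untouched callers still build. */
static inline int toy_oid_object_info(struct toy_repo *repo, unsigned id,
				      unsigned long *sizep)
{
	return toy_odb_read_object_info(repo->objects, id, sizep);
}
```

Code written against the old name keeps compiling and gets identical results, which is what lets topics developed in parallel land without being rewritten first; the wrapper can be dropped once no callers remain.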

Rename `pretend_object_file()` to `odb_pretend_object()` to match other functions related to the object database and our modern coding guidelines. No compatibility wrapper is introduced as the function is not used a lot throughout our codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Rename `read_object_with_reference()` to `odb_read_object_peeled()` to match other functions related to the object database and our modern coding guidelines. Furthermore, the old name did not describe well what this function actually does, which is to walk down any commit and tag objects until an object of the required type has been found. This is generally referred to as "peeling", so the new name should be considerably more descriptive. No compatibility wrapper is introduced as the function is not used a lot throughout our codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
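"Peeling" as described above can be shown with a toy object model. This is only a sketch: `toy_object`, `toy_type` and `toy_peel()` are hypothetical, and each object carries a single `target` pointer (the tagged object for a tag, the tree for a commit) instead of being parsed from real object data.

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_BLOB, TOY_TREE, TOY_COMMIT, TOY_TAG };

struct toy_object {
	enum toy_type type;
	/* For a tag: the tagged object. For a commit: its tree. */
	const struct toy_object *target;
};

/*
 * Walk down tags and commits until an object of type `required` is
 * found. Returns NULL when the chain ends without a match.
 */
static const struct toy_object *toy_peel(const struct toy_object *obj,
					 enum toy_type required)
{
	while (obj) {
		if (obj->type == required)
			return obj;
		if (obj->type != TOY_TAG && obj->type != TOY_COMMIT)
			return NULL; /* trees and blobs cannot be peeled */
		obj = obj->target;
	}
	return NULL;
}
```

Given a tag that points at another tag that points at a commit, asking for a commit stops at the commit, asking for a tree continues one step further to the commit's tree, and asking for a blob fails because the chain never reaches one.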

Code clean-up around object access API.

* ps/object-store:
  odb: rename `read_object_with_reference()`
  odb: rename `pretend_object_file()`
  odb: rename `has_object()`
  odb: rename `repo_read_object_file()`
  odb: rename `oid_object_info()`
  odb: trivial refactorings to get rid of `the_repository`
  odb: get rid of `the_repository` when handling submodule sources
  odb: get rid of `the_repository` when handling the primary source
  odb: get rid of `the_repository` in `for_each()` functions
  odb: get rid of `the_repository` when handling alternates
  odb: get rid of `the_repository` in `odb_mkstemp()`
  odb: get rid of `the_repository` in `assert_oid_type()`
  odb: get rid of `the_repository` in `find_odb()`
  odb: introduce parent pointers
  object-store: rename files to "odb.{c,h}"
  object-store: rename `object_directory` to `odb_source`
  object-store: rename `raw_object_store` to `object_database`

Leakfix with a new and a bit invasive test.

* ly/load-bitmap-leakfix:
  pack-bitmap: add load corrupt bitmap test
  pack-bitmap: reword comments in test_bitmap_commits()
  pack-bitmap: fix memory leak if load_bitmap() failed

Signed-off-by: Junio C Hamano <gitster@pobox.com>

ttaylorr and others added 23 commits July 1, 2025 14:41

odb: get rid of the_repository in odb_mkstemp()

1b1679c

Merge branch 'ly/load-bitmap-leakfix'

f31d155

The tenth batch

32571a0

Signed-off-by: Junio C Hamano <gitster@pobox.com>

pull bot locked and limited conversation to collaborators Jul 16, 2025

pull bot added the ⤵️ pull label Jul 16, 2025

pull bot merged commit 32571a0 into chojar:master Jul 16, 2025
