+
Skip to content

Introduce git-last-modified(1) command #1942

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

To1ne
Copy link

@To1ne To1ne commented Jul 9, 2025

This series adds the git-last-modified(1) subcommand. In the past the
subcommand was proposed1 to be named git-blame-tree(1). This version
is based on the patches shared by the kind people at GitHub2.

What is different from the series shared by GitHub:

  • Renamed the subcommand from blame-tree to last-modified. There was
    some consensus5 this name works better, so let's give it a try and
    see how this name feels.

  • Patches for --max-depth are excluded. I think it's a separate topic to
    discuss and I'm not sure it needs to be part of series anyway. The
    main patch was submitted in the previous attempt3 and if people
    consider it valuable, I'm happy to discuss that in a separate patch
    series.

  • The patches in 'tb/blame-tree' at Taylor's fork4 implements a
    caching layer. This feature reads/writes cached results in
    .git/blame-tree/<hash>.btc. To keep this series to a reviewable
    size, that feature is excluded from this series. I think it's better
    to submit this as a separate series.

  • Squashed various commits together. Like they introduced a flag
    --go-faster, which later became the default and only implementation.
    That story was wrapped up in a single commit.

  • Dropped the patches that attempt to increase performance for tree
    entries that have not been updated in a long time. In my testing I've
    seen both performance improvements and degradation with these
    changes:

    Test HEAD~ HEAD

    8020.1: top-level last-modified 4.52(4.38+0.11) 2.03(1.93+0.08) -55.1%
    8020.2: top-level recursive last-modified 5.79(5.64+0.11) 8.34(8.17+0.11) +44.0%
    8020.3: subdir last-modified 0.15(0.09+0.06) 0.19(0.14+0.06) +26.7%

    Before we include these patches, I want to make sure these changes
    have positive impact in all/most scenarios. This can happen in a
    separate series.

  • The last-modified command isn't recursive by default. If you want
    recurse into subtrees, you need to pass -r.

  • Fixed all memory leaks, and removed the use of
    USE_THE_REPOSITORY_VARIABLE.

I've set myself as the author and added Based-on-patch-by trailers to
credit the original authors. Let me know if you disagree.

Again thanks to Taylor and the people at GitHub for sharing these
patches. I hope we can work together to get this upstreamed.

--
Cheers,
Toon

Signed-off-by: Toon Claes toon@iotcl.com

Changes in v4:

Changes in v3:

  • Updated benchmarks in commit messages.
  • Removed the patches that attempt to increase performance for tree
    entries that have not been updated in a long time. (see above)
  • Move handling failure in last_modified_init() to the caller.
  • Sorted #include clauses lexicographically.
  • Removed unneeded commit in struct last_modified_entry.
  • Renamed some functions/variables and added some comments to make it
    easier to understand.
  • Removed unnecessary checking of the commit-graph generation number.
  • Link to v2: https://lore.kernel.org/r/20250523-toon-new-blame-tree-v2-0-101e4ca4c1c9@iotcl.com

Changes in v2:

Copy link

gitgitgadget bot commented Jul 9, 2025

There are issues in commit aa39f20:
last-modified: use Bloom filters when available
Lines in the body of the commit messages should be wrapped between 60 and 76 characters.
Indented lines, and lines without whitespace, are exempt

@To1ne
Copy link
Author

To1ne commented Jul 9, 2025

Let's try CI here, because I'm seeing unreproducable errors on GitLab CI.

To1ne added 3 commits July 9, 2025 16:47
Similar to git-blame(1), introduce a new subcommand
git-last-modified(1). This command shows the most recent modification to
paths in a tree. It does so by expanding the tree at a given commit,
taking note of the current state of each path, and then walking
backwards through history looking for commits where each path changed
into its final commit ID.

Based-on-patch-by: Jeff King <peff@peff.net>
Improved-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
This just runs some simple last-modified commands. We already test
correctness in the regular suite, so this is just about finding
performance regressions from one version to another.

Based-on-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Toon Claes <toon@iotcl.com>
Our 'git last-modified' performs a revision walk, and computes a diff at
each point in the walk to figure out whether a given revision changed
any of the paths it considers interesting.

When changed-path Bloom filters are available, we can avoid computing
many such diffs. Before computing a diff, we first check if any of the
remaining paths of interest were possibly changed at a given commit by
consulting its Bloom filter. If any of them are, we are resigned to
compute the diff.

If none of those queries returned "maybe", we know that the given commit
doesn't contain any changed paths which are interesting to us. So, we
can avoid computing it in this case.

Comparing the perf test results on git.git:

    Test                                        HEAD~             HEAD
    ------------------------------------------------------------------------------------
    8020.1: top-level last-modified             4.49(4.34+0.11)   2.22(2.05+0.09) -50.6%
    8020.2: top-level recursive last-modified   5.64(5.45+0.11)   5.62(5.30+0.11) -0.4%
    8020.3: subdir last-modified                0.11(0.06+0.04)   0.07(0.03+0.04) -36.4%

Based-on-patch-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
@To1ne To1ne force-pushed the b4/toon-new-blame-tree branch from aa39f20 to c739a7d Compare July 9, 2025 14:47
@dscho
Copy link
Member

dscho commented Jul 10, 2025

@To1ne seems that I can reproduce the test failure here:

$ (export GIT_TEST_COMMIT_GRAPH=1;         export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1;         make -j15 && cd t && ./t8020-*.sh -ivx)
[...]
ok 10 - last-modified from subdir uses relative pathspecs

expecting success of 8020.11 'limit last-modified traversal by count':                                                          check_last_modified -1 <<-\EOF
        3 a
        ^2 file
        EOF

+ check_last_modified -1
+ local indir=
+ test 1 != 0                                                                                                           + break
+ cat
+ test_when_finished rm -f tmp.*
+ test 0 = 0
+ test_cleanup={ rm -f tmp.*
                } && (exit "$eval_ret"); eval_ret=$?; :
+ git last-modified -1
+ git name-rev --annotate-stdin --name-only --tags
+ tr \t
+ sort tmp.3
+ test_cmp expect actual
+ test 2 -ne 2
+ eval diff -u "$@"
+ diff -u expect actual
--- expect      2025-07-10 07:32:55.146852469 +0000
+++ actual      2025-07-10 07:32:55.158851905 +0000
@@ -1,2 +1 @@
 3 a
-^2 file
error: last command exited with $?=1
not ok 11 - limit last-modified traversal by count
#
#               check_last_modified -1 <<-\EOF
#               3 a
#               ^2 file
#               EOF
#
1..11

Maybe you found a bug in the GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS code?

@To1ne
Copy link
Author

To1ne commented Jul 10, 2025

Maybe you found a bug in the GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS code?

@dscho Thank you for your input, but I'm not sure there is.

If I regenerate commit-graphs in the test repo with git commit-graph write --changed-paths --reachable, I'm still getting:

$ git last-modified -1
664d1218d7e3d3f48a5b61104b0532c7635dc5e5	a

TBH, I've got no idea where to look now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载