这是indexloc提供的服务,不要输入任何密码
Skip to content

perf: Switch from forEach to regular for-loops for about 30% improvement in raw update performance #3472

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 5, 2025

Conversation

filiph
Copy link
Contributor

@filiph filiph commented Feb 5, 2025

Description

Replaces uses of .forEach() with regular for loops. This has significant impact on performance in hot paths such as Component.updateTree() and Component.renderTree().

Updating Components

In the graph above, you see 50 runs of benchmark/update_components_benchmark.dart. The forEach results are blue, the for-loop results are green. I could see this effect after just replacing the forEach calls in component.dart. Data here.

Aside for posterity: for i in {1..50}; do flutter test benchmark/main.dart --no-pub -r silent 2>/dev/null >> benchmarks_for_loop.txt; done, then get the data from the text file.

I went ahead and replaced additional forEach calls elsewhere in the engine codebase, but there was no additional effect on the benchmark. Still, I kept those changes in. I only replaced forEach in places that seemed relatively hot (e.g. ComponentSet.reorder()). There are more forEach calls in the codebase but those seem fine to me as they aren't likely to be called too often.

It should be noted that I needed to update the benchmark to add children to the components. Every _BenchmarkComponent now has 10 children. This feels a bit more realistic use of the framework than having a flat array of components with no children. By changing the benchmark code in this way, I made it a bit slower, so I'm not sure if the effect will be seen in the CI/CD.

I also tried whether the change will have effect on my game's benchmark (which is a lot more involved and uses flutter driver to test the whole game in AOT mode). For the game, the effect is negligible but that was kind of expected since my game spends a significant amount of its CPU time on AI, raycasting, smoke simulation and drawVertices, none of which really depend on the speed of the engine update() mechanism.

Checklist

  • I have followed the Contributor Guide when preparing my PR.
  • I have updated/added tests for ALL new/updated/fixed functionality.
  • I have updated/added relevant documentation in docs and added dartdoc comments with ///.
  • I have updated/added relevant examples in examples or docs.

Breaking Change?

  • Yes, this PR is a breaking change.
  • No, this PR is not a breaking change.

Related Issues

@filiph
Copy link
Contributor Author

filiph commented Feb 5, 2025

Ok, it looks like the benchmark GitHub action failed with Error: + Failed to create or update comment on GitHub.: Resource not accessible by integration - https://docs.github.com/rest/issues/comments#create-an-issue-comment.

Here are the results from the job run:

Parsed benchmark: Render Components Benchmark -> 1165.881118881119
Parsed benchmark: Updating Components Benchmark -> 254720.125

@spydon
Copy link
Member

spydon commented Feb 5, 2025

Ok, it looks like the benchmark GitHub action failed with Error: + Failed to create or update comment on GitHub.: Resource not accessible by integration - https://docs.github.com/rest/issues/comments#create-an-issue-comment.

It can only run for members of the organization unfortunately.

Copy link
Member

@erickzanardo erickzanardo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small nit, but LGTM already, Thanks for this!

Copy link
Member

@spydon spydon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm! I thought we had already done this... I think there is a DCM rule that we can enable that checks for these.

I hope one day the compiler will be able to make forEach et. al as efficient as normal loops.

@erickzanardo
Copy link
Member

Lgtm! I thought we had already done this... I think there is a DCM rule that we can enable that checks for these.

I hope one day the compiler will be able to make forEach et. al as efficient as normal loops.

I don't think we need DCM for this, I am pretty sure I saw a lint rule for that already somewhere...I will take a look for it

@spydon spydon enabled auto-merge (squash) February 5, 2025 13:55
@spydon spydon merged commit 9891f4e into flame-engine:main Feb 5, 2025
9 checks passed
@filiph
Copy link
Contributor Author

filiph commented Feb 5, 2025

Thanks for the quick turnaround!

FWIW, I don't think enabling a lint and then converting each and every forEach call into a for loop is something you'd want. The effect is only felt in very hot paths. In this PR, the only real performance upside came from the two lines in Component. Elsewhere, the effect is negligible. (Unless you're doing something like resizing your game every frame.) And it's kind of nice to be able to express things in a one-liner.

@filiph filiph deleted the perf/for-loop branch February 5, 2025 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants