
Use blas[accelerate] to take advantage of Apple Silicon GPU. #318


Merged 4 commits on Mar 16, 2025

Conversation

oursland
Contributor

No description provided.

@oursland
Contributor Author

Resolves #317.

@gerlero
Owner

gerlero commented Feb 13, 2025

Any way to test that the change is actually working?

@gerlero
Owner

gerlero commented Feb 13, 2025

From #317

> As noted on this SO comment, Accelerate is slightly slower on Intel macOS than OpenBLAS, which is the current version used. Using MKL BLAS on Intel macOS platforms would improve their performance considerably.

Then maybe the change should be conditional on the platform (i.e. Accelerate only enabled on Apple silicon)?

@oursland
Contributor Author

I can do some timing on a simulation I'm working on; however, it uses OpenFOAM v2312, which will have to be built separately.

@oursland
Contributor Author

Baseline OpenFOAM v2312 simulation: 15m23s

image

Now to rebuild OpenFOAM v2312 with blas[accelerate] and retest.

@oursland
Contributor Author

With blas[accelerate] I get 8m36s.

image

@oursland
Contributor Author

That speedup was too substantial, so I re-ran the simulation 3 times with OpenBLAS and 3 times with Accelerate BLAS:

| Run | OpenBLAS | Accelerate |
| --- | --- | --- |
| 1 | 9m32s | 8m51s |
| 2 | 9m18s | 9m15s |
| 3 | 9m08s | 8m14s |
| AVG | 9m20s | 8m57s |

Time reduction: 5.85%
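
The averages and the time reduction can be recomputed from the three runs listed above. The following is a minimal sketch, not the script used in this thread; the helper names `parse_duration`, `format_duration`, and `mean_seconds` are hypothetical. Recomputed this way, the means come to roughly 9m19s for OpenBLAS and 8m47s for Accelerate, and the reduction to about 5.84%, which is consistent with the quoted 5.85% to within rounding. The 8m57s shown in the AVG row does not match the three Accelerate runs and may be a transcription slip.

```python
def parse_duration(text: str) -> int:
    """Convert a string such as '9m32s' into whole seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + int(seconds)


def format_duration(seconds: float) -> str:
    """Format a number of seconds as 'XmYYs', rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 60}m{total % 60:02d}s"


def mean_seconds(runs: list[str]) -> float:
    """Arithmetic mean of a list of 'XmYYs' durations, in seconds."""
    return sum(parse_duration(r) for r in runs) / len(runs)


# Wall-clock times copied from the table in this comment.
openblas_runs = ["9m32s", "9m18s", "9m08s"]
accelerate_runs = ["8m51s", "9m15s", "8m14s"]

openblas_mean = mean_seconds(openblas_runs)
accelerate_mean = mean_seconds(accelerate_runs)

# Relative time reduction of Accelerate with respect to OpenBLAS.
reduction_percent = (openblas_mean - accelerate_mean) / openblas_mean * 100

print(f"OpenBLAS mean:   {format_duration(openblas_mean)}")
print(f"Accelerate mean: {format_duration(accelerate_mean)}")
print(f"Time reduction:  {reduction_percent:.2f}%")
```

With only three runs per configuration and a spread of about a minute between runs, a difference of this size is suggestive rather than conclusive.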

Also sets blas[blis], which should offer some improvement on Intel platforms.

@oursland
Contributor Author

I have updated the PR to use platform-specific BLAS implementations. Accelerate is used on Apple Silicon, and BLIS was selected for Intel because conda-forge does not have an MKL distribution of BLAS for macOS.
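
The platform-specific choice described above can be sketched as follows. This is an illustration only, not the mechanism the PR uses: the function names `select_blas_variant` and `conda_spec` are hypothetical, and the `libblas=*=*<variant>` form is the build-string pin described in the conda-forge documentation for switching BLAS implementations, which may differ from the `blas[...]` spelling used in this repository.

```python
import platform
from typing import Optional


def select_blas_variant(system: str, machine: str) -> Optional[str]:
    """Pick a BLAS variant for macOS; return None to keep the default.

    Mirrors the choice described in this PR: Accelerate on Apple Silicon,
    BLIS on Intel Macs. Other platforms are left untouched (assumption).
    """
    if system != "Darwin":
        return None
    if machine == "arm64":
        return "accelerate"
    if machine == "x86_64":
        return "blis"
    return None


def conda_spec(variant: Optional[str]) -> str:
    """Build a conda-forge style libblas spec pinning the given variant."""
    if variant is None:
        return "libblas"
    return f"libblas=*=*{variant}"


# Report the spec that would apply to the machine running this script.
variant = select_blas_variant(platform.system(), platform.machine())
print(conda_spec(variant))
```

Note that a Python interpreter running under Rosetta reports `x86_64`, so this sketch would select BLIS in that case.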

@gerlero
Owner

gerlero commented Feb 13, 2025

> Time reduction: 5.85%

Thanks for the work! I'm fine with doing the change then.

> BLIS was selected for Intel

Any reason not to leave it to whatever the default is in conda on that platform?

@oursland
Contributor Author

The rationale is that in the SO comment referenced above there was a performance improvement of 20.9% when selecting BLIS over OpenBLAS.

@gerlero
Owner

gerlero commented Feb 14, 2025

> The rationale is that in the SO comment referenced above there was a performance improvement of 20.9% when selecting BLIS over OpenBLAS.

Great. Then it makes sense to me that we make that change too.

@gerlero merged commit 53f7de4 into gerlero:main on Mar 16, 2025 (10 checks passed)

@gerlero
Owner

gerlero commented Mar 16, 2025

@oursland Thanks for the work!
