@nkoppel nkoppel commented Apr 12, 2023

If the output of sum_to is meant to be broadcast, the current CUDA implementation allocates memory for the full unbroadcasted tensor but never uses it.
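To illustrate the size difference, here is a minimal sketch, not the code from this PR. It assumes broadcast axes are represented with a stride of 0, and the helper names `logical_len` and `physical_len` are hypothetical, not part of the library's API.

```rust
/// Logical element count: the product of all dimensions, which is the size
/// of the full unbroadcasted tensor.
/// (Hypothetical helper, for illustration only.)
fn logical_len(shape: &[usize]) -> usize {
    shape.iter().product()
}

/// Number of elements that need backing storage, assuming a stride of 0
/// marks a broadcast axis. Such an axis reuses the same memory for every
/// index, so it contributes a factor of 1 instead of its dimension.
/// (Hypothetical helper, for illustration only.)
fn physical_len(shape: &[usize], strides: &[usize]) -> usize {
    shape
        .iter()
        .zip(strides.iter())
        .map(|(&dim, &stride)| if stride == 0 { 1 } else { dim })
        .product()
}
```

For a shape of `[4, 3]` with strides `[0, 1]` (the first axis is broadcast), `logical_len` gives 12 while `physical_len` gives 3, so an allocation sized by the logical count would reserve four times the memory that is actually needed.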

@coreylowman coreylowman merged commit a08b4a7 into coreylowman:main Apr 12, 2023