Prevent over-allocation for broadcasted outputs of sum_to #699

nkoppel · 2023-04-12T15:18:09Z

If the output of sum_to is meant to be broadcast, the current CUDA implementation allocates memory for the full unbroadcasted tensor but never uses it.
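To make the size difference concrete, here is a minimal sketch of how the two allocation sizes compare. It assumes broadcast dimensions are marked with a stride of 0, and the helper names `logical_numel` and `physical_numel` are hypothetical, made up for illustration; this is not the code changed in this PR.

```rust
/// Element count of the fully unbroadcasted tensor: the product of all dimensions.
/// (Hypothetical helper, for illustration only.)
fn logical_numel(shape: &[usize]) -> usize {
    shape.iter().product()
}

/// Element count a buffer actually needs when broadcast dimensions are
/// marked with stride 0 (an assumption of this sketch): those dimensions
/// reuse the same memory, so they do not contribute to the allocation size.
fn physical_numel(shape: &[usize], strides: &[usize]) -> usize {
    assert_eq!(shape.len(), strides.len(), "shape and strides must have the same rank");
    let mut numel = 1;
    for (dim, stride) in shape.iter().zip(strides.iter()) {
        if *stride != 0 {
            numel *= *dim;
        }
    }
    numel
}
```

For an output of shape (4, 3) that is broadcast along its first axis (strides (0, 1)), `logical_numel` gives 12 elements while `physical_numel` gives 3. Allocating the former when only the latter is ever written is the kind of over-allocation described above.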

nkoppel and others added 2 commits April 12, 2023 10:16

prevent over-allocation for broadcasted outputs of sum_to

e0da106

Merge branch 'main' into sum_allocation

527b54b

coreylowman merged commit a08b4a7 into coreylowman:main Apr 12, 2023
