Description
There's currently no way to fill a buffer from the device using a hardware-supported DMA (or driver-optimized) fill operation. A fill operation is available in practically all native APIs (MTLBlitCommandEncoder fillBuffer:range:value:, vkCmdFillBuffer, ID3D12GraphicsCommandList::ClearUnorderedAccessViewUint, etc.), and where it is not available it would be significantly more efficient for the WebGPU implementation to emulate it than for user code, which is restricted to the WebGPU API, to do so.
When a fill is emulated with a shader in user code, the hardware cannot take advantage of any dedicated DMA resources that may be available, and what should be a relatively cheap accelerated memset instead occupies compute units.
It's also much more complex for user code: it has to build new bind groups and issue multiple commands to perform the fill, vs. a single GPUCommandEncoder fillBuffer call that specifies just the destination buffer, offset, size, and value inline. Because there are no push constants, a uniform buffer update is also required to get the fill value to the shader. That introduces significant complexity when trying to emulate a fillBuffer in middleware, as even more bind groups are required, etc.
Even worse, if each fill range has a different size (they often do), then there are two options. Either new unique bind groups are required that specify the size (more API overhead), or one must use a single bind group that covers the whole buffer, with the size passed through uniforms. Without explicit barriers, the second option means two fills to subranges of the same buffer (or a fill and a dispatch, etc.) will have a false dependency.
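To make that overhead concrete, here is a rough sketch of the bookkeeping a shader-based emulation needs for each fill, assuming the single-bind-group variant with the range passed through uniforms. The WGSL kernel, the binding layout, the workgroup size of 64, and the names `FILL_SHADER_WGSL` and `planEmulatedFill` are all hypothetical and only for illustration. The actual WebGPU calls (uniform upload, bind group creation, compute pass, dispatch) are left out and would come on top of this.

```typescript
// Hypothetical WGSL kernel for an emulated fill. The binding layout is an
// assumption: one uniform block with the range and value, one storage
// binding that covers the whole destination buffer.
const FILL_SHADER_WGSL = `
struct Params { first_word: u32, word_count: u32, value: u32 }
@group(0) @binding(0) var<uniform> params: Params;
@group(0) @binding(1) var<storage, read_write> dst: array<u32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
  if (id.x < params.word_count) {
    dst[params.first_word + id.x] = params.value;
  }
}
`;

// Must match @workgroup_size in the shader above.
const WORKGROUP_SIZE = 64;

interface EmulatedFillPlan {
  // Contents of the uniform buffer: [first_word, word_count, value, padding].
  uniformData: Uint32Array;
  // Number of workgroups to dispatch in x. Very large fills may exceed the
  // device's per-dimension dispatch limit and need splitting (not handled).
  workgroupCount: number;
}

// Computes what has to be uploaded and dispatched for one emulated fill.
// Assumes offset and size are byte counts that are multiples of 4.
function planEmulatedFill(offset: number, size: number, value: number): EmulatedFillPlan {
  if (offset < 0 || size < 0 || offset % 4 !== 0 || size % 4 !== 0) {
    throw new RangeError("offset and size must be non-negative multiples of 4");
  }
  const wordCount = size / 4;
  return {
    uniformData: new Uint32Array([offset / 4, wordCount, value >>> 0, 0]),
    workgroupCount: Math.ceil(wordCount / WORKGROUP_SIZE),
  };
}

// Example: fill 1024 bytes starting at byte 256.
const examplePlan = planEmulatedFill(256, 1024, 0xdeadbeef);
console.log(examplePlan.workgroupCount, Array.from(examplePlan.uniformData));
```

Every fill with a different range needs its own uniform contents, so middleware has to manage a uniform buffer (or several) per fill just to express what the proposed call takes as four inline arguments.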
It'd be great to see this added to the API, both to make it easier to tunnel through to the native APIs that expose this functionality and to utilize hardware resources more efficiently. AFAICS the same reasoning that gives us GPUCommandEncoder copyBufferToBuffer, instead of emulating the copy with a shader, applies here.
Proposal:
interface GPUCommandEncoder {
+ undefined fillBuffer(GPUBuffer buffer, GPUSize64 offset, GPUSize64 length, unsigned long value);
}
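
As a rough illustration of the intended semantics, here is a CPU-side reference in TypeScript for what such a call could do to the buffer contents. This is a sketch under assumptions, not part of the proposal: the function name `fillBufferReference` is hypothetical, and the requirement that offset and length be multiples of 4 is borrowed from vkCmdFillBuffer rather than stated above.

```typescript
// CPU reference for the proposed fill: writes `value` as a repeated 32-bit
// word into [offset, offset + length) of `buffer`.
// Assumption: offset and length are byte counts and multiples of 4.
function fillBufferReference(
  buffer: ArrayBuffer,
  offset: number,
  length: number,
  value: number,
): void {
  if (offset < 0 || length < 0 || offset % 4 !== 0 || length % 4 !== 0) {
    throw new RangeError("offset and length must be non-negative multiples of 4");
  }
  if (offset + length > buffer.byteLength) {
    throw new RangeError("fill range exceeds buffer size");
  }
  // View only the target range as 32-bit words and fill it.
  new Uint32Array(buffer, offset, length / 4).fill(value >>> 0);
}

// Example: fill the middle 8 bytes of a 16-byte buffer.
const exampleBuffer = new ArrayBuffer(16);
fillBufferReference(exampleBuffer, 4, 8, 0xdeadbeef);
console.log(Array.from(new Uint32Array(exampleBuffer)).map((w) => w.toString(16)));
```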