MultiDrawIndirect feature and maxDrawIndirectCount limit #5175

## Difference from #1354

This issue is only about supporting equivalents of the Vulkan 1.0 feature VkPhysicalDeviceFeatures::multiDrawIndirect and the limit VkPhysicalDeviceLimits::maxDrawIndirectCount. It is not a duplicate of https://github.com/gpuweb/gpuweb/issues/1354, which mixes in discussion of vkCmdDrawIndexedIndirectCount, a Vulkan 1.2 feature.

## Vulkan

Here are a few measurements from a ported application that puts ~500000 draw calls in an indirect buffer. It calls vkCmdDrawIndexedIndirect once if VkPhysicalDeviceFeatures::multiDrawIndirect is supported, and otherwise calls vkCmdDrawIndexedIndirect repeatedly with incrementing offsets:

```
vkPhysicalDeviceFeatures.multiDrawIndirect = true:
5ms (200 FPS)
vkPhysicalDeviceFeatures.multiDrawIndirect = false:
18ms (55 FPS)
```

During development we also run with the Vulkan validation layers on, which increases the overhead significantly when vkPhysicalDeviceFeatures.multiDrawIndirect is not enabled:

```
vkPhysicalDeviceFeatures.multiDrawIndirect = true and validation = true:
6ms (166 FPS)
vkPhysicalDeviceFeatures.multiDrawIndirect = false and validation = true:
2600ms (<1 FPS)
```

This all makes sense: we make 500000 times as many API calls to vkCmdDrawIndexedIndirect when vkPhysicalDeviceFeatures.multiDrawIndirect is set to false, and validation runs for each of those calls.

## WebGPU

This measurement is from the same application state, but running WebGPU natively through Dawn with its Vulkan backend. We have to call wgpuRenderPassEncoderDrawIndexedIndirect 500000 times because there is no drawCount argument:

```
wgpuRenderPassEncoderDrawIndexedIndirect:
57ms (17 FPS)
```

If there were zero overhead from the wgpuRenderPassEncoderDrawIndexedIndirect call down to the vkCmdDrawIndexedIndirect call, we would get about 55 FPS, as seen in the Vulkan output above, so the difference here is the overhead of Dawn. This overhead would presumably be a lot lower if there were only a single call to wgpuRenderPassEncoderDrawIndexedIndirect with an added drawCount argument.

The native Vulkan implementation with multiDrawIndirect and validation on (6 ms) is about 9.5 times faster than WebGPU (57 ms). WebGPU is running natively here on Dawn with the Vulkan backend, which supports vkPhysicalDeviceFeatures.multiDrawIndirect, but there is no way to use that from WebGPU today. This is where I believe a small adjustment to the WebGPU API can make a big impact on performance.
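To put the frame times above into per-call terms, here is a small sketch of the arithmetic. The helper names are made up for illustration, and it assumes the whole frame-time difference comes from the extra draw calls, which is only an approximation.

```c
#include <stdint.h>

/* Hypothetical helper: extra CPU cost per draw call, in nanoseconds, when
 * `calls` separate calls take `looped_ms` per frame and the cheaper
 * baseline takes `baseline_ms`. Assumes the difference is all call overhead. */
double extra_ns_per_call(double looped_ms, double baseline_ms, uint32_t calls)
{
    return (looped_ms - baseline_ms) * 1.0e6 / (double)calls;
}

/* Hypothetical helper: how many times slower `slow_ms` is than `fast_ms`. */
double slowdown(double slow_ms, double fast_ms)
{
    return slow_ms / fast_ms;
}
```

With the measurements above, `extra_ns_per_call(18.0, 5.0, 500000)` gives 26 ns per looped vkCmdDrawIndexedIndirect call, `extra_ns_per_call(57.0, 18.0, 500000)` gives 78 ns of additional Dawn overhead per call, and `slowdown(57.0, 6.0)` gives the 9.5 figure.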

And if a developer accidentally leaves their Vulkan validation layers on (from running the native Vulkan implementation) when running the WebGPU backend:

```
wgpuRenderPassEncoderDrawIndexedIndirect:
3200ms (<1 FPS)
```

Here we pay for both WebGPU validation and Vulkan validation. The Vulkan validation overhead would also be a lot lower (as seen in the native Vulkan measurements above) if there were only a single call to wgpuRenderPassEncoderDrawIndexedIndirect with a drawCount argument, which Dawn would then translate internally to a single call to vkCmdDrawIndexedIndirect.

## RenderDoc

We often use RenderDoc during development, both for our native Vulkan implementation and when using WebGPU natively through Dawn. Capturing a frame using the Vulkan backend with vkPhysicalDeviceFeatures.multiDrawIndirect takes less than two seconds, and the capture opens just as fast, but capturing a frame using the native WebGPU Dawn backend takes several minutes. We develop on machines supporting vkPhysicalDeviceFeatures.multiDrawIndirect, so RenderDoc debugging with WebGPU is impractical for us, as making captures just takes too long.

## Suggestion

Add WGPUFeatureName_MultiDrawIndirect.

Add WGPULimits::maxDrawIndirectCount.

Add a uint32_t drawCount argument to wgpuRenderPassEncoderDrawIndexedIndirect, and document it something like this:
```
drawCount is the number of draws to execute, and can be zero.

If the WGPUFeatureName_MultiDrawIndirect feature is not enabled, drawCount must be 0 or 1

drawCount must be less than or equal to WGPULimits::maxDrawIndirectCount
```

This mimics VkPhysicalDeviceFeatures::multiDrawIndirect, VkPhysicalDeviceLimits::maxDrawIndirectCount, and the documentation of vkCmdDrawIndexedIndirect from Vulkan 1.0.
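As a rough sketch, the two proposed drawCount rules could be checked like this. The function and parameter names are hypothetical and not part of webgpu.h or Dawn; the parameters stand in for whether WGPUFeatureName_MultiDrawIndirect is enabled and for the value of WGPULimits::maxDrawIndirectCount.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical validation helper for the proposed drawCount argument.
 * Returns true when drawCount satisfies both documented rules. */
bool draw_count_is_valid(bool multi_draw_indirect_enabled,
                         uint32_t max_draw_indirect_count,
                         uint32_t draw_count)
{
    /* Without the feature, only 0 or 1 draws are allowed. */
    if (!multi_draw_indirect_enabled && draw_count > 1) {
        return false;
    }
    /* drawCount must never exceed the device limit. */
    return draw_count <= max_draw_indirect_count;
}
```

Note that zero is accepted in both cases, matching the "can be zero" wording above.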

Implementations could also polyfill this easily even without internal support, by looping over the draw count and increasing the offset, as Vulkan users have to do today when vkPhysicalDeviceFeatures.multiDrawIndirect is not supported. That would be an option if you want to skip the feature and limit entirely and just add the drawCount argument to wgpuRenderPassEncoderDrawIndexedIndirect.
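The loop-and-offset polyfill might look roughly like the sketch below. Everything here is illustrative: the struct is a local mirror of the five 32-bit indexed indirect arguments rather than a type from webgpu.h, the callback stands in for one existing single-draw call such as wgpuRenderPassEncoderDrawIndexedIndirect, and the `stride` parameter is an assumption borrowed from vkCmdDrawIndexedIndirect (the suggestion above only adds drawCount).

```c
#include <stdint.h>

/* Local mirror of one indexed indirect draw record (five 32-bit values). */
typedef struct IndexedIndirectArgs {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  baseVertex;
    uint32_t firstInstance;
} IndexedIndirectArgs;

/* Hypothetical callback that issues one single indirect draw at an offset. */
typedef void (*DrawIndexedIndirectOnceFn)(void *userdata, uint64_t indirectOffset);

/* Polyfill: issue drawCount single draws, advancing the offset by stride. */
void draw_indexed_indirect_polyfill(DrawIndexedIndirectOnceFn draw_once,
                                    void *userdata,
                                    uint64_t indirectOffset,
                                    uint32_t drawCount,
                                    uint32_t stride)
{
    for (uint32_t i = 0; i < drawCount; ++i) {
        draw_once(userdata, indirectOffset + (uint64_t)i * stride);
    }
}

/* Demo callback: records the offsets it was asked to draw from. */
typedef struct OffsetLog {
    uint64_t offsets[8];
    uint32_t count;
} OffsetLog;

void log_offset(void *userdata, uint64_t indirectOffset)
{
    OffsetLog *log = (OffsetLog *)userdata;
    if (log->count < 8) {
        log->offsets[log->count] = indirectOffset;
    }
    log->count++;
}
```

For a tightly packed buffer the stride would be `sizeof(IndexedIndirectArgs)`, and a drawCount of zero simply issues no draws. This still pays the per-call cost measured earlier, so it only provides API compatibility, not the performance benefit.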

I believe that discussing a MultiDrawIndirect feature and maxDrawIndirectCount limit first (without bringing up vkCmdDrawIndexedIndirectCount, the Vulkan 1.2 equivalent) would be a more productive way to make progress toward a more usable indirect-draw experience when porting Vulkan 1.0 applications to WebGPU.

Edit:

Having WGPUFeatureName_IndirectFirstInstance but not WGPUFeatureName_MultiDrawIndirect is also very confusing for someone coming from Vulkan. I would have expected to have both or neither, as a non-zero firstInstance and a greater-than-one drawCount are both optional features in implementations. Now we have a way to use a non-zero firstInstance if the implementation supports it, but no way to use a greater-than-one drawCount even if the implementation supports it.
