Introduction
“Storage-texture” is a binding type defined in the WebGPU specification. It allows shaders to read textures without sampling and to store to arbitrary texel positions. This report discusses the implementation details needed to support storage textures in WebGPU.
Related Features
Texture Usage
D3D12, Metal and Vulkan all require a texture to be created with the proper usage before it can be used as a storage texture.
On D3D12, a Shader Resource View (SRV) is enough when we want to use the texture as a read-only storage texture. When we want to write to a storage texture, we need to create an Unordered Access View (UAV) on it, which requires the texture to be created with the flag D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS. The D3D12 documentation suggests that “applications should avoid setting this flag when unordered access operations will never occur”. UAVs are allocated in the same type of descriptor heap as SRVs (CBV/SRV/UAV Heap).
On Metal, the option MTLTextureUsageShaderRead is required when we want to access the given texture with a read() or sample() function in any shader. When we want to access the texture with the write() function, we need to specify the option MTLTextureUsageShaderWrite when we create the texture.
On Vulkan, we are required to set VK_IMAGE_USAGE_STORAGE_BIT to specify that the image can be used to create a VkImageView suitable for occupying a VkDescriptorSet slot of type VK_DESCRIPTOR_TYPE_STORAGE_IMAGE.
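The usage requirements above can be summarized as a small mapping from intended access to creation flags. The following is only a sketch: `StorageUsageIntent`, `BackendCreationFlags` and `creationFlagsFor` are hypothetical names introduced for illustration, and the flags are returned as strings rather than real API enums.

```typescript
// Hypothetical description of how a texture will be accessed (not a WebGPU type).
interface StorageUsageIntent {
  sampled: boolean;      // accessed through a sampler
  storageRead: boolean;  // read in a shader without a sampler
  storageWrite: boolean; // written from a shader
}

// Hypothetical result type: creation flags per backend, as plain strings.
interface BackendCreationFlags {
  d3d12: string[];
  metal: string[];
  vulkan: string[];
}

function creationFlagsFor(intent: StorageUsageIntent): BackendCreationFlags {
  const flags: BackendCreationFlags = { d3d12: [], metal: [], vulkan: [] };

  // D3D12: only shader writes need a UAV; read-only storage can use an SRV.
  if (intent.storageWrite) {
    flags.d3d12.push("D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS");
  }

  // Metal: reads and sampling need ShaderRead; shader writes need ShaderWrite.
  if (intent.sampled || intent.storageRead) {
    flags.metal.push("MTLTextureUsageShaderRead");
  }
  if (intent.storageWrite) {
    flags.metal.push("MTLTextureUsageShaderWrite");
  }

  // Vulkan: any storage image access needs STORAGE_BIT; sampling needs SAMPLED_BIT.
  if (intent.sampled) {
    flags.vulkan.push("VK_IMAGE_USAGE_SAMPLED_BIT");
  }
  if (intent.storageRead || intent.storageWrite) {
    flags.vulkan.push("VK_IMAGE_USAGE_STORAGE_BIT");
  }
  return flags;
}
```

For a read-only storage texture this yields no extra D3D12 flag, which matches the advice to avoid the unordered access flag when no writes occur.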
Texture Format
D3D12, Metal and Vulkan each define a minimum set of texture formats that must support storage usage. This section focuses on which color formats support writable and read-write storage textures on each API.
Writable Storage Textures
The following table summarizes writable storage texture support on D3D12, Metal and Vulkan for the color formats required by the current WebGPU specification.
| # | WebGPU Texture Formats | D3D12 (Typed UAV Store) | Metal (Writable) | Vulkan (VK_IMAGE_USAGE_STORAGE_BIT) |
|---|---|---|---|---|
| 1 | r8unorm | Supported | Supported | Optional |
| 2 | r8snorm | Supported | Supported | Optional |
| 3 | r8uint | Supported | Supported | Optional |
| 4 | r8sint | Supported | Supported | Optional |
| 5 | r16uint | Supported | Supported | Optional |
| 6 | r16sint | Supported | Supported | Optional |
| 7 | r16float | Supported | Supported | Optional |
| 8 | rg8unorm | Supported | Supported | Optional |
| 9 | rg8snorm | Supported | Supported | Optional |
| 10 | rg8uint | Supported | Supported | Optional |
| 11 | rg8sint | Supported | Supported | Optional |
| 12 | r32uint | Supported | Supported | Supported |
| 13 | r32sint | Supported | Supported | Supported |
| 14 | r32float | Supported | Supported | Supported |
| 15 | rg16uint | Supported | Supported | Optional |
| 16 | rg16sint | Supported | Supported | Optional |
| 17 | rg16float | Supported | Supported | Optional |
| 18 | rgba8unorm | Supported | Supported | Supported |
| 19 | rgba8unorm-srgb | Supported | Not supported on A7 and all Mac | Optional |
| 20 | rgba8snorm | Supported | Supported | Supported |
| 21 | rgba8uint | Supported | Supported | Supported |
| 22 | rgba8sint | Supported | Supported | Supported |
| 23 | bgra8unorm | Supported | Supported | Optional |
| 24 | bgra8unorm-srgb | Supported | Not supported on A7 and all Mac | Optional |
| 25 | rgb10a2unorm | Supported | Not supported on A7 and A8 | Optional |
| 26 | rg11b10float | Supported | Not supported on A7 and A8 | Optional |
| 27 | rg32uint | Supported | Supported | Supported |
| 28 | rg32sint | Supported | Supported | Supported |
| 29 | rg32float | Supported | Supported | Supported |
| 30 | rgba16uint | Supported | Supported | Supported |
| 31 | rgba16sint | Supported | Supported | Supported |
| 32 | rgba16float | Supported | Supported | Supported |
| 33 | rgba32uint | Supported | Supported | Supported |
| 34 | rgba32sint | Supported | Supported | Supported |
| 35 | rgba32float | Supported | Supported | Supported |
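As a sketch of how an implementation might use the table above, the formats marked “Supported” in all three columns can be collected into a lookup set. `ALWAYS_WRITABLE_STORAGE_FORMATS` and `isAlwaysWritableStorageFormat` are hypothetical names; the set is transcribed from the table and should be re-checked against the backend specifications before being relied on.

```typescript
// Formats marked "Supported" in the D3D12, Metal and Vulkan columns alike.
// Transcribed from the writable storage texture table; names are illustrative.
const ALWAYS_WRITABLE_STORAGE_FORMATS: ReadonlySet<string> = new Set([
  "r32uint", "r32sint", "r32float",
  "rgba8unorm", "rgba8snorm", "rgba8uint", "rgba8sint",
  "rg32uint", "rg32sint", "rg32float",
  "rgba16uint", "rgba16sint", "rgba16float",
  "rgba32uint", "rgba32sint", "rgba32float",
]);

// Returns true when no per-device format query is needed for storage writes.
function isAlwaysWritableStorageFormat(format: string): boolean {
  return ALWAYS_WRITABLE_STORAGE_FORMATS.has(format);
}
```

Any format outside this set would need a per-device query (or an extension) before it could be exposed as a writable storage format.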
Read-Write Storage Textures
D3D12 and Metal have special requirements on the texture formats that support both read and write in one shader.
D3D12 devices that support feature level 11_0 are required to support UAV Load on R32_FLOAT, R32_UINT and R32_SINT.
Metal supports “Texture ReadWrite” since Metal 1.2. On iOS 11+ and macOS 10.13+ we can query the Tier of the support of “ReadWrite Texture” with MTLDevice.readWriteTextureSupport.
The following table lists read-write storage texture support on D3D12, Metal and Vulkan for all color formats in the current WebGPU specification:
| # | WebGPU Texture Formats | D3D12 (Typed UAV Load) | Metal (Read/Write) | Vulkan (VK_IMAGE_USAGE_STORAGE_BIT) |
|---|---|---|---|---|
| 1 | r8unorm | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 2 | r8snorm | Optional | Not Supported | Optional |
| 3 | r8uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 4 | r8sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 5 | r16uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 6 | r16sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 7 | r16float | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Optional |
| 8 | rg8unorm | Optional | Not Supported | Optional |
| 9 | rg8snorm | Optional | Not Supported | Optional |
| 10 | rg8uint | Optional | Not Supported | Optional |
| 11 | rg8sint | Optional | Not Supported | Optional |
| 12 | r32uint | Supported | OSX_GPUFamily1_v2, MTLReadWriteTextureTier1 | Supported |
| 13 | r32sint | Supported | OSX_GPUFamily1_v2, MTLReadWriteTextureTier1 | Supported |
| 14 | r32float | Supported | OSX_GPUFamily1_v2, MTLReadWriteTextureTier1 | Supported |
| 15 | rg16uint | Optional | Not Supported | Optional |
| 16 | rg16sint | Optional | Not Supported | Optional |
| 17 | rg16float | Optional | Not Supported | Optional |
| 18 | rgba8unorm | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 19 | rgba8unorm-srgb | Optional | Not Supported | Optional |
| 20 | rgba8snorm | Optional | Not Supported | Supported |
| 21 | rgba8uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 22 | rgba8sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 23 | bgra8unorm | Optional | Not Supported | Optional |
| 24 | bgra8unorm-srgb | Optional | Not Supported | Optional |
| 25 | rgb10a2unorm | Optional | Not Supported | Optional |
| 26 | rg11b10float | Optional | Not Supported | Optional |
| 27 | rg32uint | Optional | Not Supported | Supported |
| 28 | rg32sint | Optional | Not Supported | Supported |
| 29 | rg32float | Optional | Not Supported | Supported |
| 30 | rgba16uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 31 | rgba16sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 32 | rgba16float | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 33 | rgba32uint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 34 | rgba32sint | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
| 35 | rgba32float | FeatureData.TypedUAVLoadAdditionalFormats | OSX_ReadWriteTextureTier2, MTLReadWriteTextureTier2 | Supported |
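The D3D12 and Metal columns of the table above follow a simple pattern: three 32-bit single-channel formats form a baseline, and a second group depends on one capability per API. The sketch below encodes that pattern. `ReadWriteCaps` and `readWriteSupport` are hypothetical names, and formats listed as “Optional” on D3D12 are conservatively reported as unsupported because they would need a per-format query.

```typescript
// Hypothetical capability record; field names are illustrative only.
interface ReadWriteCaps {
  // Mirrors the D3D12 TypedUAVLoadAdditionalFormats feature bit.
  d3d12TypedUAVLoadAdditionalFormats: boolean;
  // Mirrors MTLReadWriteTextureTier: 0 = none, 1 = Tier 1, 2 = Tier 2.
  metalReadWriteTier: 0 | 1 | 2;
}

// Baseline: UAV load required at feature level 11_0, Metal Tier 1.
const BASE_READ_WRITE_FORMATS: ReadonlySet<string> = new Set([
  "r32uint", "r32sint", "r32float",
]);

// Needs TypedUAVLoadAdditionalFormats on D3D12 and Tier 2 on Metal.
const EXTENDED_READ_WRITE_FORMATS: ReadonlySet<string> = new Set([
  "r8unorm", "r8uint", "r8sint",
  "r16uint", "r16sint", "r16float",
  "rgba8unorm", "rgba8uint", "rgba8sint",
  "rgba16uint", "rgba16sint", "rgba16float",
  "rgba32uint", "rgba32sint", "rgba32float",
]);

function readWriteSupport(
  format: string,
  caps: ReadWriteCaps,
): { d3d12: boolean; metal: boolean } {
  if (BASE_READ_WRITE_FORMATS.has(format)) {
    return { d3d12: true, metal: caps.metalReadWriteTier >= 1 };
  }
  if (EXTENDED_READ_WRITE_FORMATS.has(format)) {
    return {
      d3d12: caps.d3d12TypedUAVLoadAdditionalFormats,
      metal: caps.metalReadWriteTier >= 2,
    };
  }
  // "Optional" on D3D12 and "Not Supported" on Metal: treated as unsupported here.
  return { d3d12: false, metal: false };
}
```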
Sample Count
D3D12 requires the sample count to be 1 when D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS is set.
In the Metal Shading Language, multisampled textures only support the “read” access attribute (Chapter 2.8, “Textures”).
The Vulkan specification requires that if the multisampled storage image feature (“shaderStorageImageMultisample”) is not enabled and the usage contains VK_IMAGE_USAGE_STORAGE_BIT, samples must be VK_SAMPLE_COUNT_1_BIT. The “shaderStorageImageMultisample” feature has a coverage of 67%.
Resource Bindings
D3D12 defines a descriptor range type D3D12_DESCRIPTOR_RANGE_TYPE_UAV for UAVs in the root signature.
Metal treats textures used as storage textures the same as sampled textures (all of them should be set in the related argument tables). Metal defines two tiers for Argument Buffers, and writable textures are only supported in Tier 2.
Vulkan defines the descriptor type VK_DESCRIPTOR_TYPE_STORAGE_IMAGE for storage images. If descriptorType is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, the imageView member of each element of pImageInfo must have been created with VK_IMAGE_USAGE_STORAGE_BIT set.
Shader Stages
On D3D12, I could not find any restriction on the shader stages in which UAVs can be used. On D3D11 with feature level 11_0, UAVs can only be used in pixel shaders and compute shaders; with feature level 11_1+, UAVs can be used in all shader stages.
Metal supports texture reads (MTLTextureUsageShaderRead) in any shader stage, and texture writes (MTLTextureUsageShaderWrite) only in compute shaders before OSX_GPUFamily1_v2. Starting with OSX_GPUFamily1_v2, vertex and fragment functions can also write to textures.
The Vulkan specification requires that storage image loads be supported in all shader stages and that stores to storage images be supported in compute shaders.
- The Vulkan feature “fragmentStoresAndAtomics” specifies whether storage buffers and images support stores and atomic operations in the fragment shader stage. This feature has a coverage of 99%.
- The Vulkan feature “vertexPipelineStoresAndAtomics” specifies whether storage buffers and images support stores and atomic operations in the vertex, tessellation, and geometry shader stages. This feature has a coverage of 85%.
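To illustrate the Vulkan rules, the sketch below derives the shader stages in which storage image stores are available from the two feature bits. `VulkanStoreFeatures` and `vulkanStoreStages` are hypothetical names; only the two field names correspond to real Vulkan device features, and tessellation and geometry stages are left out because WebGPU does not expose them.

```typescript
type StoreStage = "vertex" | "fragment" | "compute";

// Hypothetical subset of the Vulkan device features relevant to stores.
interface VulkanStoreFeatures {
  fragmentStoresAndAtomics: boolean;
  vertexPipelineStoresAndAtomics: boolean;
}

// Returns the stages in which stores to storage images may be used.
function vulkanStoreStages(features: VulkanStoreFeatures): StoreStage[] {
  // Stores in compute shaders are always required by the specification.
  const stages: StoreStage[] = ["compute"];
  if (features.fragmentStoresAndAtomics) {
    stages.push("fragment");
  }
  if (features.vertexPipelineStoresAndAtomics) {
    stages.push("vertex");
  }
  return stages;
}
```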
Shader Operations
Sample, Load and Store
On D3D12, according to the HLSL documentation, Sample() is only allowed on read-only texture objects (Texture1D, Texture2D, etc.); it cannot be called on writable texture objects (RWTexture2D, RWTexture2DArray, etc.). Both read-only and writable texture objects support the Load() operation.
In the Metal Shading Language (Chapter 2.8), “sample” and “read” are different “access” attributes. “sample” means the texture can be read with or without a sampler; “read” means a graphics or kernel function can only read the texture object, without a sampler.
On Vulkan, according to the definition of OpTypeImage in SPIR-V, “Sampled” indicates whether or not this image will be accessed in combination with a sampler. In the Vulkan execution environment, OpTypeImage must have a “Sampled” operand of 1 (sampled image) or 2 (storage image). “Storage image” and “sampled image” have different usages in SPIR-V. SPIR-V provides OpImageRead to read a texel from an image without a sampler and OpImageWrite to write a texel to an image without a sampler. Both instructions require the “Image” operand to be an object whose type is OpTypeImage with a “Sampled” operand of 2.
Atomic Functions
On D3D12, feature level 11_0 devices support atomic operations (UAV Atomic Exchange, UAV Atomic Signed Min/Max, UAV Atomic Unsigned Min/Max, UAV Atomic Add, UAV Atomic Bitwise Ops and UAV Atomic Cmp&Store/ Cmp&Exch) on R32_UINT and R32_SINT.
In the Metal Shading Language (Chapter 6.13), atomic functions are only allowed on Metal atomic data, which does not include writable textures.
On Vulkan, the image atomic functions are supported on the formats with VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT. The Vulkan specification (Table 65) requires that VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT be supported on VK_FORMAT_R32_UINT and VK_FORMAT_R32_SINT.
Resource Limits
On D3D12, the resource limits for UAVs are defined together with the hardware tiers. The maximum number of UAVs in all descriptor tables across all stages is listed as follows:
- Tier 1: 64 for feature levels 11_1+, 8 for feature level 11_0
- Tier 2: 64
- Tier 3: full heap
According to the Metal document Metal-Feature-Set-Tables, the maximum number of entries in the texture argument table, per graphics or compute function are listed here:
- MTLGPUFamilyApple1, MTLGPUFamilyApple2, MTLGPUFamilyApple3 (A7 - A10): 31
- MTLGPUFamilyApple4, MTLGPUFamilyApple5 (A11, A12): 96
- MTLGPUFamilyApple6, MTLGPUFamilyMac1, MTLGPUFamilyMac2: 128
On Vulkan, the minimum required resource limits that are related to storage images in Vulkan SPEC are listed as follows:
- maxPerStageDescriptorStorageImages (4)
- maxDescriptorSetStorageImages (4 * 6, 6 is the number of shader stages)
- maxFragmentCombinedOutputResources (4)

maxFragmentCombinedOutputResources is the total number of storage buffers, storage images and output buffers which can be used in the fragment stage.
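The smallest guaranteed limit across the three APIs can be worked out directly from the numbers above. This is a rough comparison only: the constant names are hypothetical, and the D3D12 figure counts UAVs across all stages while the Metal and Vulkan figures are per function or per stage.

```typescript
// Minimum guaranteed counts quoted in this section (constant names are illustrative).
const d3d12Tier1FeatureLevel11_0Uavs = 8; // UAVs across all stages
const metalTextureTableEntries = 31;      // per function, A7 - A10
const vulkanPerStageStorageImages = 4;    // maxPerStageDescriptorStorageImages

// Vulkan per-set minimum: per-stage minimum times the 6 shader stages.
const vulkanPerSetStorageImages = vulkanPerStageStorageImages * 6;

// The strictest of the three minimums bounds what can be relied on everywhere.
const portableStorageTextureLimit = Math.min(
  d3d12Tier1FeatureLevel11_0Uavs,
  metalTextureTableEntries,
  vulkanPerStageStorageImages,
);
```

Under these numbers the Vulkan per-stage minimum is the smallest of the three.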
Resource Barriers
On D3D12 there are two types of barriers that are related to UAVs:
- Transition barrier (D3D12_RESOURCE_STATE_UNORDERED_ACCESS): a subresource must be in this state when it is accessed by the 3D pipeline via UAV.
- UAV barrier (D3D12_RESOURCE_UAV_BARRIER): indicates that all UAV accesses (read or write) to a particular resource must complete before any future UAV accesses (read or write) can begin.
On Metal devices that support OSX_GPUFamily1_v2, it is guaranteed that:
- Between Command Encoders, all resource writes performed in a given command encoder are visible in the next command encoder. This is true for both render and compute command encoders.
- Within a Render Command Encoder: for textures, the textureBarrier (deprecated, only available until macOS 10.14, use MTLCommandEncoder.memoryBarrierWithScope since macOS 10.14) method ensures that writes performed in a given draw call are visible to subsequent reads in the next draw call.
- Within a Compute Command Encoder: all resource writes performed in a given kernel function are visible in the next kernel function.
Vulkan defines image memory barriers that only apply to memory accesses involving a specific image subresource range.
- Image memory barriers can also be used to define image layout transitions or a queue family ownership transfer for the specified image subresource range.
- Vulkan SPEC requires if descriptorType is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, for each descriptor that will be accessed via load or store operations the imageLayout member for corresponding elements of pImageInfo must be VK_IMAGE_LAYOUT_GENERAL.
In addition, the Vulkan specification places severe restrictions on the use of image memory barriers within a render pass instance:
- If vkCmdPipelineBarrier is called within a render pass instance, the oldLayout and newLayout members of any element of pImageMemoryBarriers must be equal to the layout member of an element of the pColorAttachments, pResolveAttachments or pDepthStencilAttachment members of the VkSubpassDescription instance that the current subpass was created with, that refers to the same image.
- If vkCmdPipelineBarrier is called within a render pass instance, the oldLayout and newLayout members of an element of pImageMemoryBarriers must be equal.
Because of these restrictions, the group has agreed to not synchronize individual draw calls within a render pass.
Proposal
Now that we have added "storage-texture" in GPUBindingType and “STORAGE” in GPUTextureUsage, we can just discuss some details on the support of Storage Textures in WebGPU implementations.
- Textures that are used as writable storage textures cannot be multisampled, as this is not allowed in D3D12 and Metal.
- Maybe it is better to add “READONLY-STORAGE” as a new enum in GPUTextureUsage, because with this extra information we can easily know:
  - Whether we need to set D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS when creating textures on D3D12.
  - Whether we need to set MTLTextureUsageShaderWrite when creating textures on Metal.
  - If we allow “SAMPLED-TEXTURE” being used as “READONLY-STORAGE”, then on Vulkan we have to set both VK_IMAGE_USAGE_SAMPLED_BIT and VK_IMAGE_USAGE_STORAGE_BIT, which may hurt the performance of texture sampling if the texture is actually only used for sampling.
- Maybe we also need to add “readonly-storage-texture” as a new type of binding point, because we need to know the following information when we create the bind group layouts:
  - Whether we should use D3D12_DESCRIPTOR_RANGE_TYPE_UAV when creating the root signatures on D3D12.
  - Whether we should use VK_DESCRIPTOR_TYPE_STORAGE_IMAGE as the descriptorType member of a VkDescriptorSetLayoutBinding object used in the creation of a Vulkan graphics pipeline, as VK_DESCRIPTOR_TYPE_STORAGE_IMAGE and VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE are different VkDescriptorType enums.
- The color texture formats that are allowed to be writable storage textures on D3D12, Metal and Vulkan are summarized in the previous tables.
- The support of read-write storage textures has to be an extension because it is only supported on macOS 10.12+ and iOS 11+.
- Readable storage textures can be supported in all shader stages, and writable storage textures can only be supported in compute shaders.
  - The support of writable storage textures in fragment shaders has to be an extension as it is only available on macOS 10.12+.
  - The support of writable storage textures in vertex shaders has to be an extension as it requires D3D feature level 11_1+, macOS 10.12+ and the Vulkan feature “vertexPipelineStoresAndAtomics”.
- We suggest that the maximum number of storage images be 4, following the resource limits in Vulkan, which are the strictest among D3D12, Metal and Vulkan.
- We cannot support image atomic functions because this feature cannot be supported on Metal.
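Several of the proposed rules could be checked when a bind group layout is created. The sketch below is one possible shape for such a check and is not taken from any implementation: `StorageTextureBinding`, `MAX_STORAGE_TEXTURES_PER_STAGE` and `validateStorageBindings` are hypothetical names, and applying the limit of 4 per shader stage is an assumption based on the Vulkan per-stage limit.

```typescript
type BindingStage = "vertex" | "fragment" | "compute";

// Hypothetical, simplified view of one storage texture binding.
interface StorageTextureBinding {
  access: "read-only" | "write-only";
  visibility: BindingStage[];
  sampleCount: number;
}

// Assumption: the proposed limit of 4 is applied per shader stage.
const MAX_STORAGE_TEXTURES_PER_STAGE = 4;

// Returns a list of human-readable errors; an empty list means valid.
function validateStorageBindings(bindings: StorageTextureBinding[]): string[] {
  const errors: string[] = [];
  const perStage: Record<BindingStage, number> = { vertex: 0, fragment: 0, compute: 0 };

  bindings.forEach((binding, index) => {
    if (binding.access === "write-only") {
      // Writable storage textures cannot be multisampled.
      if (binding.sampleCount !== 1) {
        errors.push(`binding ${index}: writable storage textures cannot be multisampled`);
      }
      // Without extensions, writes are only allowed in compute shaders.
      if (binding.visibility.some((stage) => stage !== "compute")) {
        errors.push(`binding ${index}: writable storage textures are compute-only`);
      }
    }
    for (const stage of binding.visibility) {
      perStage[stage] += 1;
    }
  });

  const stages: BindingStage[] = ["vertex", "fragment", "compute"];
  for (const stage of stages) {
    if (perStage[stage] > MAX_STORAGE_TEXTURES_PER_STAGE) {
      errors.push(`${stage}: more than ${MAX_STORAGE_TEXTURES_PER_STAGE} storage textures`);
    }
  }
  return errors;
}
```

Returning a list of errors rather than throwing keeps the sketch easy to extend with further rules, such as a format check against the tables above.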