1. Introduction
In this issue we propose an extension that enables FP16, the 16-bit floating point type, in WGSL for the usages proposed below. To keep the issue to a reasonable length, detailed information about each backend's (D3D12, Metal, Vulkan, and OpenGL) support for each usage is omitted; it can be found in the more detailed design doc for this extension.
Although other low-bit-width data types, e.g. INT16 and INT8, may also be supported by backends, we focus on FP16 in this extension.
2. Proposal usages
This proposed extension, tentatively named `fp16-in-shaders-and-storage`, enables certain usages of FP16 in WGSL.
The considered usages of FP16 in WGSL, and each backend's support for them, are summarized in the table below. As a tradeoff between functionality and capability, we propose to enable all usages except atomic FP16, the `uniform` storage class, and pipeline input/output (some Vulkan devices support FP16 types within shaders and storage buffers, but don't support using them in uniform buffers). Usages of the `uniform` storage class and pipeline input/output can also be emulated on devices that don't support them natively. If such emulation is enabled, we can enable all usages except atomic FP16 in this extension, making this extension more general. Otherwise, the `uniform` storage class or pipeline input/output usages may be enabled by separate extensions.
| WGSL usage | Vulkan (SPIR-V) | D3D12 (HLSL) | Metal (MSL) | OpenGL (GLSL) |
| --- | --- | --- | --- | --- |
| Defining FP16 scalar, vector and matrix types | Requires at least one of the Vulkan features listed in this column | Requires SM6.2 or higher, using DXIL, and `Native16BitShaderOpsSupported == true` | Supported | Requires extension `AMD_gpu_shader_half_float` |
| Defining array and structure types with FP16 | Same as above | Same as above | Same as above | Same as above |
| Conversion and bit-casting between FP16 and other types | Same as above | Same as above | Same as above | Same as above |
| Atomic type/operation of FP16 | No support | No support | No support | No support |
| Using variables of FP16 types in `private`, `function` and `workgroup` storage classes | Requires Vulkan feature `shaderFloat16` (overall support rate 39.6%, reported by gpuinfo.org; the same source is used for the other rates in this column) | Requires SM6.2 or higher, using DXC, and `Native16BitShaderOpsSupported == true` | Supported | Requires extension `AMD_gpu_shader_half_float` |
| Using FP16 as parameters and results of user-defined functions | Same as above | Same as above | Same as above | Same as above |
| Using FP16 with built-in functions (arithmetical) | Same as above | Same as above | Same as above | Same as above |
| Using FP16 types in `uniform` storage class | Requires Vulkan feature `uniformAndStorageBuffer16BitAccess` (overall support rate 53.1%) | Same as above | Same as above | Same as above |
| Using FP16 types in `storage` storage class | Requires Vulkan feature `storageBuffer16BitAccess` (overall support rate 58.6%) | Same as above | Same as above | Same as above |
| Using FP16 types in pipeline input/output user-defined variables | Requires Vulkan feature `storageInputOutput16` (overall support rate 24.4%) | Same as above | Same as above | Same as above |
Please see the capability chapter in the proposed design doc for a detailed investigation of each backend's support for these usages.
The following sections will describe these usages.
3. FP16 data types
FP16 data types are supported with different requirements on each backend. If the backend supports the features listed above, we will be able to enable FP16 types in WGSL. We also propose to add value conversion and bit-cast expressions for FP16.
Atomic types and operations for FP16 are not allowed.
3.1. Scalar, vector and matrix types of FP16
Currently WGSL only has a 32-bit floating point scalar type, `f32`; floating point vectors and matrices all have component type `f32`. Here we propose the 16-bit floating point scalar type, `f16`, and corresponding vector and matrix types. If the backend meets the requirements for FP16 support, the proposed WGSL types may be used in WGSL code.
A brief list of the WGSL `f16` types enabled by this extension is shown below. A detailed table of WGSL FP16 types and the corresponding backend types is listed in the appendix of the design doc.
| WGSL Type | Description |
| --- | --- |
| `f16` | The set of 16-bit floating point values of the IEEE-754 binary16 (half precision) format |
| `vecN<f16>` | Vector of N elements of type `f16` |
| `matNxM<f16>` | Matrix of N columns and M rows with component type `f16` |
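As an illustration, a minimal sketch of the proposed types (the `enable` directive name here is hypothetical, not part of this proposal's final naming):

```wgsl
// Hypothetical sketch; the directive name is illustrative only.
enable fp16_in_shaders_and_storage;

fn demo() {
  let s : f16 = 1.5hf;                                // scalar, proposed `hf` suffix
  let v : vec3<f16> = vec3<f16>(0.0hf, 1.0hf, 2.0hf); // 3-component f16 vector
  let m : mat2x2<f16> = mat2x2<f16>(v.xy, v.yz);      // 2x2 f16 matrix from columns
}
```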
3.2. Array and structure with FP16 types
HLSL, MSL, SPIR-V, and GLSL all support defining FP16 members in arrays and structures, provided FP16 is supported. Therefore, we may enable FP16 types in arrays and structures whenever FP16 scalar, vector and matrix types are enabled.
Using FP16 types in uniform buffers and storage buffers may impose additional memory layout restrictions; please see the corresponding chapters in the proposed design doc (uniform, storage).
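For illustration, a sketch of composite types with `f16` members (type and member names are hypothetical; the syntax follows current WGSL struct declarations):

```wgsl
// Hypothetical sketch: f16 inside arrays and structures.
struct Particle {
  position : vec3<f16>,  // f16 vector member
  mass     : f16,        // f16 scalar member
};

var<private> weights : array<f16, 4>;          // fixed-size f16 array
var<private> particles : array<Particle, 16>;  // array of structs with f16 members
```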
3.3. Type constructor expressions and conversion expressions
This extension proposes scalar, vector and matrix constructor expressions for `f16` in WGSL, in the same way as for `f32`.
We also propose value conversion expressions between `f16` and the other scalar types (`f32`, `u32`, `i32` and `bool`). Value conversion expressions between matrix types (`f16` and `f32` matrices) are introduced by this extension; these did not exist before, since `f32` was previously the only matrix component type in WGSL. Please see the appendix for a detailed table.
Bit-casting between `f16` and `u32` can be done via the WGSL packing and unpacking built-in functions. The most natural bit-cast, between `f16` and `u16`, is not proposed in this extension because WGSL does not have `u16` types yet. Bit-casting between `f16` and types other than `u16` or `u32` shall not be allowed in WGSL.
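A sketch of the proposed conversions, assuming the `f16` overloads of `pack2x16float`/`unpack2x16float` discussed in section 5.2 (function names and overloads here are assumptions of this sketch):

```wgsl
// Hypothetical sketch: value conversions and f16<->u32 bit-casting.
fn convert(x : f32, n : i32) -> f16 {
  let a = f16(x);   // f32 -> f16 value conversion
  let b = f16(n);   // i32 -> f16 value conversion
  return a + b;
}

fn bitcast_roundtrip(v : vec2<f16>) -> vec2<f16> {
  // Today these built-ins take/return vec2<f32>; f16 overloads are assumed.
  let bits : u32 = pack2x16float(vec2<f32>(v)); // two f16 values in one u32
  return vec2<f16>(unpack2x16float(bits));
}
```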
3.4. Atomic type of FP16
No backend has useful support for atomic FP16 types and/or operations. Therefore, we are unlikely to support atomic types for FP16. The details for each backend are summarized in the table below.
| Backend | Details |
| --- | --- |
| D3D12 | HLSL does not support atomic operations for float16. |
| Vulkan | SPIR-V only supports `OpAtomicFAddEXT` for 32-bit and 64-bit floating point, requiring the `AtomicFloat32AddEXT` and `AtomicFloat64AddEXT` capabilities. The extension `SPV_EXT_shader_atomic_float_min_max` adds atomic min and max instructions on floating-point numbers, and the extension `SPV_EXT_shader_atomic_float16_add` proposes atomically adding to 16-bit floating-point numbers in memory; however, these are still in internal status. The Vulkan extension `VK_EXT_shader_atomic_float2` (features) suggests 16-bit floating-point atomic operations on buffer and workgroup memory, as well as floating-point atomic minimum and maximum operations on buffer, workgroup, and image memory; however, such features are generally unsupported (reported 99.8% unsupported). |
| Metal | MSL only supports atomic types of `int`, `uint` and `bool`. |
| OpenGL | GLSL only supports atomic types of `int`, `uint` and `bool`, and OpenGL only has atomic counters of integer type. |
3.5. FP16 literal
In WGSL the floating point suffix `f` denotes `f32`. Following the GLSL extension, we propose the suffix `hf` for `f16`. The forms of the numeric literals would be as shown below.
| Type | Form |
| --- | --- |
| `decimal_float_literal` | `/((-?[0-9]*\.[0-9]+\|-?[0-9]+\.[0-9]*)((e\|E)(\+\|-)?[0-9]+)?(h?f)?)\|(-?[0-9]+(e\|E)(\+\|-)?[0-9]+(h?f)?)/` |
| `hex_float_literal` | `/-?0[xX]((([0-9a-fA-F]*\.[0-9a-fA-F]+\|[0-9a-fA-F]+\.[0-9a-fA-F]*)((p\|P)(\+\|-)?[0-9]+(h?f)?)?)\|([0-9a-fA-F]+(p\|P)(\+\|-)?[0-9]+(h?f)?))/` |
Some examples of the proposed FP16 literals are `-.23E+8hf`, `123e-4hf`, `0xABC.DEFhf` and `0x1A2BP-3hf`.
Note
HLSL and MSL use the suffix `h` instead of `hf` for FP16.
4. Defining FP16 variables in `private`, `function` and `workgroup` storage classes
This extension will enable using FP16 types when defining variables in the `private`, `function` and `workgroup` storage classes. If a backend device is eligible for this extension, we can directly translate variable definitions in these storage classes into the backend's shader code.
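A minimal sketch of such declarations (all names are hypothetical):

```wgsl
// Hypothetical sketch: f16 variables in the three storage classes.
var<private> accumulator : f16;              // private storage class
var<workgroup> tile : array<vec4<f16>, 64>;  // workgroup storage class

fn scale(x : f16) -> f16 {
  var local : f16 = x;                       // function storage class
  local = local * 2.0hf;
  return local;
}
```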
5. Using FP16 types as function parameter and result
This extension enables using FP16 types as parameters and results of built-in and user-defined functions, in the same way as FP32 types. Some FP16-specific built-in functions should also be added to WGSL. In this section we discuss this usage.
5.1. User-defined functions
All backends support defining functions with 16-bit floating point parameters and/or return values, as long as they support FP16 in the `function` storage class. Therefore, WGSL functions using `f16` as input/output can map to backends the same as any other functions.
5.2. Built-in functions
All WGSL built-in functions that take `f32` should support `f16` as parameters and results, as most floating point built-in functions have half-precision overloads on all backends.
All data packing and unpacking built-in functions should have overloads that take FP16 types as parameters and results where `f32` originally appears. Bit-casting between `f16` and `u32` can be done with these packing and unpacking functions. HLSL appears to have no packing/unpacking functions for FP16, nor bit-casts between FP16 and uint32, so we may need to implement them manually.
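A sketch of a user-defined function combined with assumed `f16` overloads of existing built-ins (the overloads of `dot` and `clamp` are part of the proposal, not current WGSL):

```wgsl
// Hypothetical sketch: f16 overloads of arithmetic built-ins.
fn lambert(n : vec3<f16>, l : vec3<f16>) -> f16 {
  // dot and clamp are assumed to gain f16 overloads alongside their f32 forms.
  return clamp(dot(n, l), 0.0hf, 1.0hf);
}
```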
6. Using FP16 types in `uniform` and `storage` storage classes
Variables in the `uniform` and `storage` storage classes serve as the interface between GPU and host memory, and therefore have to meet additional requirements. We propose the memory layout requirements for FP16 in this chapter.
6.1. Alignment and Size
The following table should be merged into the WGSL spec table "Alignment and size for host-shareable types".
| WGSL Type | Alignment (bytes) | Size (bytes) |
| --- | --- | --- |
| `f16` | 2 | 2 |
| `vec2<f16>` | 4 | 4 |
| `vec3<f16>` | 8 | 6 |
| `vec4<f16>` | 8 | 8 |
| `mat2x2<f16>` | 4 | 8 |
| `mat2x3<f16>` | 8 | 16 |
| `mat2x4<f16>` | 8 | 16 |
| `mat3x2<f16>` | 4 | 12 |
| `mat3x3<f16>` | 8 | 24 |
| `mat3x4<f16>` | 8 | 24 |
| `mat4x2<f16>` | 4 | 16 |
| `mat4x3<f16>` | 8 | 32 |
| `mat4x4<f16>` | 8 | 32 |
These values match the MSL and Vulkan specs. If the size or alignment of a structure member is not explicitly specified in WGSL code, the listed value is used. If explicitly specified, it must satisfy the layout constraints for the corresponding storage class shown below.
Note
The "Internal Layout of Values" section of the WGSL spec should also be updated.
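To illustrate the table, a host-shareable struct whose implicit member offsets follow the listed alignments and sizes (struct and member names are hypothetical):

```wgsl
// Hypothetical sketch: implicit layout of f16 members per the table above.
struct Params {
  scale : f16,          // offset 0,  size 2,  align 2
  bias  : f16,          // offset 2,  size 2,  align 2
  color : vec4<f16>,    // offset 8,  size 8,  align 8
  m     : mat3x3<f16>,  // offset 16, size 24, align 8
};                      // total size rounds up per struct layout rules
```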
6.2. Using FP16 types in `uniform` storage class
When using FP16 types in the `uniform` storage class, besides meeting the alignment requirements above, the extra buffer layout constraints for uniform buffers must also be satisfied, as shown below. These requirements are identical to the current WGSL spec.
| Host-shareable type S | RequiredAlignOf(S, `uniform`) |
| --- | --- |
| `i32`, `u32`, `f32`, or `f16` | AlignOf(S) |
| `atomic<T>`, T is `i32`, `u32`, or `f32` | AlignOf(S) |
| `vecN<T>` | AlignOf(S) |
| `matNxM<T>`, T is `f32` or `f16` | AlignOf(S) |
| `array<T, N>` | round(16, AlignOf(S)) |
| `array<T>` | round(16, AlignOf(S)) |
| `struct S` | round(16, AlignOf(S)) |
The `matNxM<f16>` matrix types should be stored as N distinct `vecM<f16>` column vectors (not an array of vectors) and reconstructed into a matrix by generated code. In this way we work around the matrix stride requirements of the different backends.
We can emulate this usage on Vulkan devices with no native support for it (`uniformAndStorageBuffer16BitAccess`) by packing FP16 values into UINT32 in the backend buffer structure and unpacking them in the backend when loading in WGSL.
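A sketch of what the u32-packing emulation could look like, expressed as the WGSL equivalent of the generated backend code (all names are illustrative; in practice the packing would be emitted by the WGSL-to-backend translator):

```wgsl
// Hypothetical sketch: emulating a vec4<f16> uniform member via u32 packing.
struct PackedParams {
  color_packed : vec2<u32>,  // stores a logical vec4<f16> as two u32 words
};
@group(0) @binding(0) var<uniform> params : PackedParams;

fn load_color() -> vec4<f16> {
  let lo = unpack2x16float(params.color_packed.x); // first two components
  let hi = unpack2x16float(params.color_packed.y); // last two components
  return vec4<f16>(vec4<f32>(lo, hi));
}
```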
6.3. Using FP16 types in `storage` storage class
When using FP16 types in the `storage` storage class, the buffer layout constraints for storage buffers are validated, as shown below.
| Host-shareable type S | RequiredAlignOf(S, `storage`) |
| --- | --- |
| `i32`, `u32`, `f32`, or `f16` | AlignOf(S) |
| `atomic<T>`, T is `i32`, `u32`, or `f32` | AlignOf(S) |
| `vecN<T>` | AlignOf(S) |
| `matNxM<T>`, T is `f32` or `f16` | AlignOf(S) |
| `array<T, N>` | AlignOf(S) |
| `array<T>` | AlignOf(S) |
| `struct S` | AlignOf(S) |
In fact, this just requires everything to be aligned to its type's alignment.
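A sketch of an f16 storage buffer that satisfies these constraints (binding numbers and names are hypothetical):

```wgsl
// Hypothetical sketch: f16 data in a storage buffer.
struct Samples {
  count : u32,         // offset 0, align 4
  data  : array<f16>,  // runtime-sized; element align/size 2
};
@group(0) @binding(0) var<storage, read_write> samples : Samples;

fn halve(i : u32) {
  samples.data[i] = samples.data[i] * 0.5hf;
}
```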
7. Using FP16 as pipeline input and output
It is possible to use FP16 in user-defined pipeline input/output variables. However, some Vulkan devices that support the other usages in this extension, such as FP16 built-in functions and storage buffers, don't support this usage. For capability reasons, we may suggest not enabling FP16 as pipeline input or output with this extension. However, this usage can also be emulated on Vulkan devices with no native support for it (`storageInputOutput16`).
7.1. Built-in input/output variable
Built-in floating-point pipeline input and output variables (e.g. `position` and `frag_depth`) should be kept as `f32`.
7.2. User-defined input/output variable
Using user-defined variables of FP16 types as shader input and output (and intra-shader variables) is partially supported by backends. MSL, HLSL (when supporting native 16-bit mode) and GLSL (with the `AMD_gpu_shader_half_float` extension) support this usage, but Vulkan devices diverge. E.g., Nvidia Vulkan Windows drivers, Intel Linux Mesa drivers, and Google Pixel (before Pixel 6) all lack the required Vulkan feature `storageInputOutput16`. It is possible to emulate FP16 types as intra-shader variables for devices not supporting `storageInputOutput16`.
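A sketch of f16 user-defined inter-stage variables, with built-in outputs kept as `f32` per section 7.1 (locations and names are illustrative):

```wgsl
// Hypothetical sketch: f16 user-defined pipeline input/output.
struct VSOut {
  @builtin(position) pos : vec4<f32>,  // built-ins stay f32
  @location(0) uv : vec2<f16>,         // user-defined f16 inter-stage variable
};

@vertex
fn vs_main(@location(0) in_pos : vec3<f32>) -> VSOut {
  var out : VSOut;
  out.pos = vec4<f32>(in_pos, 1.0);
  out.uv = vec2<f16>(in_pos.xy);
  return out;
}
```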
8. Capability
Different vendors have different support for FP16-related features, and different backend drivers from the same vendor also differ in their level of support.
A detailed investigation of device support is given in the proposed design doc.