
Space efficient wide color support #4108

@litherum


Background

(If you already understand color theory, please feel free to skip this section.)

Let's discuss wide color. This is a term that probably requires some explanation.

For the sake of simplicity, let's pretend that sRGB is a 2-dimensional color space - just so we can visualize it a bit easier. Here's our visualization of all the colors in sRGB:

[Image: 0 basic (the simplified 2D sRGB color space)]

Note how bottom left is black, top right is white, the Y axis is increasing red, and the X axis is increasing blue. The origin is at the bottom left of the image.

The "gamut" of sRGB represents the set of colors present in sRGB. Here's a visualization of the boundary of the gamut:

[Image: 1 simplegamut (the boundary of the sRGB gamut)]

Okay. The goal here is to be able to represent colors more highly saturated than possible in sRGB. Here's a visualization of the colors we want to represent (also showing the gamut of sRGB as a reference):

[Image: 2 widecolor (the wider set of colors we want to represent)]

This is wide color. (It looks "wide" if you tilt your head to the side a bit...) We are essentially defining a new gamut, which is bigger than sRGB's gamut. All the colors within sRGB are present within our new "wide" color space, and then some.

[Image: 3 gamuts (the sRGB gamut inside the new wide gamut)]

Note that the black point and the white point are the same - we haven't changed what black means, and we aren't supporting any brighter whites; we're only trying to represent more saturated colors: bluer blues, and redder reds.

(Aside: This is different from HDR, which would have a picture like this:)

[Image: 4 hdr (the HDR case, for comparison)]

(The same colors are present in HDR, but the values can go (much) beyond 1.0.)

Anyway, back to wide color. Let's imagine that our sRGB colors are represented with only 2 bits per channel. That would mean only these specific dots are representable:

[Image: 5 representable values (the representable dots within sRGB)]

Okay, so what do we do for wide color? It would be tempting to just move the dots around to cover the new larger gamut, like this:

[Image: 6 representable values stretched (the same dots spread over the wider gamut)]

... But that's bad. The new gamut is bigger than the old gamut, so if we use the same number of dots, they get farther apart. This is bad because the distance between the dots is an intentional result of the biological makeup of human eyes. Human eyes only have so much fidelity to distinguish adjacent colors, so there is an upper bound on how far apart adjacent representable colors can be in our color system. If the dots are too far apart, there are colors that humans can distinguish but that our system cannot represent, which shows up as banding; whereas if the dots are too close together, you're just wasting information/bandwidth.

Okay, so we need more dots. So we want our system to look like this, instead:

[Image: 7 deep color (more dots covering the wider gamut)]

The fact that there are more dots than there were before is called "deep color."
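
To put rough numbers on that (these figures are purely illustrative, not taken from any real format): with the same number of codes, stretching the range roughly doubles the gap between adjacent representable values, while adding two more bits per channel brings the gap back down.

```ts
// Illustrative only: spacing between adjacent representable values.
// The "2.0" range width is a stand-in for "wider than sRGB", not a real number.
function stepSize(codeCount: number, rangeWidth: number): number {
  return rangeWidth / (codeCount - 1);
}

console.log(stepSize(256, 1.0));  // 8 bits over [0, 1]          -> ~0.0039
console.log(stepSize(256, 2.0));  // same 256 codes, wider range -> ~0.0078 (dots too far apart)
console.log(stepSize(1024, 2.0)); // 10 bits, wider range        -> ~0.0020 ("deep color")
```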

Okay, so now we have to map values to those points. It is tempting to just map the entire gamut from 0 to 1, like this:

[Image: 8 colorspace conversion (the whole gamut mapped onto 0 to 1)]

But there's another possible way to do it, like this:

[Image: 9 xr formats (sRGB values unchanged, extra colors extend beyond 0 to 1)]

This second way is a bit less intuitive, but the benefit is that compositing with sRGB content becomes trivial. If you use the first way, the compositor has to find a common color space, then map each color into it, then blend. However, if you use the second way instead, the color system is backwards compatible with sRGB, so you don't have to do any colorspace conversion to blend. Just blend the values you've got.
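
Here's a small sketch of that difference, using hypothetical helper names (none of this is a real WebGPU or Metal API): with the first mapping the compositor has to convert the sRGB source into the wide encoding before blending, while with the second it can blend the stored numbers directly.

```ts
type RGB = [number, number, number];

// Hypothetical conversion for the first approach, where the whole wide gamut
// is remapped onto 0..1 (the real math would depend on the chosen gamut).
declare function srgbToRemappedWide(c: RGB): RGB;

function mix(dst: RGB, src: RGB, a: number): RGB {
  return [0, 1, 2].map(i => src[i] * a + dst[i] * (1 - a)) as RGB;
}

// First way: an extra colorspace conversion step before blending.
function blendRemapped(dst: RGB, srgbSrc: RGB, a: number): RGB {
  return mix(dst, srgbToRemappedWide(srgbSrc), a);
}

// Second way: extended-range values are backwards compatible with sRGB, so the
// stored numbers can be blended directly, with no colorspace conversion.
function blendExtended(dst: RGB, srgbSrc: RGB, a: number): RGB {
  return mix(dst, srgbSrc, a);
}
```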

Motivation

We already have FP32 and FP16 pixel formats. However, the downside to those is that holding an RGB color takes 64 (or even 128) bits, compared to the 32 bits that authors use today to hold RGB8 colors.

But doubling the memory use is a real shame, because the wide color gamut is only a little bit bigger than the sRGB gamut. It's nowhere near twice as big.

FP16 makes total sense for HDR rendering, where the new gamut really is multiple times as big as the sRGB gamut. But for wide color, it's a big waste.

This is particularly important because, in order for wide color to be useful during rendering, the materials in the world need to be authored in wide color too. This means that not just the render targets would need to support wide color; all texture assets would need it too. So we're not talking about a potential doubling of size of just a few resources; instead, we're talking about a doubling of size of potentially every texture. Doubling video memory use would probably be a dealbreaker for many content creators.

Prior Art

Metal has a family of pixel formats that match the behavior described above:

  • They are 10 bits per channel, meaning you get more representable values, thereby preserving the maximum distance between adjacent representable values and avoiding banding
  • Backwards compatible. If you sample these new pixel formats and you get a value of (1, 0, 0), that means the same thing as if you had sampled a regular RGB texture and gotten the same value
  • The range of each channel is from -0.752941 to 1.25098. This is good because it goes beyond 0-1 by just enough, but not by a huge amount
  • The non-alpha formats are still 32-bits-per-texel, just like RGB8. The extra bits necessary for the extended range are taken from the alpha channel.

The specific texture formats are:

  • MTLPixelFormatBGR10_XR
  • MTLPixelFormatBGR10_XR_sRGB
  • MTLPixelFormatBGRA10_XR
  • MTLPixelFormatBGRA10_XR_sRGB

These Metal formats can be used for anything (including blitting) on Apple Silicon devices, but are not supported on Intel-based Macs. When blitting, they have a particular bit pattern, described in the docs above.
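
As a rough illustration of how that range fits in 10 bits, here's a decode that back-derives a bias and scale from the -0.752941 to 1.25098 endpoints quoted above. This is my own arithmetic, not Apple's documented bit pattern, so treat the constants as assumptions.

```ts
// Back-derived sketch (not Apple's documented encoding): decoding a 10-bit
// code as (code - BIAS) / SCALE reproduces the endpoints quoted above.
const BIAS = 384;  // assumed: 384 / 510 = 0.752941...
const SCALE = 510; // assumed: adjacent codes are 1/510 ≈ 0.00196 apart

function decodeXr10(code: number): number {
  return ((code & 0x3ff) - BIAS) / SCALE;
}

console.log(decodeXr10(0));    // -0.752941... (most negative value)
console.log(decodeXr10(384));  //  0.0         (sRGB black, unchanged)
console.log(decodeXr10(894));  //  1.0         (sRGB white, unchanged)
console.log(decodeXr10(1022)); //  1.250980... (largest extended value)
```

Note that under this reading the step between adjacent codes (1/510) is finer than RGB8's 1/255, which lines up with the "more representable values, avoiding banding" point above.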

DirectX has just a single format like this: DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM. The docs say:

A four-component, 32-bit 2.8-biased fixed-point format that supports 10 bits for each color channel and 2-bit alpha.

I didn't test this, but this would seem to indicate a range from -2 to +2.
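
For what it's worth, the "2.8-biased fixed point" phrasing by itself pins down the spacing but not the range: 8 fractional bits put adjacent codes 1/256 apart, so the 10-bit span is about 4.0 wide, and the bias decides where that window sits. The sketch below leaves the bias as a parameter because I haven't checked the actual D3D conversion rules.

```ts
// Arithmetic sketch of a 10-bit channel with 8 fractional bits and a bias.
// The real bias (and therefore the exact range) comes from the D3D conversion
// rules, which are not reproduced here.
function decode28Biased(code: number, bias: number): number {
  return (code & 0x3ff) / 256 - bias;
}

// A bias of 2.0 would give roughly the -2..+2 range guessed at above:
console.log(decode28Biased(0, 2.0));    // -2.0
console.log(decode28Biased(1023, 2.0)); // ~1.996
```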

Also, it looks like support is behind a hardware capability bit somewhere, but the docs aren't very clear about this. It doesn't appear on the "Required DXGI Formats" documentation page. I couldn't find any documentation about which operations are supported on textures of this format.

Challenges

The XR formats do not have the same behavior in Metal and D3D12:

  • The Metal formats are BGR, not RGB
  • The 32-bit Metal formats don't have an alpha channel, but the D3D12 one does
  • The bit pattern for blitting is almost certainly different
  • D3D12 doesn't have sRGB variants (for gamma)

Possible paths forward

  1. (The null path) Throw up our hands and admit defeat. Tell authors that they need to double their memory use if they want wide color.
  2. Expose multiple distinct optional features, with the Metal formats behind a different feature than the D3D12 format.
    1. I think this would violate the "Requirements for Additional Functionality" document
  3. Invent our own meta format which turns into DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM on D3D12 and MTLPixelFormatBGR10_XR (the closest option) on Metal. Restrict the use of these textures to just situations where the differences are unobservable.
    1. This is probably too restrictive to be useful; I expect the only operation they would have in common would be usable as a render target
  4. Invent our own meta format (like (3) above), but try to claw back some of that functionality by polyfilling it. E.g. implement blits via a compute shader (see the sketch after this list).
  5. Invent our own meta format (like (3) above), but intentionally give up on the goal of portability. Spec the new meta format as "it behaves either like DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM or like MTLPixelFormatBGR10_XR." Tell authors to write their code in a way that works for both (somehow).
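
For option 4, here's a very rough sketch of what a polyfilled "blit" could look like: a WGSL compute shader that reads each texel from a source view and writes it to a destination storage view. The format choices and bindings are placeholders (WebGPU has no XR meta format today), so this only illustrates the shape of the polyfill, not a working implementation.

```ts
// Placeholder sketch of a blit-by-compute-shader polyfill. The storage format
// and the way the XR bit pattern would round-trip are assumptions, not spec.
const blitWGSL = /* wgsl */ `
  @group(0) @binding(0) var src : texture_2d<f32>;
  @group(0) @binding(1) var dst : texture_storage_2d<rgba16float, write>;

  @compute @workgroup_size(8, 8)
  fn blit(@builtin(global_invocation_id) id : vec3u) {
    let dims = textureDimensions(src);
    if (id.x >= dims.x || id.y >= dims.y) {
      return;
    }
    // Copy the (extended-range) value through unchanged; a real polyfill would
    // have to pick formats that preserve the XR encoding exactly.
    textureStore(dst, id.xy, textureLoad(src, id.xy, 0));
  }
`;
```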

There is a related question about how to actually show wide colors in canvas, but that isn't an issue for the WebGPU standardization group.
