GPU Web 2024 09 11

GPU Web WG 2024-09-11 Atlantic-time

Chair: CW

Scribe: KR

Location: Google Meet

Tentative agenda

Administrivia
CTS Update
[M0] Error on viewFormat with no compatible usage? #4852
[M0] textureGather issues #4857
[M0] Canvas contents after device destroy or device loss #4859
Compatibility Mode: Support OpenGLES devices with zero SSBOs and image uniforms in vertex, fragment stages #4721
Hardware feature levels (and compat) #4656
- PR: https://github.com/gpuweb/gpuweb/pull/4868
[M1] Feature detection for HDR canvas #4828
Agenda for next meeting

Attendance

Apple
- Mike Wyrzykowski
- Myles C. Maxfield
Google
- Brandon Jones
- Corentin Wallez
- Geoff Lang
- Gregg Tavares
- Kai Ninomiya
- Ken Russell
Microsoft
- Rafael Cintron
Mozilla
- Jim Blandy
- Kelsey Gilbert
Nvidia
- Anders Leino
Xiaoshen X

Administrivia

CW: Add items to the F2F agenda pls!
FD: The current charter expires end of November. We are looking into re-chartering. The main change that we envision is switching to the eternal CR model ("living spec"). Plan is for the group to review and confirm the draft charter during our upcoming F2F end of October
- CW: would be nice to declare we're ready for CR at F2F. Kelsey had nice suggestion of 45 day review period. Can then go on holidays with a feeling of accomplishment. :)

CTS Update

Continued progress. Finding bugs in implementations and drivers :)

[M0] Error on createTexture’s viewFormat with no compatible usage? #4852

(not createTextureView)
BJ: we have opportunity to raise an error if you create a textureview with a viewformat incompatible with the usages on the top-level texture
BJ: previously wanted to have browser raise a warning. CW says warning might still be the right way to handle this because otherwise app has to
KG: generally in favor of permitting the useless thing
Consensus: leave this as a warning

[M0] textureGather issues #4857

GT: lots of stuff broken :) textureGather works on NVIDIA and Apple Silicon. Lots of problems on other GPU types right now. Check the bug for details.
JB / CW: Linux has 2 different open-source AMD drivers right now.
CW: point is, textureGather's broken for a lot of use cases on a lot of drivers. 1) computations are wrong in some way, like incorrect clamping at the edge of cube maps. it returns 0. etc. 2) mac amd textureGather of a stencil texture, which hangs the machine.
GT: just using it doesn't hang the machine; something about the test does. the part of the test harness which makes a lot of texture samples and tries to figure out exactly where things went wrong does hang the machine.
KG: easy thing - disallow for the really broken use cases?
CW: doesn't disallow the stencil hang.
KG: could iterate from there. Are we OK dropping some of these in the chart?
MM: what's the polyfill story?
GT: tried polyfilling cube coords. Worked on systems that didn't need it, but didn't work on the systems (Mac Intel specifically) that did need it. That was 3D array coord. Using textureSampleLevel. 3D - the emulation I tried didn't work.
GL: is the other commonality integer formats?
GT: no, the emulation was for floating-point formats.
GL: the driver bugs. textureGather for int formats?
GT: lots of places. unorm, etc.
JB: if there aren't polyfills, have to decide whether it's better to ship and say it's a driver bug, or to withdraw the feature.
CW: Geoff pointed out that textureGather of depth cubemaps is used for shadows in engines.
GL: yes. Blending of shadow samples. Cube map shadows are a thing. Don't know if people are using them together.
KG: would be useful to get confirmation from IHVs that these are problems.
CW: there's hardware errata (found in Mesa) about hardware gather of uint textures.
MM: what actions would we take in response?
KG: confirmation that we're not doing something wrong in our code. Off-chance they say they'd like to fix it quickly.
MM: if they tell us how to polyfill it, that'd change our action. We'd polyfill and continue to ship the API.
KG: maybe. You do textureGather because of performance; unfortunate to polyfill it into samples.
JB: if games are using gather on this hardware - and when we try it it's broken - that's weird.
MM: options (a) throw the feature out (b) try to carve out a particular set of configurations that games use and work. Everything else not allowed.
BJ: issue is really just with uint/sint values. Gather's almost exclusively used on float / depth16 values, for shadow mapping. Path games are using is the one that's known good, and working. Not sure why you'd gather from these formats.
KN: breaks with ints, offsets, or cubes.
GT: just double-checked. Win/Intel - cube coords are all broken, where it wraps. Maybe you don't notice that problem.
KG: might have been a bug when GPUs transitioned to seamless cubemaps everywhere.
KN: if we discovered that, maybe we'd clamp all textureGathers so it's inside a face.
CW: I don't want to revert back to 20 year old behavior because of this.
CW: seems cube and cube array kind-of-work. (Deferring the edge/corner case.) Say cubes are just a driver bug, we report it, it gets fixed. Then there's uint. Until we figure out if we can workaround it, suggest we disallow it. Then there's offsets
BJ: offsets seem like the easiest one to polyfill.
GT: for gather it's easy to polyfill. You add the offset to every mip. Can't add it to the normalized coordinate
MM: possible to polyfill, just have to do filtering in software.
SW: also what's the wrapping behavior?
GT: supposed to be the same.
CW: proposal: leave in cube maps, report bugs. leave in offsets, report the bug to pixel team. Disallow uint for now.
GT: sounds good to me.
CW: Google will report bugs to AMD, Intel, and Pixel team
MM: we probably want a test in the CTS that tests the edge case only working on some devices. Maybe make that test a "may pass" rather than "must pass".
KN: we'll talk with the vendors whether this is a bug or undefined behavior. But if it's a driver bug want to keep it as a requirement.
MM: from an author's point of view, they don't care whether the bug's in the driver or the WebGPU implementation.
CW: think we're in agreement about the goal of the CTS.

[M0] Canvas contents after device destroy or device loss #4859

KN: going in same direction as last meeting. Good argument that we shouldn't make implementations keep canvases alive after the device is lost or destroyed. There's a way to manually save rendered results - ImageBitmap, etc. Proposal - change spec. When device is lost, change how the canvas is displayed.
KG: A concrete usecase for device loss preserving data is when we kill devices in long-idle background tabs, or when we hit some global limits and want to free up space from (long-?)idle devices. In this case, the least disruptive thing for a user is showing the same canvas contents that the tab had when it was last active. Making some attempt at the browser level to preserve it. There's some desire for device destroy to behave the same as device loss, esp. for testing. If we want device destroy == device loss, want the same behavior. There are other details. Long-idle background device isn't mid-frame in rendering.
KN: that's a good point. I was looking for reasons we'd need to support one device being lost naturally.
BJ: in that case where you have something collected for memory reasons on long-backgrounded tab. Regardless, when you go back to the page it'll be broken. You might see the preserved image but it'll be broken - non-interactive. If developer wants to avoid that they need to handle context loss, and if they do, the page will behave correctly. I'd rather the page show blank canvas / sad canvas rather than something that looks at the surface level like it's working, but actually isn't. Sounds like we'll get bug reports like "mouse interaction's broken on this page" when in reality the device was lost.
KN: yes, think apps should handle that, and they have a way to.
JB: if tab's navigated away from, it stops getting rAFs. Foregrounded, it starts getting them again. so if it's foregrounded it can recover. Or if user starts interacting, it'll get the device loss callback. Not clear to me that killing the device results in broken pages when you come back to them.
CW: here it sounds like we're only talking about losing / destroying the device after rAF. Can we agree on what happens if you destroy the device before rAF? Is that valid?
KN: 2 options. 1) goes blank 2) never display the frame. Discard that frame, keep the old one.
JB: in Moz's discussions, prefer to retain frame N-1. Brandon brought up in discussions that that'd constrain implementations. If we decide the behavior’s valuable, we prefer to constrain impls.
BJ: Chrome and maybe Moz supply textures for the canvas from outside the WebGPU device. So when device goes away the texture isn't destroyed. Could say that all browsers have to do that - and good likelihood that they already are. But is this an impl detail we want to force on WebGPU implementers?
JB: we're suggesting exactly that.
MW: initially Safari was along the opinion of preserving contents. But allowing contents to be cleared has a benefit - device loss due to memory pressure, can also deallocate the backing store previously allocated. Depending on memory pressure severity, may need to do this to avoid the OS crashing the GPU sub-process.
CW: this is if device is lost after rAF?
MW: can happen at any time
KN: relevant either way
JB: all we're talking about is "should" vs. "must" language.
CW: could spec what happens on device destroy. Not device loss.
JB: OOM seems more like device loss than device destroy.
CW: how about - if before rAF ends, you destroy the device - previous contents are displayed, new content is never displayed. (MW: OK.) If device lost after rAF, try to do same thing as device destroy, just might not be able to.
JB: this is a portable question.
RF: I don't care too much what we spec about manually calling device.destroy(). When it does get removed for real, there's no preserving - the old state doesn't exist any more. Read back on periodic basis? Don't think we should.
KG: that's why I think we should say "should" in the spec. This isn't "must" behavior. There are cases where I think we should help users along.
CW: think we agree on the two sides. Just device destroy after rAF. Suggest - if you destroy the device after rAF, previous contents are shown. Consistent.
KN: I like it.
MW: thumbs up
Consensus: device destroy either during or after rAF causes previous canvas contents to be displayed.

[M0] [very late addition] Alpha to Coverage spec and conformance on Qualcomm hardware #4867

Please look offline

Compatibility Mode: Support OpenGLES devices with zero SSBOs and image uniforms in vertex, fragment stages #4721

SW: since we last talked - in agreement that we don't want to drop 46% of Android userbase. We'll give them Compat. Won't require this feature by default. But should we add the feature in? I prefer to add it back in as limits rather than an optional feature. Either way, have to add it to core.
SW: Teo's latest - we should increase the # of storage textures in core from 4 to 8. Think we should discuss that orthogonally.
KG: talked with Teo briefly. Clarification - he'd prefer if we had max storage buffers for vertex stage. If adding them for Compat, but have them immediately deprecated for Core purposes. Then continuing with Core without the limits. Using them only as much as we need them for Compat - but try to get people to not use them for Core.
KN: discussed not enforcing them for Core, In the struct, but they don't do anything.
CW: agreement to add limits in Core with that caveat. Allow the limit to be 0 on Compat. Require that the limit advertised by the browser is the same as the max per-stage number of textures (?).
SW: should be equal to the existing limit in Core always. Existing limit in Compat will apply only to Compute. A bit strange because the name doesn't reflect that. "per stage" instead of "per compute stage".
KG: think clarifying text will handle that.
CW: with introduction of the limits, and what they mean in Core, resolves this issue.
CW: Last topic - increasing limit from 4 to 8, pushing some devices down to Compat.
SW: if Adreno 500 is the one - that barely supports Vulkan. Want to gather internal data before final decision. Have homework to do.
SW: I'll put together a proposal for the first part.

Hardware feature levels (and compat) #4656

PR: https://github.com/gpuweb/gpuweb/pull/4868
JB: Moz looked at the new PR, seems fine.
KN: I changed the name from feature level to capability level this AM. Not attached to the name.
MW: Apple likes the PR as well.
CW: how to structure further discussions on this?
JB: would like to postpone further discussion until we have more features to discuss.
KN: have to figure it out for Compat. That's near-term.
JB: right. Compat will add one more valid value. We'll explain that for that, you may get upgraded. Can't assume you're getting Compat validation.
CW: great. We accept the PR…
KG: think Feature Level is better than Capability Level
KN: it's features + limits, these == capabilities.
… discussion …
CW: prefer feature level
BJ: sort of in favor of feature level too
CW: we reserve the feature level keyword. Figure out Compat. Everything else is future work.

[M1] Feature detection for HDR canvas #4828

Agenda for next meeting

Let's meet next week . Week after is TPAC.
M0
- [M0] Alpha to Coverage spec and conformance on Qualcomm hardware #4867
- [M0?] Guarantees about early-z fragment discard #4878
- [M0] Canvas contents after device destroy or device loss #4859
M1
- [M1] Feature detection for HDR canvas #4828
- [M1] Allow only strings in featureLevel #4874
- (maybe) (will just present a proposal, no prep needed) [M1] proposal to use featureLevel in the Compat proposal #TODO
- (maybe) [M1] [editorial] Consider if there's a much better name for "image copies" #4573

GPU Web 2024 09 11

GPU Web WG 2024-09-11 Atlantic-time

Tentative agenda

Attendance

Administrivia

CTS Update

[M0] Error on createTexture’s viewFormat with no compatible usage? #4852

[M0] textureGather issues #4857

[M0] Canvas contents after device destroy or device loss #4859

[M0] [very late addition] Alpha to Coverage spec and conformance on Qualcomm hardware #4867

Compatibility Mode: Support OpenGLES devices with zero SSBOs and image uniforms in vertex, fragment stages #4721

Hardware feature levels (and compat) #4656

[M1] Feature detection for HDR canvas #4828

Agenda for next meeting

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!