GPU Web 2025 03 25

GPU Web WG 2025-03-25/26 Pacific-time

Chair: KG

Scribe:

Location: Google Meet

Tentative agenda

Administrivia
CTS Update
Compat: raise maxStorage(Buffers/Texture)InFragmentStage from 0 to 4 #5125
Add optional feature "buffer-map-extended-usages" #5108
[Immediate Data] Shall we support per-stage immediate data? #5116
[Immediate Data] Add internal slot immediate data in GPUBindingCommandsMixin #5117
[Immediate Data] Support SetImmediateData() in RenderBundle #5118
[Immediate Data] Rename <immediate_data> to <immediate> #5119
Consider allowing overridden pipeline constants to refer to a non-existent identifier string #5112
Resurrect clamp-to-border #1305
Add device.simulateLoss(), and prevent mappedAtCreation on destroyed devices #5115
Agenda for next meeting

Attendance

Google
- Gregg Tavares
- Kai Ninomiya
- Ken Russell
Intel
- Hao Li
- Jiawei Shao
- Jie Chen
- Zhaoming Jiang
Mozilla
- Jim Blandy
- Kelsey Gilbert
Albin Bernhardsson
Xiaoshen X

Administrivia

Khronos hosted WebGPU+WebGL+glTF Meetup at GDC; ~100 attendees, 3 hours 15 minutes (!) of presentations

CTS Update

KN: external textures, fallback adapters, precision issues

Compat: raise maxStorage(Buffers/Texture)InFragmentStage from 0 to 4 #5125

Approved!

Add optional feature "buffer-map-extended-usages" #5108

JB: TOCTOU concerns
JB: GPU based validation is all within a command buffer, won't be interrupted by anything else.
KN: we consider the possibility of a compromised JS process. If the process still has the mapping, possibility that changes could be made.
JB: so from the standpoint of Web IDL, we still have a strict boundary of when JS has access and when the GPU does.
Concerns about the implications of this.
KG: 1) do we need to figure this out now, for this issue?
KN: only affects usages for this direct mapping. Maybe don't want to do indirect and index buffers.
KG: seems reasonable constraint, could loosen it later
KN: think we should go with this restriction, for both of them
KG: would be nice to have a high-level comment about what this is supposed to do. Want to make sure this doesn't accidentally enable things at the high level we didn't intend to. What do we want to do with this for now?
JS: received many comments after posting this - thanks for them. Mainly 4 issues left before we can complete this.
- First, almost everyone thinks this should be an extension, not a core feature. I also prefer WebGPU to do fewer implicit things, so this should be an extension.
- Second, security concern. Should disallow index and indirect usages here.
- Third, behavior of map-write only. Map-read is clear: while immutable ArrayBuffer is still in Stage 2.7 at TC39, can't do the triple-mapping technique. Always need CPU cache for the snapshot. map-write-only doesn't need to read the data back from the GPU side. With triply mapped buffers come questions. map-write-only, still can read the data. Reading from non-HOST_CACHED memory is very slow. Writing data randomly is very slow. Non-triply implementation, have to show its behavior. Always keep the data consistent between CPU and GPU?
- Fourth question, based on the third - should we split this into 2 extensions? Only allow map-read (... triple-mapping on UMA only...) and map-write, can be implemented on UMA and discrete GPUs too.
KG: sounds like still work left to do. Let's let this wait until it's complete.
AB: one concern about allowing triply mapped buffers on map-read-and-write, but not map-read (or map-write alone? didn't catch this) would be confusing for developers, who would want to only put the usages they want. Read+Write, if it makes them faster, then read the data even though I don't need it.
KN: agree this is a concern. Should avoid surprising behavior like this.
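
Illustrative sketch (not part of the discussion): one way the restriction mentioned above, disallowing index and indirect usages on directly mapped buffers, could be expressed as a usage check. The flag values mirror the GPUBufferUsage constants in the WebGPU spec; the helper name `validateExtendedMapUsage` and the exact rule are assumptions, not proposed spec text.

```typescript
// Flag values copied from WebGPU's GPUBufferUsage so the sketch runs without a GPU.
const BufferUsage = {
  MAP_READ: 0x0001,
  MAP_WRITE: 0x0002,
  COPY_SRC: 0x0004,
  COPY_DST: 0x0008,
  INDEX: 0x0010,
  VERTEX: 0x0020,
  UNIFORM: 0x0040,
  STORAGE: 0x0080,
  INDIRECT: 0x0100,
} as const;

// Hypothetical helper: true if `usage` would be allowed under the assumed rule
// "a mappable buffer may not also be an index or indirect buffer".
function validateExtendedMapUsage(usage: number): boolean {
  const mappable =
    (usage & (BufferUsage.MAP_READ | BufferUsage.MAP_WRITE)) !== 0;
  if (!mappable) {
    return true; // not mappable, so the restriction does not apply
  }
  const forbidden = BufferUsage.INDEX | BufferUsage.INDIRECT;
  return (usage & forbidden) === 0;
}
```

Under this assumed rule, MAP_WRITE | VERTEX passes while MAP_WRITE | INDEX and MAP_READ | INDIRECT are rejected; loosening it later would only mean shrinking the forbidden mask.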

[Immediate Data] Shall we support per-stage immediate data? #5116

SY: this is homework. We have suspicions that immediate data limitations are per-stage. Need to see across all APIs whether the behavior's portable (?).
SY: there are unified ranges on most APIs, but not Metal. Question is what other folks' opinions are.
Discussion of the problem
KG: what do we need to discuss in this meeting?
KN: this is the result of the homework. Understanding the namespacing in the backend APIs.
KG: knowing this, let's discuss.
KN: the functions we provide for setting immediate data, and where it goes, depends on the underlying API.
JB: you do a write; do both vertex and fragment see it? Have to decide which one. Or, have a set() that everyone sees.
KG: since it seems we want the unified behavior, do you need anything else on this issue?
KN: it's possible to do it the other way. Have to decide which way to go.
JB: any graphics devs on the call who have a preference?
KG: want someone to drive this; otherwise it's difficult to motivate.
GT: what's the tradeoff? Platforms with unified behavior, we'd have to manually split. Very ambiguous at this point.
KN: good point, need a better picture of what the impl would look like. Not clear whether one's technically better than the other in terms of impl efficiency. Don't want to push data to a stage that doesn't need it. Possible in both designs.
SY: seems we need to do more homework on the impl side to understand.
KN: maybe don't need to implement, but have to understand how both designs would work.
SY: OK. Will prototype more to help inform decisions.
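
Illustrative sketch (not part of the discussion): the two designs differ in which stages see a write. The model below covers only that visibility question; all class and method names are hypothetical and do not correspond to any backend API or to proposed WebGPU API.

```typescript
type Stage = "vertex" | "fragment";

// Design A (unified): one range; a single write is visible to every stage.
class UnifiedImmediates {
  private data: Uint8Array;
  constructor(size: number) {
    this.data = new Uint8Array(size);
  }
  set(offset: number, bytes: Uint8Array): void {
    this.data.set(bytes, offset);
  }
  read(_stage: Stage): Uint8Array {
    return this.data; // same bytes regardless of stage
  }
}

// Design B (per-stage): each stage has its own range; a write targets one stage.
class PerStageImmediates {
  private data: Record<Stage, Uint8Array>;
  constructor(size: number) {
    this.data = {
      vertex: new Uint8Array(size),
      fragment: new Uint8Array(size),
    };
  }
  set(stage: Stage, offset: number, bytes: Uint8Array): void {
    this.data[stage].set(bytes, offset);
  }
  read(stage: Stage): Uint8Array {
    return this.data[stage];
  }
}
```

In the unified model a write at offset 0 is observed by both stages; in the per-stage model a write to "vertex" leaves "fragment" untouched, so data needed by both stages has to be written twice.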

[Immediate Data] Add internal slot immediate data in GPUBindingCommandsMixin #5117

KN: let's ignore this, just a spec detail
Discussion
KN: think the purpose of immediate data is that it comes along with the command buffer. Do think it should go on BindingCommandsMixin. Only constraint is to make sure it's implementable, which the next topic covers.
KG: sounds like impl detail to me. Does spec need internal slot for this?
KN: later, when we invoke the shader, this is how we say what the data would be.
KG: to me, this sounds like what we might do if you, for example, record which draw commands are done.
KN: we just have to agree that immediate data is on the command buffer. Not some other small buffer which you reference.
KG: So no need for internal slot, just say we make a copy when you give the data to us.
KN: agree.

[Immediate Data] Support SetImmediateData() in RenderBundle #5118

SY: want to discuss this. In my mind, two options. First, when we record SetImmediateData, we record the JS array content too. Replaying, we replace the commands with the content.
SY: option 2 - BindGroups, can replace buffers - maybe SetImmediateData can have flexibility. Encode commands, and some pointer to the JS array. Then fetch the JS array content again. There are some comments about that. Or, option 3, we don't support immediate data in RenderBundle. Adds states, maybe some problem like renderer/GPU process interaction during replay.
KN: I think Option 2 is not really possible. Would need, at time we're going to execute on the GPU, would need to fetch the data. RenderBundles are constrained by the impl. It's not just record/replay; it maps to specific backend concepts. If it were just record/replay, could pull data from a source when we call executeBundles, without any back and forth between processes. RenderBundles are designed to be secondary command buffers, so don't think this option is possible. There's appeal to get different data into RenderBundles. Would have to be inheritance from the surrounding RenderPassEncoders. Don't have that now, maybe could let bundles inherit a certain kind of state.
KN: I would go for Option 1 as long as it's implementable. On D3D, can certainly do this - it's part of the root descriptor. Just to see if in Vulkan you can put push constants on a secondary command buffer, not sure if you can do this.
SY: so Option 2 is not viable. Option 1 seems to be the only choice. Have to investigate more on bundle inheritance.
- KN: Think we have a bug filed about this. EDIT: I was wrong. Here's the issue where we investigated what would be inherited: https://github.com/gpuweb/gpuweb/issues/382
KN: I think there's an issue on RenderBundle inheritance. One issue on backend restrictions - I don't think any impl right now uses the D3D or Vulkan impl.
JB: all 3 implementations just record everything into memory.
KN: tricky then. Has been a proposal to drop requirement and just make RenderBundles record+replay. That's a separate discussion.
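
Illustrative sketch (not part of the discussion): Option 1 amounts to copying the bytes at record time, so that later changes to the caller's array do not affect replay. The `BundleRecorder` class and its command shape are hypothetical; only the copy-on-record behavior is the point.

```typescript
type RecordedCommand = {
  kind: "setImmediateData";
  offset: number;
  data: Uint8Array;
};

// Hypothetical recorder modelling Option 1: the content is captured when recorded.
class BundleRecorder {
  private commands: RecordedCommand[] = [];

  setImmediateData(offset: number, data: Uint8Array): void {
    // slice() makes an independent copy of the bytes at record time.
    this.commands.push({ kind: "setImmediateData", offset, data: data.slice() });
  }

  finish(): readonly RecordedCommand[] {
    return this.commands;
  }
}

// Usage: mutate the source after recording; the recorded copy is unaffected.
const source = new Uint8Array([1, 2, 3, 4]);
const recorder = new BundleRecorder();
recorder.setImmediateData(0, source);
source[0] = 99;
const recorded = recorder.finish();
```

Option 2 would instead keep a reference to `source` and read it again at replay, which is the part KN describes as not fitting how RenderBundles map to backend concepts.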

[Immediate Data] Rename <immediate_data> to <immediate> #5119

Approved!

Consider allowing overridden pipeline constants to refer to a non-existent identifier string #5112

KN: motivation: if you have a tool, like Slang, that produces WGSL with constants in it, it might not emit a constant if it's unused. Depending on compiler options, might not be able to set pipeline constants. Should we let you set constants not defined in the ShaderModule? think it'd be fine. Easy to make typos and have your data disappear. Think there needs to be a warning, and Slang would have trouble.
JB: if optimizations affect JS behavior, seems they're not good optimizations.
KN: agree, think Slang should complete all the constants.
KG: uniforms in WebGL, you can set null UniformLocations. Are there a class of things here? Compiler can elide things.
KN: at our (WGSL) level, we don't allow eliding this, and I think it's a good decision. I like the strictness. Can rename variables, it'll produce errors if you don't rename them correctly, even if it's just a warning.
KG: warning's fine.
GT: feel the opposite - if you're using TypeScript you get errors, but in JS you don't. Annoying. Sometimes just want to comment out a line. You can opt into warnings by adding tooling, but think this shouldn't be at the lower level.
KG: that's a vote to drop missing things on the floor?
GT: yes.
KR: WebGL had to allow it because there were a lot of existing OpenGL applications that relied on that behavior.
KG: Drivers can decide something isn't used and avoid giving a location for it. Applications need to be robust against that.
KN: don't think GL had to expose it in that way. We just say, if you gave us a value for an override that's not used, we won't upload it. We don't have to tell the user, don't upload this, because we optimized it out. Can do that ourselves.
KG: that's what I was saying. Sounds like we have agreement that warnings to prevent typos is probably to be encouraged, but that we shouldn't be too strict here, and allow overridden pipeline constants to refer to nonexistent identifier strings.
KN: personally I don't think there's much reason to make a change. Concrete reason is that Slang optimizes things out. If we think it shouldn't do that, no need to do this.
KR: any risk one WGSL implementation is cleverer than another and optimizes out an override?
KN: No. Only talking about override names that are not declared at all in the WGSL shader text.
KN: basically saying, Gregg's reason to do this is valid, but not a hard requirement w.r.t. the way Slang optimizes things.
GT: I don't feel strongly, if there's an option to Slang to spit everything out, doesn't bother me.
KG: weak vote to drop things on the floor if they don't exist. Kinder to the developer.
KG: we'll say that we don't feel strongly about this and move it out of Milestone 1. Can consider for Milestone 2.
KN: hopefully this'll go to Slang and they won't say it's impossible.
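
Illustrative sketch (not part of the discussion): the lenient behavior with a warning could look roughly like this on the application or tooling side. Both function names are hypothetical, and the regex scan is a simplification rather than a real WGSL parser (it ignores comments and `@id` attributes).

```typescript
// Collect identifiers declared with `override` in WGSL source text.
// Simplification: regex scan only; not a real WGSL parser.
function declaredOverrides(wgsl: string): Set<string> {
  const names = new Set<string>();
  const re = /\boverride\s+([A-Za-z_][A-Za-z0-9_]*)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(wgsl)) !== null) {
    names.add(m[1]);
  }
  return names;
}

// Hypothetical helper: keep constants the shader declares, and warn about
// the rest instead of failing pipeline creation.
function filterConstants(
  wgsl: string,
  constants: Record<string, number>
): { kept: Record<string, number>; warnings: string[] } {
  const declared = declaredOverrides(wgsl);
  const kept: Record<string, number> = {};
  const warnings: string[] = [];
  for (const name of Object.keys(constants)) {
    if (declared.has(name)) {
      kept[name] = constants[name];
    } else {
      warnings.push(`override "${name}" is not declared in the shader; ignoring`);
    }
  }
  return { kept, warnings };
}
```

With a shader declaring only `override scale: f32 = 1.0;`, passing `{ scale: 2, scael: 3 }` keeps `scale` and produces one warning for the misspelled `scael`, which is the typo case the warning is meant to catch.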

Resurrect clamp-to-border #1305

KN: on Apple, there's only one clamp-to-border, which clamps to 0. fixed-function version of clamp-to-border with a fixed value. Not sure why we didn't discuss this before. This behavior should be implementable on the other backends because they're more flexible. Should add it. And, is clamp-to-border with additional color options (later Metal versions offer 3 versions) desirable? I don't think so, haven't heard anybody ask for it. But things that are exposed on all backends we should expose.
KG: sounds like what people want - to expose what's there.
Approved. Clamp-to-zero.
KN: a little question of how it's designed. In Metal it's an address mode. Do we want to call it that? Or say there's a clamp-to-border option, but the only color you can clamp to is transparent black or opaque black, depending on the format?
AB: even if you want an optional feature for the additional things, would want to separate it. Make the feature available everywhere, and the things only supported on some hardware, add an optional feature for it.
KN: agree. Design question is essentially, should we design for that possible future? I think we should, it exists on backend APIs.
KG: will think about it when someone makes a proposal.
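
Illustrative sketch (not part of the discussion): the difference between clamp-to-edge and a clamp-to-zero address mode, modelled on a 1D single-channel texture with nearest filtering. The mode string "clamp-to-zero" and the function name are hypothetical, and the single-channel model sidesteps the transparent-black versus opaque-black question raised above.

```typescript
type AddressMode = "clamp-to-edge" | "clamp-to-zero";

// Nearest-neighbour lookup in a 1D texture with normalized coordinate u.
function sampleNearest1D(texels: number[], u: number, mode: AddressMode): number {
  const n = texels.length;
  const index = Math.floor(u * n);
  if (index >= 0 && index < n) {
    return texels[index]; // in range: both modes agree
  }
  if (mode === "clamp-to-zero") {
    return 0; // out of range: fixed border value of zero
  }
  // clamp-to-edge: out of range reuses the nearest edge texel
  return texels[Math.min(Math.max(index, 0), n - 1)];
}
```

For texels `[10, 20, 30, 40]`, u = 0.5 returns 30 in both modes; u = -0.5 returns 0 with clamp-to-zero and 10 with clamp-to-edge; u = 1.5 returns 0 and 40 respectively.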

Add device.simulateLoss(), and prevent mappedAtCreation on destroyed devices #5115

KN: addresses last week's discussion.
GT: think simulateLoss might be good, but not sure I know enough about how loss works to know how to use it. In WebGL loss can happen at any command. Adding mappedAtCreation failing will throw everything off.
KN: no. Only prevents it on devices that've been destroyed. simulateLoss does the same thing as a natural device loss. Wouldn't stop you from creating buffers mappedAtCreation. Both stop you from mapping stuff asynchronously. Bit unfortunate. Talked about faking the mapping if the device is lost; didn't do that because it'd be more work for impls to fake mappings.
KN: don't think it's very hard to do so might want to consider doing it at some point.
KG: my concerns are half-similar to Mike's on the PR. Would rather have simulateLoss and not destroy() - that's what we have in WebGL. Think destroy is less important than device loss.
KN: only difference is in buffer mapping, which doesn't exist in WebGL.
KG: if you wanted that to not happen, try harder in the impl, I'd say. More important for impls to figure out whether device loss will cause problems with the app, than to make it slightly simpler for them to make a mapped buffer when the other side's destroyed the context. Bunch of ways you can monkey-patch and implement destroy yourself. Have to give you the most things you can't do yourself.
KN: would you then propose we change the current behavior of destroy() so it doesn't unmap buffers?
KG: I think that'd be great.
KR: need to talk with partners, make sure that they aren't surprised that we aren't cleaning up their memory. Google Meet has already raised this issue at the Wasm level.
KN: I'll write something up.
