GPU Web 2025 02 19

Corentin Wallez edited this page Mar 5, 2025 · 1 revision

GPU Web WG 2025-02-19 Atlantic-time

Chair: CW

Scribe: KR, KN

Location: Google Meet

Tentative agenda

  • Administrivia
  • CTS Update
  • Index resolution is potentially underspecified #5064
  • maxDynamicFoo should be required to be at most maxFoo in limits #5069
  • Add GPUTextureView support to importExternalTexture() #5068
  • "shader-f16" requirements exclude all Qualcomm devices #5006
  • Make the last 2 arguments to copyBufferToBuffer optional #4807
  • (late request) Bikeshed "webgpu-core" feature name #5036
  • Triage milestone 1 issues
  • Agenda for next meeting

Attendance

  • Google
    • Corentin Wallez
    • Geoff Lang
    • Kai Ninomiya
    • Ken Russell
  • LunarG
    • Mark Young
  • Microsoft
    • Rafael Cintron
  • Mozilla
    • Jim Blandy
    • Kelsey Gilbert
  • Albin Bernhardsson
  • Mendy Berger
  • Mehmet Oguz Derin

Administrivia

  • There's a new time!

CTS Update

  • CW: Weird flakes when running many tests in sequence, texture builtin tests reading back all zeros. Not a chromium bug because it repros in dawn-node, might be a CTS issue, might be a Dawn issue.
  • KR: Jim, can you try running these tests against Firefox or wgpu and see if you reproduce the same issue?
  • CW: Will send you info on what to test on Matrix.

Index resolution is potentially underspecified #5064

  • KN: Think just a typo basically; forgot to specify this bounds condition.
  • JB: Agree, seems editorial.
  • KN: Labeled as copyediting.

maxDynamicFoo should be required to be at most maxFoo in limits #5069

  • CW: in Compat there are some dependent limits - inequalities between limits. E.g. maxStorageBuffersInFragmentStage <= maxStorageBuffersPerShaderStage (?). maxDynamicStorageBufferPerPipelineLayout - probably needs to auto-increase limit that's maxStorageBuffersPerStage.
  • JB: this part of the spec is assumptions programmers can make when using this data. If you have a limit on something and another limit that's more restrictive than that - unclear what to do if more-qualified one is less-qualified. So it seems more logical to promise it to people that increasing one increases the other.
  • KN: reason I didn't put this one in - they're not the same. One's per pipeline layout, one's per shader stage. Meaningful for one to be greater than the other in either direction. Don't think we can make a strict inequality between these, but can probably make a more complicated guarantee involving multiplications by numbers of shader stages.
  • CW: yes, sounds too complicated.
  • KN: we did this for bind groups though. Maybe should do it for this though? Not sure if it's possible.
  • CW: let's forget this then.

Add GPUTextureView support to importExternalTexture() #5068

  • CW: want shader code to be able to take in either a video frame or a texture. Meet really wants this. Discussed 6 mo ago how this is possible. If we follow the direction here, there are a few points of detail. 2 ways. First, create GPUExternalTexture from textureview. Similar to wrapping software-decoded RGBA video data. Second, allow, when creating a BindGroup, binding a GPUTextureView object to an external texture binding slot.
  • CW: slight preference for creating external texture from texture view. Nice advantage: lets apps test GPUExternalTexture code path without having to create video frame object. (Can also create VideoFrame object all the time on the web platform too…)
  • JB: what about Kai's comment on #4504 ?
  • KN: went back to notes; Kelsey mentioned there are external texture semantics we could reuse. [I don't know what those were, but that comment got me thinking…] about expiry, immutability, and color management. Expiry + immutability - different from external textures coming from videos. We could say these aren't immutable, or say they'll expire every time you change the underlying data. Prefer the first option. Second, color management - not color managed. Not a big deal, need to just choose a behavior. I prefer binding a texture view to the binding point, but none of these are a big deal.
  • KG: I'd prefer to avoid creating new APIs on the binding side. Discussed this a while ago so don't remember my thought process exactly. Looking fresh, having something where we allow binding to the bind point of TextureViews is easier than doing importExternalTexture, which I was preferring before.
  • MW: agree with KG, think allowing TextureView on bind point of ExternalTexture is simplest. Going through importExternalTexture path is "odd". But neither path is too complex from implementation or usage standpoint.
  • CW: weak consensus that setting it at the binding side is better. Good because it avoids some of the problems in that PR. Can ask François to change the PR. It's implementable.
  • KR: from the shader standpoint, there's code that's injected to handle YUV incoming data. What do we do if the incoming data is RGBA because it comes from GPUTextureView?
  • CW: can already have RGBA frames - already a problem you have to deal with. Our code already supports that.
  • MW: in Safari we multiply by a color matrix - for RGBA, just use the identity matrix.
  • CW: to preempt questions, what are the constraints? Single sub-resource? Color format that's sampleable, and filterable? Do we require it to be 4 channels?
  • KR: are there guarantees that if you sample an R texture, G=B=0?
  • KN: yes. only place we don't guarantee all color channels is sampling depth/stencil texture.
  • CW: good, we'll tell François to go forward with something like that. Thank you!
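
The color-matrix point MW raised can be sketched as follows. This is a minimal illustration, not any browser's actual shader code: the type names, `applyColorMatrix`, and both matrices are assumptions made for the example. The YUV matrix uses the common BT.601 full-range coefficients purely as a stand-in for whatever conversion an implementation injects.

```typescript
// Hypothetical types for the sketch: a 3x4 matrix is three rows of
// [x, y, z, bias] coefficients applied to a 3-channel sample.
type Vec3 = [number, number, number];
type Row = [number, number, number, number];
type Mat3x4 = [Row, Row, Row];

// Multiply one sample by the conversion matrix (the 4th column is a bias).
function applyColorMatrix(m: Mat3x4, [x, y, z]: Vec3): Vec3 {
  const row = (r: Row): number => r[0] * x + r[1] * y + r[2] * z + r[3];
  return [row(m[0]), row(m[1]), row(m[2])];
}

// RGBA source (e.g. a texture view): nothing to convert, so use identity.
const IDENTITY: Mat3x4 = [
  [1, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 1, 0],
];

// ASSUMPTION: BT.601 full-range YCbCr -> RGB, used only as an example of a
// non-identity matrix for video frames.
const YUV_TO_RGB_EXAMPLE: Mat3x4 = [
  [1, 0, 1.402, -0.701],
  [1, -0.344136, -0.714136, 0.529136],
  [1, 1.772, 0, -0.886],
];
```

With this shape the sampling code stays the same for both kinds of source, and only the matrix supplied at bind time differs.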

"shader-f16" requirements exclude all Qualcomm devices #5006

  • CW: on Vulkan, shader-f16 extension requires uniform / storage buffer 16-bit access feature. Not supported on Qualcomm. We require that for granularity of read/write accesses because not possible to emulate 2-byte access if you can only write 4 bytes. Adreno devices don't support the uniform 16-bit part. Can be emulated because they're read-only. Similar to rewrites for HLSL ByteAddressBuffer. But requires extra work.
  • CW: my suggestion - requires more compiler work to support these devices on Vulkan, but worth it.
  • JB: open to adding the polyfill to support these devices. In Firefox would probably drop support for the devices initially, then add the polyfill and get support back.
  • CW: that's the case for Chromium right now as well.
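
As a rough CPU-side illustration of why the read-only (uniform) case can be emulated, the sketch below loads a 16-bit value out of storage that only permits 32-bit loads. The function names are hypothetical and a real polyfill would live in the shader compiler, not in TypeScript. Writes are harder, as noted above: a 2-byte store would have to rewrite the whole 4-byte word and could clobber the neighbouring half.

```typescript
// Load the 16-bit value at `byteOffset` from a buffer exposed only as
// 32-bit words (little-endian layout assumed).
function loadU16(words: Uint32Array, byteOffset: number): number {
  if (byteOffset % 2 !== 0) throw new RangeError("16-bit loads must be 2-byte aligned");
  const index = byteOffset >>> 2; // index of the containing 32-bit word
  if (index >= words.length) throw new RangeError("offset out of bounds");
  const shift = (byteOffset & 2) * 8; // 0 for the low half, 16 for the high half
  return (words[index]! >>> shift) & 0xffff;
}

// Decode IEEE 754 binary16 bits into a JS number.
function f16BitsToNumber(bits: number): number {
  const sign = bits & 0x8000 ? -1 : 1;
  const exponent = (bits >> 10) & 0x1f;
  const fraction = bits & 0x3ff;
  if (exponent === 0) return sign * fraction * 2 ** -24; // zero and subnormals
  if (exponent === 0x1f) return fraction ? NaN : sign * Infinity;
  return sign * (1 + fraction / 1024) * 2 ** (exponent - 15);
}
```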

Make the last 2 arguments to copyBufferToBuffer optional #4807

  • KN: gman and I discussed and thought that min of source size and source offset (??)
  • JB: we can't get rid of sourceOffset, too bad, but that ship's sailed. Dest buffer follows it. Would have to do a weird overload.
  • CW: we could add a second overload.
  • KN: was going to suggest that.
  • JB: we'd like that, but it would be another PR.
  • CW: in webgpu C headers what would we do?
  • KN: wouldn't need to do it in the C header. Require you to specify 0 in C.
  • CW: then the only cost is testing.
  • KN: I don't care much about removing the offset. It's only 2 characters. We could only make the size optional, but OK with the overload. We'd have the current overload where size becomes optional (buffer, offset, buffer, offset, size?), and another with no offsets: (buffer, buffer, size?). Buffer, Buffer, Offset, Size is kind of weird; which does the offset apply to?
  • CW: does someone really care about doing this simplification? Do we want to spend time on figuring out multiple overloads, or just skip it?
  • KN: I'm happy to figure it out. The overload's not too complicated.
  • CW: folks interested, please iterate on proposals on the issue, we'll come back to it another week.
  • KN: I'll put up IDL for my proposal.
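
The two call shapes KN describes could be resolved roughly as below. This is only a sketch of the idea under discussion, not the IDL KN offered to write: `BufferLike`, `CopyArgs`, and `normalizeCopyArgs` are hypothetical names, and the default size (the rest of the source buffer after `sourceOffset`) is an assumption, since the notes leave the exact default unsettled.

```typescript
// Hypothetical stand-in for GPUBuffer; only the size matters here.
interface BufferLike { size: number; }

interface CopyArgs {
  source: BufferLike;
  sourceOffset: number;
  destination: BufferLike;
  destinationOffset: number;
  size: number;
}

// Shape 1: (buffer, offset, buffer, offset, size?)
function normalizeCopyArgs(source: BufferLike, sourceOffset: number,
  destination: BufferLike, destinationOffset: number, size?: number): CopyArgs;
// Shape 2: (buffer, buffer, size?)
function normalizeCopyArgs(source: BufferLike, destination: BufferLike,
  size?: number): CopyArgs;
function normalizeCopyArgs(source: BufferLike, a: number | BufferLike,
  b?: number | BufferLike, c?: number, d?: number): CopyArgs {
  let sourceOffset = 0;
  let destinationOffset = 0;
  let destination: BufferLike;
  let size: number | undefined;
  if (typeof a === "number") {
    // Shape 1: explicit offsets for both buffers.
    sourceOffset = a;
    destination = b as BufferLike;
    destinationOffset = c ?? 0;
    size = d;
  } else {
    // Shape 2: no offsets, both default to 0.
    destination = a;
    size = b as number | undefined;
  }
  // ASSUMPTION: an omitted size means "the rest of the source buffer".
  return { source, sourceOffset, destination, destinationOffset,
    size: size ?? source.size - sourceOffset };
}
```

Dispatching on whether the second argument is a number or a buffer keeps the two shapes unambiguous, which is the concern behind KN's remark that (buffer, buffer, offset, size) would be confusing.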

(late request) Bikeshed "webgpu-core" feature name #5036

  • JB: want to make progress, doesn't commit us to anything. Mike and I both approved this PR at this point. Want to discuss the actual names.
  • KN: if people want to land this with the TODO we can do it. We want to start implementing it.
  • JB: Moz's first choice is "core-features-and-limits" - precise, but verbose.
  • KR: do you really want to write that in every WebGPU app?
  • JB: mostly for apps, not really demos.
  • KN: you really only write this if you care about supporting both Compat and Core.
  • CW: what's nice is it's forward-thinking - we will want to say "features and limits" in the future.
  • CW: OK, let's go with this.
  • KN: SGTM.

Triage milestone 1 issues

  • CW: goal is to remove issues from Milestone 1, and for folks to propose features be added to it. (later is also OK)
  • KG: On WGSL, we thought we were basically at the end of Milestone 1.
  • CW: so should most of these be Milestone 2?
  • KN: can move everything up a milestone in the API level.
  • CW: OK, let's try that.
  • List of issues to discuss with TC39, WASM, JS engine teams #747
    • Ongoing
  • Support for arrays of textures #822
    • CW: landed a proposal for this, assume we still want it.
    • CW: I would like to keep this in Milestone 1. Actively working on it.
  • Multisample Coverage on Metal #959
    • CW: When you set sample mask on pipeline as well as alpha-to-coverage, it's an error? Propagated that error to WebGPU, can't use both at the same time. Do we care about this?
    • KN: it will naturally become obsolete because it only applied to AMD and Intel Macs.
    • CW: will close.
  • DrawIndirectCount #1354
    • CW: multi-draw-indirect - probably not doing this in the next quarter.
  • mapSync on Workers - and possibly on the main thread #2217
    • KR: think we won't do this next quarter. Want to do it carefully and come back to this group with compelling data from a partner product team.
  • Cannot upload/download to UMA storage buffer without an unnecessary copy and unnecessary memory use #2388
    • CW: Jiawei at Intel is working on this. Might come back with proposal this quarter.
    • KG: regardless will need a lot of impl work. Think this is Milestone 2.
    • CW: OK.
  • Expose optional texture capabilities in WebGPU #2630
    • CW: Texture format tiers etc.
    • CW: maybe Teo can make a proposal for this soon? It's cheap to implement, just updating tables.
    • KN: Teo already has formats-tier-1 proposed. Answered the question of how to expose this using features. Will close this one in favor of the other.
  • Generalize pipeline fragment output validation to no require a fragment output if the pipeline's color target is "unwritable" #2770
    • KN: Milestone 2.
  • Collaborate with the immersive web groups for WebXR interop #2778
    • KR: Brandon Jones has been working on this; think the API is well settled. Chromium's impl almost done.
    • CW: closed.
  • 16-bit normalized texture support #3001
    • CW: it's also part of the other issue. Closing this.
  • Bump macOS version to 10.14 #3685
    • CW: Chrome requires macOS 11 now, not sure about FF and WebKit. More stuff we could add now.
    • KN: FF requires 10.15
    • MW: WebKit requires macOS 13
    • CW: will edit the issue. Let's see what we can add.
  • maxBufferSize questions #3700
    • MW: not sure we can go much higher than 256 MB on mobile devices. Specific customer asking for this?
    • CW: had many people in the past complain to us on Chromium because max storage buffer size bucketing prevented them increasing it the way they wanted for ML workloads.
    • MW: concerns that developers will try for a 1 GB ArrayBuffer and app won't work on iPhone.
    • CW: sounds like we don't want to do this.
    • KG: don't know - want to make things better. There are ways to stay under 1.5 GB and still have a 1 GB buffer on the GPU.
    • CW: since not obviously "close", let's keep this in M1.
  • Proposal: formats-tier-1 extension #3837
    • CW: simple, small, let's keep.
  • Proposal: read-write storage textures #3838
    • CW: We have these, but this adds more tiers.
    • Let's skip the rest of the things with the texture-capabilities label and get to them all at once sometime.
  • Problem measuring time deltas with writeTimestamp #3952
    • CW: Seems like we had agreement in the meeting to close the issue.
    • Closed.
  • Require render bundles to be no slower than non-render bundles for the same or similar set of commands #4104
    • KG: Feels like a request for good implementations.
    • AB: Don't think there's any way we could possibly guarantee that.
    • CW: We can always put a non-normative note.
    • AB: Implementations may want to use secondary command buffers, and on some platforms those will definitely not be faster.
    • JB: wgpu does not use secondary command buffers, we just record and replay.
    • CW: Interesting! We could potentially lift some restrictions on render bundles.
  • Copying depth32float to depth32float #4125
    • KG: Fine for M1.
  • what happens if vertex w position is zero, or negative? #4129
    • KG: M1 to me
    • CW: Close?
    • KN: Think we should add a way to do this.
  • CW: let's triage the rest later.

Agenda for next meeting

  • Compat
  • (stretch) Triage the rest of the milestone 1 issues
  • WGSL meeting again on March 11. KG: recommend meeting on March 12, but we could do more triage in the meantime.
  • CW: would like to complete Compat discussions as much as possible in this group.
  • KG: meet next week?
  • CW: yes. Compat, more triage.
  • KN: Stephen's out next week too.
  • CW: meet in 2 weeks then. March 5.
  • Agenda requests!
    • Scheduling a F2F?