
GPU Web 2024 07 10


GPU Web WG 2024-07-10 Atlantic-time

Chair: CW

Scribe: MM/KR

Location: Google Meet

Tentative agenda

  • Administrivia
  • CTS Update
  • (timebox) Consider moving sliced 3D compressed BC texture to an extension. #4705
  • (timebox) ASTC Compressed for Sliced 3D as Extension #4701
  • (timebox) Push constant proposal #4612
  • Compatibility Mode: Support OpenGLES devices with zero SSBOs and image uniforms in vertex, fragment stages #4721
  • Clarify that create*PipelineAsync must not generate a validation error #4714
  • mapAsync early-reject doesn't generate a validation error #4690
  • Consider if there's a much better name for "image copies" #4573
  • Depth Bias does not affect line and point primitives. #4729
  • (stretch) Hardware feature levels (and compat) #4656
  • Agenda for next meeting

Attendance

WIP: this is the list of all the people invited to the meeting. The people who were seen in the meeting are in bold:

  • Apple
    • Mike Wyrzykowski
  • Google
    • Brandon Jones
    • Corentin Wallez
    • Gregg Tavares
    • Kai Ninomiya
    • Ken Russell
    • Stephen White
  • Mozilla
    • Jim Blandy
    • Kelsey Gilbert
    • Teodor Tanasoaia
  • Mehmet Oguz Derin
  • Myles Maxfield
  • Xiaoshen X

Administrivia

  • KN: FYI Google is formulating a proposal about shader-creation vs pipeline-creation errors (for compat, etc.), will want to discuss soon with both API and WGSL groups #4728
  • KN: There are some things that haven't worked as well as we expected when we discussed this last. We are formulating a proposal to loosen some of this. Look forward to it.
  • KN: This will be an API/WGSL crossing discussion. I’m not sure how to schedule it. WGSL group maybe? We will look at all the cases and put together a proposal to have a consistent idea of how everything should work.

CTS Update

  • KN: The big thing is GT has been working on the changes to the CTS needed for the “either” change. Part of that is making it so a bunch of tests work in compat by specifying that they allow either. Also there are a couple tests that test the new thing.
  • KN: Continued work on texture load/store/sampling.
  • KN: Also WGSL testing going on as well.

Consider moving sliced 3D compressed BC texture to an extension. #4705

  • CW: Seems … I’m not 100% sure but it seems KN has another approach that satisfies most people?
  • KN: Based on all the discussions we had, I wrote down some options for how we could design this.
  • KN: The ASTC, BC, and ETC2 extensions would provide the same functionality, and there would be separate sliced-3d extensions for ASTC and BC. We would do this so the different formats mirror each other exactly in how they're sliced, even though BC always supports sliced 3D.
  • KN: If you have BC, you’ll always have bc-sliced-3d
  • CW: Is it a guarantee the extensions come together?
  • KN: Yes.
  • JB: Mozilla is OK with it
  • OD: I believe the asymmetry is inherent to the texture formats, whether it be different block sizes or different color profiles. But if the consensus is that exposing this as a similarly-named extension to BC is a better developer experience, I think it is a good way to move forward.
  • CW: Thank you.
  • CW: Fixing it goes in M0?
  • Yes.
  • OD: I volunteer to PR the change.

ASTC Compressed for Sliced 3D as Extension #4701

  • CW: This one should have easy consensus. We should just do it.
  • KG, MW, others: 👍
  • OD: I volunteer to PR it.
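
A minimal sketch of the feature-request pattern implied by the two items above, assuming the sliced-3D capabilities ship as separate features that are always exposed alongside the corresponding base compression features. The feature strings "texture-compression-bc-sliced-3d" and "texture-compression-astc-sliced-3d" are assumptions based on the "bc-sliced-3d" naming used in the discussion, not confirmed names.

```ts
// Sketch: request the sliced-3D features together with their base features.
// The sliced-3D feature names are assumptions (see note above).
async function requestDeviceWithSliced3D(): Promise<GPUDevice> {
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) throw new Error("WebGPU not available");

  const requiredFeatures: GPUFeatureName[] = [];
  if (adapter.features.has("texture-compression-bc")) {
    // Per the discussion, the sliced-3D feature is guaranteed whenever BC is.
    requiredFeatures.push("texture-compression-bc", "texture-compression-bc-sliced-3d");
  }
  if (adapter.features.has("texture-compression-astc")) {
    requiredFeatures.push("texture-compression-astc");
    if (adapter.features.has("texture-compression-astc-sliced-3d")) {
      requiredFeatures.push("texture-compression-astc-sliced-3d");
    }
  }
  return adapter.requestDevice({ requiredFeatures });
}
```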

Push constant proposal #4612

  • CW: We have a general agreement on the general shape: set immediate data. Is there a variant that… How is it spelled? setImmediateData(float)? Offset with array buffer?
  • CW: Everyone is interested in the array buffer variant, but the single-scalar variants are not needed and can be polyfilled.
  • CW: Given this, I’d like to suggest we just do typed array for now. If we get feedback, we can add the scalar versions.
  • MW, KN, KG: thumbs up / nods
  • JB: In WGPU we have something similar, but it lets you supply different values for vertex and fragment stages. Either we haven’t implemented this correctly on every platform, or we could consider this for the spec. If it’s possible and we don’t offer it, then we’re halving the number of push constants that are available, because people will have to provide a combined set of push constants for both stages.
  • CW: I might be misremembering, but in Vulkan, it’s a single space. You can segment it in ranges, but when you set constant #4, it sets it at that index for all stages
  • TT: I remember seeing this functionality but it wasn’t in the proposal, so I wanted to compare them and have more feedback next week.
  • CW: That’s fine. This feature won’t be done today. There’s no rush.
  • MM: What are the pipeline layout implications?
  • CW: In the pipeline layout, there's a new "immediateDataRangeByteSize" and you say how many immediates the pipeline will use. And maybe there's a default layout. It also says that when you bind a pipeline that doesn't have the same size as a previous pipeline, things are not invalidated. The same way bind groups are not invalidated.
  • MM: What if I request to create a pipeline with a lot of immediates?
  • CW+KN: It is a small limit that defaults to 64 bytes in the proposal, but devices could expose more.
  • MM: So the device could expose more if the application asked for it?
  • CW: Yes.
  • BJ: If we’re limiting the API shape to only accept data arrays, let’s make sure we have data offset and size as optional parameters. But that’s already there. So, yeah.
  • KN: Since this is a proposal document and not a change to the spec, we should get it landed. We should take out the u32 i32 f32 versions, and maybe make some notes about open questions, but then we can get this landed?
  • CW: Sounds good.
  • CW: We can let editors add it now.
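
Below is a minimal sketch of how the proposal discussed above might be used, assuming the typed-array-only variant the group leaned toward. `immediateDataRangeByteSize` and `setImmediateData` are names from the proposal (#4612), not part of the current WebGPU API, and the exact signatures are assumptions; the `as any` casts reflect that current typings do not include them.

```ts
// Sketch only: the push-constant ("immediate data") proposal, typed-array variant.
declare const device: GPUDevice;
declare const passEncoder: GPURenderPassEncoder;

// The pipeline layout declares how many bytes of immediate data its pipelines use
// (the proposal's default limit is 64 bytes; devices could expose more).
const layout = device.createPipelineLayout({
  bindGroupLayouts: [],
  immediateDataRangeByteSize: 16, // proposal name, not current API
} as any);

// Set immediates from a typed array; optional data offset/size parameters are
// assumed to follow the existing writeBuffer-style pattern.
const immediates = new Float32Array([1.0, 0.5, 0.25, 1.0]);
(passEncoder as any).setImmediateData(/* offsetBytes */ 0, immediates);
```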

Compatibility Mode: Support OpenGLES devices with zero SSBOs and image uniforms in vertex, fragment stages #4721

  • SW: This was the unfortunate discovery that there are devices that support 0 SSBOs and storage textures in vertex shaders. This was an oversight because gpuinfo.org doesn’t report nonzero values.

  • SW: There are a bunch of things we could do about this. My preference would be to introduce new limits. We have to figure out what to do about the new limits.

  • SW: Because the spec says if you request a limit, it has to exist, we’d have to add these to core.

  • MM: Why not say that these devices don't get WebGPU?

  • SW: That is an option, but it's 40% of the reports on gpuinfo.org. So it's sizable.

  • KG: That seems like a lot

  • SW: TT did the legwork of grabbing the data, which is just gpuinfo. We can’t really say for sure but there are a lot of devices.

  • TT: Sebastian’s comment on the issue indicates it’s useful to have readonly buffers in the vertex stage. <missed>

  • TT: What I found when looking at the reports is that a lot of the devices that don't currently support core could support compat with Vulkan and lower limits. If it would be possible to experiment with this and see what the telemetry says - maybe we don't need to introduce new limits.

  • CW: One thing is that Vulkan drivers on Android worked terribly in their first few years. If these devices could be supported with Vulkan in compat but they have very old drivers, I'm a bit worried they might not work. It's not just what the Vulkan driver advertises, it's also whether it actually works or not.

  • CW: Can we ask the Android team if we can have more data on actual number of devices for this?

  • SW: GT was trying to get that data, but it’s not public. We should see if we can get ballpark numbers at least.

  • SW: The unfortunate thing is that the reason these devices work with Vulkan is that Vulkan exposes read-only access, which GL doesn't. Even if your device could support read-only, there'd be no way for you to get at it.

  • JB: Why can these devices support readonly storage buffers but not writing to them?

  • SW: It’s something about running the shader multiple times. Mali isn’t able to do it.

  • JB: Vertex shaders are strange and you wouldn’t be able to get your writes to have a consistent behavior.

  • SW: Sounds like people aren’t willing to make a decision based on the data we have now. Should we gather more data?

  • CW: My recollection is that from the ARM performance guidelines, they split the vertex shader into two parts: the part that computes the position, and then per tile, computing the rest of it.

  • JB: The execution isn’t consistent with itself…

  • CW: We probably need more data: What coverage we’re using and what we could be getting back with a compat mode on Vulkan.

  • CW: It’d be nice if we could paper over this and say “yes, if you have WebGPU, you have read-only storage buffers in vertex shaders”

  • CW: How many devices are we ready to lose support for? We can come back with more data, but what should the cutting point be?

  • TT: I intended to bring this up as well. Looking at the reports, we excluded some devices which actually do support OpenGL ES 3.1. I'm unsure what coverage we get from compat. I didn't look at Adreno, though (or some other devices we're interested in). We should add the hardware we want to target with compat to the meta-issue we have for tracking this. We have one.

  • SW: You mean WebGPU Compatibility Mode #4266?

  • TT: No. What OSes and hardware are we targeting? #1069

  • TT: I don’t have a good answer for your question currently. We need more clarity for what we’re targeting.

  • JB: It’d be nice that, when we make decisions like this, that decision would be memorialized by an edit to this list.

  • KG: The thing we're sorely missing is actual telemetry for how many of these devices are out there. Implementations, please prioritize collecting that telemetry.

  • CW: Summary: we want to see the data.

  • CW: Let’s get back to this when we have more data.
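
For context, an application targeting the "new limits" option might feature-detect them roughly as sketched below. The limit names used here are hypothetical placeholders (no names were agreed on in this discussion), and the fallback logic is only illustrative.

```ts
// Sketch only: feature-detect hypothetical per-stage storage limits in compat mode.
declare const adapter: GPUAdapter;

// Hypothetical limit names; not agreed upon in this meeting.
const limits = adapter.limits as any;
const vertexStorageBuffers: number = limits.maxStorageBuffersInVertexStage ?? 0;
const vertexStorageTextures: number = limits.maxStorageTexturesInVertexStage ?? 0;

if (vertexStorageBuffers === 0 || vertexStorageTextures === 0) {
  // Fall back to a path that keeps per-vertex data in uniform or vertex buffers
  // instead of (read-only) storage bindings in the vertex stage.
}
```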

Clarify that create*PipelineAsync must not generate a validation error #4714

  • CW: (In popErrorScope())
  • JB: Mozilla encountered a number of questions about the spec. One thing is that in the section of the spec called "promise ordering" it explicitly says that nothing is ordered with respect to anything, except for onSubmittedWorkDone() with respect to itself, and mapAsync() with respect to onSubmittedWorkDone(). Is it intentional that popErrorScope() isn't ordered with respect to create*PipelineAsync()? Those are the 2 channels the app can get errors out of pipeline creation.
  • KW: I’m not sure, but I believe we didn’t intend any pipeline creation errors to come from createPipelineAsync().
  • JB: Is it deliberate that createPipelineAsync produces validation errors? According to the spec, it does, because the spec delegates createXPipelineAsync to createXPipeline, which produces validation errors. But since the idea behind *Async is that you could kick it off onto a separate compilation thread, it would make more sense if the Async one is a free-running thing, and it’s difficult to place those errors in order with respect to other device timeline activity.
  • KN: I’m pretty sure the intent was that we’d catch any validation errors that came from there, and not let them escape from the async call. For mapAsync() we explicitly decided it would do both.
  • JB: Makes sense.
  • JB: Final question: in the specification for createPipelineAsync() there is a single block of device timeline steps. We have this thing that starts an async compilation - presumably running on a thread, but it's technically completely ordered with respect to other device timeline operations.
  • KN: It says "when pipeline is ready to be used" but it should say "when pipeline is ready to be used, go do these other timeline steps, which will then issue steps on the content timeline". It says "set an event handler; when this happens, go run the content timeline steps". I don't know if it makes sense that the device stuff should be split, but we could…
  • JB: Once we clarify that createRenderPipelineAsync does not raise validation errors, then its device timeline block doesn't have any side effects, so implementations are free to run it any time they like. Then the leeway in scheduling the subsequent content timeline steps is enough to permit off-thread implementations. (JimB: This isn't right - if two device timeline blocks each trigger a content timeline block, the content blocks must run in the same order that their device blocks ran in. So the createXPipelineAsync device timeline block must be split. I'll comment in the issue)
  • CW: Under the “as-if” rule - if the device is lost, all the steps might work, but you should still get a rejection.
  • KN: It might reject, rather.
  • CW: My understanding is that there's a clarification to be done, but also this issue is about putting the behavior up on the ballot again, of whether it should reject or not. Are we fine keeping it with just rejection?
  • JB: The promise has everything the caller needs - it shouldn’t also produce error scope errors
  • KN: It's simpler to implement and simpler to use.
  • CW: Okay.
  • JB: It’s a change in the spec.
  • CW: Okay.
  • KN: It’s a change to the spec and a clarification.
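
A small sketch of the behavior being clarified above, using the existing WebGPU API: a failed createRenderPipelineAsync() rejects with a GPUPipelineError, and, per this discussion, should not also surface a validation error to an enclosing error scope.

```ts
// Sketch: errors from async pipeline creation arrive only via the promise.
declare const device: GPUDevice;
declare const descriptor: GPURenderPipelineDescriptor;

device.pushErrorScope("validation");
try {
  const pipeline = await device.createRenderPipelineAsync(descriptor);
  // ... use pipeline ...
} catch (e) {
  if (e instanceof GPUPipelineError) {
    console.warn(`pipeline creation failed (${e.reason}): ${e.message}`);
  }
}
// Per the clarification above, this should resolve to null even if creation failed.
const scopeError = await device.popErrorScope();
```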

mapAsync early-reject doesn't generate a validation error #4690

  • CW: If you call mapAsync and there's a validation error, you get a promise rejection and an error in your error scope. But there's a way to make the promise reject without an error in the error scope, which is to early-reject the mapping. That's not consistent. What should we do?
  • KN: Since this was intentional that it should produce both types of errors, then both activities should do both
  • CW: The rationale for the error in the error scope for the validation is that a well-formed application should be able to use mapAsync without producing validation errors. If you do something wrong (mapped something too big) you get an error in the error scope. But cancelling a mapping is a correct, well-behaved thing to do. So it's unfortunate to ask apps to wrap things in an error scope for behavior that isn't a programming error, an OOM, or an internal error.
  • CW: Taking this perspective, it’s probably better to keep the consistency. It’s not a programmer error. It is a feature of the API that early-rejection can happen.
  • KN: Good point. There are possibly some cases where you might intentionally violate the other validation. Maybe you’d want to redundantly map something. But it would give you a new promise, so it would be messy if you wanted to do that. So it’s a pretty clean line to draw.
  • CW: If we decide to keep the spec as-is, maybe we can add a note to the spec, describing “this doesn’t do that because it’s not a programmer error”
  • KN: Yeah.
  • CW: Everyone okay with this?
  • <all>: 👍
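
A small sketch of the distinction the group settled on above, using the existing API: a validation failure of mapAsync() rejects the promise and surfaces a validation error, while cancelling a pending map (the "early reject") only rejects the promise.

```ts
// Sketch: early-rejecting a pending map does not produce an error-scope error.
declare const device: GPUDevice;

const buffer = device.createBuffer({
  size: 16,
  usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
});

device.pushErrorScope("validation");
const pending = buffer.mapAsync(GPUMapMode.READ);
buffer.unmap();                      // cancels the pending map: the promise rejects
pending.catch(() => { /* expected rejection; not a programming error */ });
// Per the decision above, the cancellation generated no validation error here.
const scopeError = await device.popErrorScope();
```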

Consider if there's a much better name for "image copies" #4573

  • KN: I refused to let this one go :) Finally came up with some new names. Table in the issue has the 6 new names. Used "data" to refer to buffer/texture data used in a copy. As opposed to a layout, which tells you the info without data. Basic thing - texelCopy, copyExternalImage.
  • KG: didn't have time to review this in Moz's internal discussions.
  • CW: come back to this later. Changing names in spec is easy; concerned about all impls having to update names. But maybe we shouldn't be concerned about this. I think the new names make sense.

Depth Bias does not affect line and point primitives. #4729

  • BJ: realized none of the APIs have depth bias affect lines or points, but this isn't reflected in the WebGPU spec. Vulkan spec says depth bias "may" affect these. 2 approaches: if we see a pipeline using line/point topology, zero out these values (Corentin's suggestion) so we can say these are ignored all the time. That way we don't break any existing code. If anyone is relying on the "may" in the Vulkan spec (hard to imagine), we could instead turn this into a validation error: when using line/point topology, the depth bias values in the pipeline have to be zero. My preference: these will be ignored all the time, so we can just zero them out.
  • GT: that error would help me know why my stuff wasn't working. :)
  • MM: there's a porting issue here. There'll be apps ported from other APIs to WebGPU, they might have this stuff set and expect it does nothing. So I think an error isn't the right thing. Browser could print a message to the console instead.
  • BJ: agree. This would've helped me find issues in my code too. Could print warning to console.
  • KG: porting libraries/efforts - not a big deal to just zero these things out. Better to be explicit.
  • KN: weak preference for a warning. Think a lot of our validation's onerous. If you set this thing and this other thing must not be set - not good - better to say, these things have to be set to zero. I'm OK either way.
  • BJ: also weak preference for the warning; could live with validation error. Like KG's suggestion to have the option to re-enable this in the future. Validation error would work for this - have a feature in the future, remove the validation error. If it's a warning - any migration pitfalls? Turn on this feature, something breaks that you didn't know was there before? The warning scenario would spam warnings to the console.
  • CW: argument to be able to re-enable this in the future is probably the deciding factor to raise a validation error here. Worth breaking 1-2 websites that use this.
  • MM: so today those websites would work OK on D3D and Metal, and sometimes on Vulkan? And we break them all? That sounds bad.
  • KG: think this isn't a very important use case. But would save developers like Gregg who were scratching their head.
  • CW: either warning or error fixes that. Bar for making breaking changes isn't that high right now while there aren't multiple WebGPU impls in the field. Chromium's OK making this change because this is a tiny change.
  • KN: we're removing requestAdapterInfo; that's bigger. More of a trial right now, "can we do this"? This one isn't major. We can collect telemetry, deprecate it. Think it's fine to do. And if we see lots of people using this, we'll know.
  • CW: OK to go with hard error for this?
  • <consensus: yes>
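
For reference, the pipeline state the decision applies to looks like the sketch below. Under the consensus above, a non-zero depthBias combined with a line or point topology would become a createRenderPipeline() validation error; that is the group's decision here, not necessarily shipped behavior.

```ts
// Sketch: depth bias state that would be rejected for line/point topologies.
declare const device: GPUDevice;
declare const shaderModule: GPUShaderModule;

const pipeline = device.createRenderPipeline({
  layout: "auto",
  vertex: { module: shaderModule },
  primitive: { topology: "line-list" },
  depthStencil: {
    format: "depth24plus",
    depthWriteEnabled: true,
    depthCompare: "less",
    depthBias: 2,              // would have to be 0 when topology is line/point
    depthBiasSlopeScale: 0,
    depthBiasClamp: 0,
  },
});
```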

(Bonus topic!) maxInterStageShaderComponents seems to overlap with maxInterStageShaderVariables #4688

  • CW: basically, why do we keep maxInterStageShaderVariables? Similar to earlier discussion, it's a breaking change. This one is scarier.
  • KN: might not be that bad. If we take it out, you'll get "undefined" in JS. Could say that if you pass "undefined" in the limit request, that's fine - make the removal process smoother. Would want to gather metrics for people accessing this - it's tricky because you'll access it if you just iterate the keys. Harder to do the deprecation metrics stuff.
  • CW: is deprecating this something the group would like to consider?
  • MW: my concern is carrying this forward for the next 20 years. :)
  • KG: could be better for sure.
  • MM: "unknown keys" is to prevent typos. Maybe a solution is "don't allow unknown keys except this one".
  • CW: if you set an unknown key to "undefined" it's an error. That's Kai's proposed spec tweak.
  • KN: believe we don't currently let you specify unknown keys with "undefined". I had a different proposal open that would have allowed it: https://github.com/gpuweb/gpuweb/pull/4613/files#diff-40cc3a1ba233cc3ca7b6d5873260da9676f6ae20bb897b62f7871c80d0bda4e9R2825
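
For illustration, the deprecation idea discussed above might allow code like the sketch below, where a removed limit key passed with the value `undefined` is simply ignored instead of causing a validation error. This is a proposal, not current spec behavior.

```ts
// Sketch only: tolerate a removed limit key whose value is `undefined`.
declare const adapter: GPUAdapter;

const device = await adapter.requestDevice({
  requiredLimits: {
    maxInterStageShaderVariables: 16,
    // Code that copies adapter.limits keys wholesale could end up passing
    // `undefined` for a removed limit; the proposal is to treat that as
    // "no requirement" rather than an error.
    maxInterStageShaderComponents: undefined,
  } as any,
});
```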

Hardware feature levels (and compat) #4656

Agenda for next meeting

  • (at some point) Presentation about GPU memory isolation from UCSC
  • Pacific time meeting?
    • Push constant proposal, other things
    • Next week?
    • <yes>
  • Any more agenda items, please add them here.
    • createTexture does not validate viewFormats against usage #4426
    • The bgra8unorm-storage feature should most likely only enable write-only access #4377