Proposal for buffer mapping via WebGPUMappedMemory
#49
Changes from all commits
726f0c2
3390f7d
9a86d48
2e6bd0f
845ed19
90c7764
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,150 @@ | ||
| # Buffer operations | ||
|
|
||
| This describes operations that are done on `WebGPUBuffer` objects directly and are not buffered inside a `WebGPUCommandBuffer`. | ||
| The two primitives we need to support are the CPU writing data inside the buffer for use by the GPU (upload) and the CPU reading data produced by the GPU (readback). | ||
|
|
||
| Design constraints are: | ||
|
|
||
| - For the portability of the API, prevent data races between the CPU and the GPU. | ||
| - For performance, minimize the number of times the data is copied around. | ||
| - To make the API non-blocking, only allow asynchronous readbacks. | ||
| - For performance on multi-process implementations, make an asynchronous upload path. | ||
|
|
||
| ## Buffer mapping | ||
|
|
||
| ### map[Write|Read] and unmap | ||
|
|
||
| The way to have the minimal number of copies for upload and readback is to provide a buffer mapping mechanism. | ||
| This mechanism has to be asynchronous to ensure the GPU is done using the buffer before the application can look into the ArrayBuffer. | ||
| Otherwise, on implementations where the ArrayBuffer points directly at the buffer memory, data races between the CPU and the GPU could occur. | ||
|
|
||
| We want the status of a map operation to act both as a promise and as something pollable, since each has its advantages. | ||
| `WebGPUMappedMemory` is an object that is `then`-able, meaning that it acts like a JavaScript `Promise`, but is pollable at the same time. | ||
|
|
||
| The mapping operations for `WebGPUBuffer` are: | ||
|
|
||
| ``` | ||
| partial interface WebGPUBuffer { | ||
| WebGPUMappedMemory mapWrite(u32 offset, u32 size); | ||
| WebGPUMappedMemory mapRead(u32 offset, u32 size); | ||
| }; | ||
| ``` | ||
|
|
||
| These operations return new `WebGPUMappedMemory` objects representing the requested range of the buffer for writing or reading. | ||
| The results are initialized in the "pending" state and transition to the "available" state at a JavaScript task boundary, once the implementation can determine that the GPU is done using the buffer. | ||
| Calling `mapRead` or `mapWrite` puts the buffer in the mapped state. | ||
|
Contributor
What?
Contributor
See #49 (comment). mapRead/mapWrite move the WebGPUBuffer object from Unmapped to Mapped. They return a new WebGPUMappedMemory object which starts in the Pending state. |
||
| No operations are allowed on a buffer in that state except additional calls to `mapRead` or `mapWrite` and calls to `unmap`. | ||
| In particular a mapped buffer cannot be used in a `WebGPUCommandBuffer` given to `WebGPUQueue.submit`. | ||
| The following must be true or a validation error occurs for `mapWrite` (resp. `mapRead`): | ||
|
|
||
| - The buffer must have been created with the `WebGPUBufferUsage.MAP_WRITE` (resp. `WebGPUBufferUsage.MAP_READ`) usage. | ||
| - `offset + size` must not overflow and must be at most the size of the buffer. | ||
|
Contributor
What about overlaps of previous mappings of the same resource?
Contributor
This is still unresolved; we started discussing it here: #49 (comment), at least. |
||
| - Depending on the design of memory barriers, the buffer must be in, or be allowed to be in, the `WebGPUBufferUsage.MAP_WRITE` (resp. `WebGPUBufferUsage.MAP_READ`) usage. | ||
|
|
||
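The validation rules above could be checked along these lines. This is only a sketch: the numeric flag values, the `{ usage, size }` buffer shape and the `validateMapRange` helper are hypothetical and not part of the proposal, and the memory-barrier rule is left out because its design is still open.

```javascript
// Hypothetical flag values; the proposal does not fix a numeric encoding.
const WebGPUBufferUsage = { MAP_READ: 1, MAP_WRITE: 2 };
const U32_MAX = 0xffffffff;

// Returns null when the call would be valid, otherwise a description of the
// validation error. `buffer` is a plain { usage, size } object (an assumption).
function validateMapRange(buffer, requiredUsage, offset, size) {
  if ((buffer.usage & requiredUsage) === 0) {
    return "buffer was not created with the required MAP usage";
  }
  const isU32 = (v) => Number.isInteger(v) && v >= 0 && v <= U32_MAX;
  if (!isU32(offset) || !isU32(size)) {
    return "offset and size must be u32 values";
  }
  // JavaScript numbers do not wrap at 2^32, so the u32 overflow is tested explicitly.
  if (offset + size > U32_MAX) {
    return "offset + size overflows u32";
  }
  if (offset + size > buffer.size) {
    return "range exceeds the buffer size";
  }
  return null;
}
```

For instance, `validateMapRange({ usage: WebGPUBufferUsage.MAP_WRITE, size: 256 }, WebGPUBufferUsage.MAP_WRITE, 128, 129)` reports a range error because the range ends one byte past the buffer.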
| Then a mapped buffer can be unmapped with: | ||
|
|
||
| ``` | ||
| partial interface WebGPUBuffer { | ||
| void unmap(); | ||
|
Contributor
This doesn't seem to appear in the idl file.
Contributor
Why is this on the buffer and not on the
Contributor
Author
Double checked, this correctly appears in the IDL or am I missing something? Our original idea had both
Contributor
I don't think WebGPUMappedMemory.unmap is needed if we do allow overlapping mappings. |
||
| }; | ||
| ``` | ||
|
|
||
| This operation invalidates all the `WebGPUMappedMemory` objects created from the buffer and puts the buffer in the unmapped state. | ||
| The buffer must be in the mapped state, otherwise a validation error occurs when `unmap` is called. | ||
|
|
||
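The interplay between the buffer's mapped/unmapped state and the states of its mappings can be sketched with a small model. Everything here is hypothetical: `BufferModel`, `MappedMemoryModel` and the explicit `gpuDone()` call (standing in for the task boundary at which the implementation knows the GPU is done) are illustration only, and usage and range validation is omitted.

```javascript
// Hypothetical model of the states described above; not a real implementation.
class MappedMemoryModel {
  constructor(size) {
    this.state = "pending"; // "pending" -> "available" -> "invalidated"
    this.data = new ArrayBuffer(size);
  }
}

class BufferModel {
  constructor(size) {
    this.size = size;
    this.mapped = false;
    this.mappings = [];
  }
  // Stands in for both mapWrite and mapRead.
  map(offset, size) {
    const memory = new MappedMemoryModel(size);
    this.mappings.push(memory);
    this.mapped = true; // the buffer enters the mapped state immediately
    return memory;
  }
  // Simulates the task boundary at which the GPU is known to be done.
  gpuDone() {
    for (const m of this.mappings) {
      if (m.state === "pending") m.state = "available";
    }
  }
  unmap() {
    if (!this.mapped) throw new Error("validation error: buffer is not mapped");
    for (const m of this.mappings) m.state = "invalidated";
    this.mappings = [];
    this.mapped = false;
  }
}
```

In this model a mapping requested after `gpuDone()` starts out pending again, and a single `unmap()` invalidates every outstanding mapping regardless of its state.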
| ### WebGPUMappedMemory | ||
|
|
||
| `WebGPUMappedMemory` is an object representing a mapped region of a buffer that's both pollable and promise-like. | ||
|
|
||
| It can be in one of three states: pending, available and invalidated. | ||
|
Contributor
Nit: On line 52, you mention that calling unmap puts the buffer into the "unmapped" state. However, here, "unmapped" is not listed as one of the three states.
Contributor
Author
These are the states for the WebGPUMappedMemory. Mapped vs. unmapped is a WebGPUBuffer state. See the state diagrams linked in the other comment. |
||
|
|
||
| The pollable interface is: | ||
|
|
||
| ``` | ||
| partial interface WebGPUMappedMemory { | ||
| bool isPending(); | ||
| ArrayBuffer getPointer(); | ||
| }; | ||
| ``` | ||
|
|
||
| `isPending` returns true if the object is in the pending state, false otherwise. | ||
|
Contributor
is this method useful, considering that one can just compare
|
||
| `getPointer` returns an ArrayBuffer representing the buffer data if the object is in the available state, null otherwise. | ||
|
|
||
| `WebGPUMappedMemory` is also `then`-able, meaning that it acts like a JavaScript `Promise`: | ||
|
|
||
| ``` | ||
| partial interface WebGPUMappedMemory { | ||
| Promise then(WebGPUMappedMemorySuccessCallback success, | ||
|
Contributor
Can't we make it extend Promise or something? Instead of copying a signature from Promise
Contributor
I have been thinking about this. My first pass at the idea was that it may not be possible, because Promise is not an interface. But I didn't try running it through a validator.
Contributor
My 30 second exploration seems to indicate it doesn't work. |
||
| optional WebGPUMappedMemoryErrorCallback error); | ||
| }; | ||
| ``` | ||
|
|
||
| This acts like a `Promise<ArrayBuffer>.then` that is resolved on the JavaScript task boundary in which the implementation detects the GPU is done with the buffer. | ||
|
Contributor
What happens when you mix these APIs? Like if you call map inside the .then() block?
Contributor
Author
Sure, I'm not sure what the problem is since WebGPUBuffer.map will return to you a new WebGPUMappedMemory that cannot be resolved before the end of this task.
Contributor
The semantics here ought to be very similar to Promise's. @litherum, if you thought of a particular construction that might be ambiguous, can you provide a simple example? |
||
| On that boundary: | ||
|
|
||
| - The `WebGPUMappedMemory` enters the available state. | ||
| - If the `WebGPUMappedMemory` was created via `WebGPUBuffer.mapWrite`, its content is cleared to 0. | ||
| - `success` is called with the content of the memory as an argument. | ||
|
Contributor
What is the rationale for having the Promise give you an ArrayBuffer in the writing (mapWrite or setSubData) case? When would I want my own data given back to me?
Contributor
Author
The
Contributor
Apologies if I didn't make myself clear in what I was asking. Suppose I call mapWrite, .then the promise and return control back to Javascript. When the promise resolves, what am I meant to do with the Similar question applies to the readback case. Will the Promise
Contributor
Author
Yes both the promise's |
||
|
|
||
| If `success` hasn't been called when the WebGPUMappedMemory gets invalidated (meaning the object is still in the pending state), `error` is called instead. When `WebGPUMappedMemory` goes from the available state to the invalidated state, the `ArrayBuffer` for its content gets neutered. The return value of `then` acts like the return value of `Promise.then`. | ||
|
Contributor
What circumstances would cause this to occur? Why wouldn't it just wait forever until the buffer becomes available? |
||
|
|
||
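An object that is both pollable and `then`-able, as described above, could be mocked as follows. This is a sketch under stated assumptions: `ThenableMappedMemory` and its `_makeAvailable`/`_invalidate` hooks are hypothetical names for what the implementation would do internally, and detaching ("neutering") the `ArrayBuffer` on invalidation is not modeled.

```javascript
// Hypothetical mock of an object that is pollable and then-able at once.
class ThenableMappedMemory {
  constructor(size) {
    this._state = "pending";
    this._data = new ArrayBuffer(size); // ArrayBuffers start zero-filled
    this._promise = new Promise((resolve, reject) => {
      this._resolve = resolve;
      this._reject = reject;
    });
    // Keep a rejection from being reported as unhandled when no error
    // callback was attached.
    this._promise.catch(() => {});
  }

  // Pollable interface.
  isPending() { return this._state === "pending"; }
  getPointer() { return this._state === "available" ? this._data : null; }

  // Promise-like interface; the return value behaves like Promise.then's.
  then(success, error) { return this._promise.then(success, error); }

  // Would be called by the implementation on the task boundary.
  _makeAvailable() {
    if (this._state !== "pending") return;
    this._state = "available";
    this._resolve(this._data);
  }

  // Would be called when the owning buffer is unmapped.
  _invalidate() {
    if (this._state === "pending") {
      this._reject(new Error("invalidated while pending"));
    }
    this._state = "invalidated";
  }
}
```

Because JavaScript's `await` accepts any object with a `then` method, such an object can be consumed with `const bytes = await memory;`, while a main-loop-based engine can instead check `isPending()` once per frame and call `getPointer()` when it returns false.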
|
Contributor
Using promises for buffers that change state is tricky because you can't change the state of a promise once it has been resolved. A promise can have multiple What is the validity lifetime of the ArrayBuffers given to promise revolves in the multiple
Contributor
Author
Thanks, I now remember we mentioned something like that in a meeting. A potential solution was for
Contributor
It can also behave like
Contributor
Although if
Contributor
My question was more about what "validity" means in the context of buffers and the event loop. For readback, if multiple pieces of code
Contributor
It never happens automatically, only on
Contributor
Thank you for the clarification, @kainino0x . That means if the developer calls
Contributor
Perhaps we could sidestep this problem by using callbacks directly. There is no timing problem because the earliest the callback could be run is at the next micro task boundary. It also solves the problem of code reusability (due to Promises not being able to be re-used once they are fulfilled).
Contributor
I think there would be pushback from the TAG, but it's a possibility. |
||
| The `ArrayBuffer` of a `WebGPUMappedMemory` created from a `mapWrite` is where the application should write the data and its content is made available to the buffer when the `WebGPUMappedMemory` is invalidated (i.e. `WebGPUBuffer.unmap` is called). | ||
|
|
||
| ## Immediate data upload | ||
|
|
||
| Buffer mapping is the path with the fewest copies, but it is often useful to upload data to a buffer *right now*, if only for debugging. | ||
| A `WebGPUBuffer` operation is provided that takes an ArrayBuffer and copies its content at an offset in the buffer. | ||
|
|
||
| ``` | ||
| partial interface WebGPUBuffer { | ||
| void setSubData(ArrayBuffer data, u32 offset); | ||
| }; | ||
| ``` | ||
|
Contributor
Can a buffer be TRANSFER_DST as well as MAP_WRITE? If so, what happens if web developers call To keep things simple for MVP, might be good to have buffers be either MAP_* or TRANSFER_* but not both.
Contributor
Author
A buffer cannot be While the buffer is mapped everything is disallowed but You really need to allow buffers that are |
||
|
|
||
| This operation acts as if it was done after all previous "device-level" commands and before all subsequent "device-level" commands. | ||
|
How does it know what commands are "previous" and what are "subsequent"?
Contributor
Author
Here previous and subsequent refer to the order of the calls in the Javascript program order.
I think that's not very well-defined if we'll allow submission of commands from multiple workers. IIRC, Vulkan does such "in-place" uploads via command buffers. Doing them the same way would make it a bit clearer what commands are before the upload and what are after (IIUC, not with multiple queues, though).
Contributor
Can we associate this with a particular point in the queue, perhaps by passing the queue as an argument, or making this a member function on the queue? It would make it clear what data would get clobbered and what wouldn't. Doing this would also solve the problem of what happens when this is called during the time when a buffer is mapped. We've heard from Metal developers that submitting commands in an order different than they were recorded is an important use-case. |
||
| "Device-level" commands are all commands not buffered in a `WebGPUCommandBuffer`, and include `WebGPUQueue.submit`. | ||
|
Contributor
instead of inventing a new "Device level" term, we could just say that the operation acts as if it's done after all device/queue level commands.
Contributor
Author
It is also done after previous
Contributor
or we can move |
||
| The content of `data` is only read during the call and can be modified by the application afterwards. | ||
| The following must be true or a validation error occurs: | ||
|
|
||
| - The buffer must have been created with the `WebGPUBufferUsage.TRANSFER_DST` usage flag. | ||
|
Contributor
Your document talks about
Contributor
Author
Any
Contributor
OK. |
||
| - `offset + data.byteLength` must not overflow and must be at most the size of the buffer. | ||
| - Depending on the design of memory barriers, the buffer must be in, or be allowed to be in, the `WebGPUBufferUsage.TRANSFER_DST` usage. | ||
|
Contributor
similar concern here about the mismatch of usage versus state concepts. Are they merged in NXT?
Contributor
Author
No but they are often confused there's the creation usage and the usage "state" (there's other state like the mapped state). So we never use the word "state" for usage, only "usage". A resource is in a usage means it has that usage "state" value. We obviously didn't spend enough time figuring out the naming of things so this is a bit confusing. I think the intent is clear here, what do you think of deferring the re-wording to when we decide what our memory barriers look like (if any)?
Contributor
I do see a bit of elegance in merging usages and access flags, but it's also confusing to differentiate is/has...
Contributor
I agree with @kvark. The distinction confused me as well. Can buffers only be in more than one "is in" usage at a time? If so, then that usage should be a separate type that is not a bitfield. That makes it clear to the person calling functions that take "is in" usage that only one of them is valid. If the set of "has a" usages contains members that are not valid for "is in" usage or vice versa, then that's also a reason to have separate types. The description of "has a" vs. "is in" usages would be great explanatory text for the spec. |
||
| - In particular the buffer must not be currently mapped. | ||
|
|
||
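The copy-at-call-time behavior of `setSubData` can be illustrated with a stand-in where a plain `ArrayBuffer` plays the role of GPU memory. The free function `setSubData(buffer, data, offset)` and the `{ mapped, storage }` buffer shape are hypothetical, and the usage-flag and memory-barrier checks are omitted.

```javascript
// Hypothetical stand-in: `storage` plays the role of the buffer's GPU memory.
function setSubData(buffer, data, offset) {
  if (buffer.mapped) {
    throw new Error("validation error: buffer is mapped");
  }
  if (offset + data.byteLength > buffer.storage.byteLength) {
    throw new Error("validation error: range exceeds the buffer size");
  }
  // The bytes are copied during the call; `data` is not retained afterwards.
  new Uint8Array(buffer.storage).set(new Uint8Array(data), offset);
}

const buffer = { mapped: false, storage: new ArrayBuffer(8) };
const data = new Uint8Array([1, 2, 3, 4]);
setSubData(buffer, data.buffer, 4);

// Modifying the source after the call does not affect the buffer contents:
// bytes 4..7 of `buffer.storage` still hold 1, 2, 3, 4.
data[0] = 99;
```

Since the source is read only during the call, the application can reuse or overwrite `data` immediately, at the cost of the extra copy that buffer mapping avoids.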
| ## Unused designs | ||
|
|
||
| ### Persistently mapped buffer | ||
|
|
||
| Persistently mapped buffers are buffers whose mapping can be kept by the application while the buffer is in use by the GPU. | ||
| We didn't find a way to have persistently mapped buffers and at the same time keep things data race free between the CPU and GPU. | ||
|
Contributor
If this section will be committed, you should state that "we" is "The WebGPU Community Group" |
||
| Being data race free would be possible if ArrayBuffer could be unneutered but this is not the case. | ||
|
|
||
| ### Promise<ArrayBuffer> readback(); | ||
|
|
||
| This didn't have a pollable interface and forced an extra buffer-to-buffer copy to occur if the GPU execution could be resumed immediately. | ||
|
|
||
| ### NXT's MapReadAsync(callback); | ||
|
|
||
| Not a pollable interface. | ||
|
Contributor
Probably is worth describing why a pollable interface is a requirement (something I'm still fuzzy on myself)
Contributor
One major reason is that Promises don't exist in WebAssembly, and there are costs (latency and extra JS VM spinup) to converting Promises into something that's pollable by WebAssembly. Another is that the majority of game engine architectures are main-loop-based. Even in JS, converting Promises into pollable interfaces incurs latency for engines.
Contributor
(Something more complete than my comment should be written and included in the markdown.) |
||
|
|
||
| ## Issues | ||
|
|
||
| ### GC discoverability | ||
|
|
||
| It isn't clear yet what happens when a buffer gets garbage collected while it is mapped. | ||
| The simple answer is that the `WebGPUMappedMemory` objects get invalidated but that would allow the application to discover when the GC runs. | ||
|
Contributor
Wouldn't the simpler answer be "don't collect the object?" Either by bumping a reference count upon map, or by pinning the object?
Contributor
Good point, don't know why I didn't think of it.
Contributor
Author
Having buffer as an attribute of the WebGPUMappedMemory would work and is pretty simple 👍 |
||
|
|
||
|
Contributor
To avoid GC discoverability, the implementation of
Contributor
Author
That's one way, and the application can make sure the Do you think there's something we could do along the lines of allowing the buffer and the expensive BAR0 allocation to disappear but making the buffer replace the
Contributor
I'm pretty sure the pointer isn't mutable. |
||
| ### GC pressure | ||
|
|
||
| The `WebGPUMappedMemory` design makes each mapped region create two garbage collected objects. This could lead to some GC pressure. | ||
|
|
||
| ### Side effects between mapped memory regions | ||
|
|
||
| What happens when the regions of two `WebGPUMappedMemory` objects overlap in the buffer? | ||
| Are writes made through one visible through the other? | ||
| If they are, maybe `WebGPUMappedMemory.getPointer` should return an `ArrayBufferView` instead. | ||
|
|
||
|
Contributor
One of the goals of mapping subregions of the buffer is to avoid having to ask for the whole thing from the underlying API. Since you can ask for the original
Contributor
Author
Thanks I didn't know that. This might mean that we need to require mapped ranges to be disjoint but that's a pretty big restriction. @jdashg's idea of more explicit "client buffers" become more attractive. |
||
| ### Interactions with workers | ||
|
|
||
| Can a buffer be mapped in multiple different workers? | ||
|
Contributor
We should probably have a (another?) larger discussion about workers.
Contributor
Author
Yes |
||
| If that's the case, the pointer should be represented with a `SharedArrayBuffer`. | ||
For this or a subsequent pull request, I suggest making a state diagram that illustrates all of the states a WebGPUBuffer can be in (available, pending, mapped, unmapped) along with the operations that move it between states (calling unmap, calling map, GPU writes finished, JavaScript task boundary, etc).

See the state diagrams in this doc. I could include them here but they assume D3D12 style memory barriers.