YADUP (Yet Another Data Upload Proposal)
At the last meeting there seemed to be appetite for asynchronous mapping that would allow requesting subranges, but the group wanted to see a more fleshed-out proposal.
This version of data upload is very similar to what we have in the spec today with mapWriteAsync and mapReadAsync, but resolving the mapping promise doesn't give an ArrayBuffer. Instead, the ArrayBuffer is stored in an internal slot of the GPUBuffer, and a GPUBuffer.getMappedRange method allows getting subranges of that internal ArrayBuffer.
This is close to @kainino0x's old GPUMappedMemory idea.
Proposal
partial interface GPUBufferUsage {
const GPUBufferUsageFlags MAP_READ = 0x0001;
const GPUBufferUsageFlags MAP_WRITE = 0x0002;
};
partial interface GPUBuffer {
Promise<void> mapAsync();
ArrayBuffer getMappedRange(unsigned long offset = 0, unsigned long size = 0);
void unmap();
};
partial dictionary GPUBufferDescriptor {
boolean mappedAtCreation = false;
};
Calling GPUBuffer.mapAsync is an error if the buffer is not valid or if it is not in the "unmapped" state (which means it is not destroyed either). Upon error, mapAsync returns a promise that will reject. Upon success, mapAsync puts the buffer in the "mapping" state and returns a promise that, when it resolves, will put the buffer in the "mapped" state.
Calling GPUBuffer.getMappedRange when the buffer is not in the "mapped" state returns null. When called in the "mapped" state, it returns a new ArrayBuffer that's a view into the content of the buffer at range [offset, offset + size) (a bad range fails the range check and throws a JS exception). size and offset default to 0, and a size of 0 means the remaining size of the buffer after offset, so buffer.getMappedRange() with no arguments returns the whole range.
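The offset/size defaulting can be sketched as a small helper. This is only an illustration: resolveMappedRange is a hypothetical name that is not part of the proposed IDL, and treating an offset equal to the buffer size as a valid (empty) range is an assumption the proposal does not settle.

```typescript
// Hypothetical helper (not part of the proposal's IDL): resolves the
// (offset, size) arguments of getMappedRange to a half-open byte range.
function resolveMappedRange(
  bufferSize: number,
  offset: number = 0,
  size: number = 0,
): { start: number; end: number } {
  if (offset < 0 || size < 0 || offset > bufferSize) {
    throw new RangeError("offset/size out of bounds");
  }
  // A size of 0 means "everything that remains after offset".
  const effectiveSize = size === 0 ? bufferSize - offset : size;
  if (offset + effectiveSize > bufferSize) {
    throw new RangeError("range exceeds buffer size");
  }
  return { start: offset, end: offset + effectiveSize };
}
```

For a 256-byte buffer, resolveMappedRange(256) covers bytes 0 to 256 and resolveMappedRange(256, 64) covers bytes 64 to 256.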
Calling GPUBuffer.unmap is an error if the buffer is not valid or if it is in the unmapped state. On success:
- if the buffer is in the "mapping" state, then the promise is rejected and the buffer is put in the "unmapped" state
- if the buffer is in the "mapped" state, all ArrayBuffers returned by GPUBuffer.getMappedRange() are detached and the buffer is put in the "unmapped" state
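The state transitions described for mapAsync and unmap can be modeled roughly as follows. This is a sketch under assumptions, not the proposed implementation: ModelBuffer and completeMapping() are hypothetical names, completeMapping() stands in for the browser finishing the map, buffer validity and destruction are left out, and the "error" in unmap is modeled as a thrown exception.

```typescript
type MapState = "unmapped" | "mapping" | "mapped";

// Illustrative model only; not part of the proposal's IDL.
class ModelBuffer {
  state: MapState = "unmapped";
  private pending: { resolve: () => void; reject: (e: Error) => void } | null = null;

  mapAsync(): Promise<void> {
    if (this.state !== "unmapped") {
      // Error case: the returned promise rejects, the state is unchanged.
      return Promise.reject(new Error(`cannot map a buffer in state "${this.state}"`));
    }
    this.state = "mapping";
    return new Promise<void>((resolve, reject) => {
      this.pending = { resolve, reject };
    });
  }

  // Simulates the implementation completing the mapping request.
  completeMapping(): void {
    if (this.state !== "mapping" || this.pending === null) return;
    this.state = "mapped";
    this.pending.resolve();
    this.pending = null;
  }

  unmap(): void {
    if (this.state === "unmapped") {
      throw new Error("cannot unmap a buffer that is already unmapped");
    }
    if (this.state === "mapping" && this.pending !== null) {
      // Unmapping while the map is still pending rejects the promise.
      this.pending.reject(new Error("unmap() called before mapping completed"));
    }
    // In the "mapped" state a real implementation would also detach every
    // ArrayBuffer handed out by getMappedRange() at this point.
    this.pending = null;
    this.state = "unmapped";
  }
}
```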
Note that modifications to the content of ArrayBuffer returned by getMappedRange are semantically modifications of the content of the buffer itself.
Calling GPUDevice.createBuffer with descriptor.mappedAtCreation can be done even if descriptor.usage doesn't contain the MAP_READ or MAP_WRITE flags. If mappedAtCreation is true, the buffer is created in the "mapped" state and its content can be modified before unmap() and other uses, such as in a queue.submit().
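A possible upload helper built on mappedAtCreation might look like the sketch below. BufferLike, DeviceLike and createBufferWithData are hypothetical names introduced here so the example type-checks without WebGPU typings; they mirror only the members this proposal describes.

```typescript
// Minimal local stand-ins for the proposed interfaces (hypothetical names).
interface BufferLike {
  getMappedRange(offset?: number, size?: number): ArrayBuffer | null;
  unmap(): void;
}
interface DeviceLike {
  createBuffer(desc: { size: number; usage: number; mappedAtCreation?: boolean }): BufferLike;
}

// Creates a buffer that starts out mapped, fills it, then unmaps it so it can
// be used in a queue.submit(). `usage` need not contain MAP_READ or MAP_WRITE.
function createBufferWithData(device: DeviceLike, data: Uint8Array, usage: number): BufferLike {
  const buffer = device.createBuffer({ size: data.byteLength, usage, mappedAtCreation: true });
  const range = buffer.getMappedRange(); // defaults select the whole buffer
  if (range === null) {
    throw new Error("buffer was expected to be in the mapped state");
  }
  new Uint8Array(range).set(data); // writes land in the buffer's content
  buffer.unmap();
  return buffer;
}
```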
As usual, other uses of GPUBuffer like in a GPUQueue.submit() would validate that the buffer is in the "unmapped" state. And similar to other proposals there would be restrictions on the usages that can be used in combination with MAP_READ and MAP_WRITE. Contrary to other proposals MAP_READ and MAP_WRITE could be set at the same time, and I suggest the following rules:
- If MAP_WRITE is present, COPY_SRC is allowed.
- If MAP_READ is present, COPY_DST is allowed.
- If MAP_READ and MAP_WRITE are present, then both COPY_SRC and COPY_DST are allowed.
- (example for a UMA feature) if the adapter is UMA, then if MAP_WRITE is present, then VERTEX and UNIFORM are also allowed.
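The suggested rules could be validated along these lines. In this sketch only the MAP_READ and MAP_WRITE values come from the IDL above; the other flag values and the isValidMappableUsage helper are assumptions made for illustration.

```typescript
// MAP_READ and MAP_WRITE values come from the proposal; the remaining flag
// values are assumptions chosen for this sketch.
const MAP_READ = 0x0001;
const MAP_WRITE = 0x0002;
const COPY_SRC = 0x0004;
const COPY_DST = 0x0008;
const VERTEX = 0x0020;
const UNIFORM = 0x0040;

// Hypothetical validator for the usage rules listed above.
function isValidMappableUsage(usage: number, adapterIsUMA: boolean): boolean {
  const mapBits = usage & (MAP_READ | MAP_WRITE);
  if (mapBits === 0) return true; // the rules only restrict mappable buffers

  let allowed = 0;
  if ((usage & MAP_WRITE) !== 0) allowed |= COPY_SRC;
  if ((usage & MAP_READ) !== 0) allowed |= COPY_DST;
  if (adapterIsUMA && (usage & MAP_WRITE) !== 0) allowed |= VERTEX | UNIFORM;

  // Every non-mapping usage bit must be in the allowed set.
  const otherBits = usage & ~(MAP_READ | MAP_WRITE);
  return (otherBits & ~allowed) === 0;
}
```

The third rule needs no special case: when both map flags are set, the first two rules together already allow COPY_SRC and COPY_DST.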
This mapping mechanism would live side-by-side with a writeToBuffer path.
There are also threading constraints: all calls to getMappedRange and unmap() must be in the same worker so that the ArrayBuffers can be detached.
Alternative choices
A single mapAsync is present instead of mapWriteAsync and mapReadAsync. The proposal talks about the ArrayBuffer being the content of the GPUBuffer directly, so it was a bit weird to have two map functions. The downside is that if the implementation can't wrap shmem in a GPU resource:
- either a copy will have to take place on unmap() even for MAP_READ buffers to update the content with writes the application did in the ArrayBuffer
- or range-tracking needs to happen for MAP_READ buffers so the implementation knows what to overwrite
It could be possible to not return a promise from mapAsync and instead make the GPUBuffer itself act like a promise with a .then method and maybe a synchronous "state" member.
The assumption is that multi-process browsers will allocate one large shmem corresponding to the whole size of mapped buffers, so multiple ArrayBuffers could look at the same memory and overlap. If we don't want to force one large contiguous allocation, getMappedRange could enforce that the ranges are all disjoint between calls to unmap.
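If the disjoint-range option were chosen, the check could be as simple as the sketch below. MappedRangeTracker is a hypothetical name, and throwing a RangeError on overlap is an assumption; the proposal does not say how the failure would be reported.

```typescript
// Hypothetical tracker: rejects a getMappedRange request that overlaps a
// range already handed out since the last unmap().
class MappedRangeTracker {
  private ranges: Array<{ start: number; end: number }> = [];

  // Records the half-open range [offset, offset + size); throws on overlap.
  claim(offset: number, size: number): void {
    const start = offset;
    const end = offset + size;
    for (const r of this.ranges) {
      if (start < r.end && r.start < end) {
        throw new RangeError(`range [${start}, ${end}) overlaps [${r.start}, ${r.end})`);
      }
    }
    this.ranges.push({ start, end });
  }

  // Called on unmap(): all outstanding ranges are released.
  reset(): void {
    this.ranges = [];
  }
}
```

A linear scan is enough for a handful of ranges per mapping; an implementation expecting many small ranges might prefer a sorted structure.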