This is based on #700, additional insights that @shaoboyan provided offline, and conversations with many other participants in the group (@jdashg, @kenrussell, @kvark, @austinEng, etc.). It is also related to #625.
Rationale
It is important that WebGPU is able to interoperate nicely with the rest of the Web platform, and in many cases that means sharing image data with other Web platform objects. Copying image data is expensive, especially if done every frame, so we should have an efficient way to get image data in (and maybe out) of WebGPU. There are many use cases:
- Compositing a video stream from an `HTMLVideoElement` (or WebRTC?) with additional effects using WebGPU.
- Running compute-shader based machine learning pipelines on a video stream.
- Rendering the UI of your WebGPU application using 2D canvas and compositing it with the rest of the 3D scene.
- Importing data from `HTMLImageElement` to load the textures for your application (although that's not done every frame).
- Interoperating between WebGL and WebGPU inside the same application by sharing images (via `HTMLCanvasElement` or a new mechanism to be invented).
A long time ago the group agreed that a path forward could be `GPUQueue.copyImageBitmapToTexture`, which is like a `copyTextureToTexture` except that one of the textures is actually an `ImageBitmap`. Part of the idea was that `ImageBitmap` would eventually represent resources that are most probably resident on the GPU. This isn't really the case in Firefox, which doesn't have any `ImageBitmap` backed by GPU memory at this time, nor in Chromium, where a lot of fallbacks happen that cause the `ImageBitmap` to be CPU-backed (and when on the GPU it might be on different devices or on different APIs). I don't know how `ImageBitmaps` work in Safari.
So if `ImageBitmaps` were always backed by GPU memory that can be used for an efficient texture-to-texture copy to a `GPUTexture` (on its specific GPU device in multi-GPU systems), and if producing `ImageBitmaps` didn't involve copies (because the source was already on the GPU, so it would just take a reference), then `copyImageBitmapToTexture` would always be a one-copy path, which can sometimes copy through a shader to take care of y-flip or un/premultiplying the alpha channel.
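For illustration, this is roughly what that path looks like per frame from JavaScript (a sketch only; `queue` is assumed to be the device's `GPUQueue` and the dictionary shapes are approximate):

```js
// Sketch: bring a video frame into WebGPU through the ImageBitmap path.
// (Inside an async function.)
const bitmap = await createImageBitmap(video);
const texture = device.createTexture({
    size: [bitmap.width, bitmap.height, 1],
    format: 'rgba8unorm',
    usage: GPUTextureUsage.COPY_DST | GPUTextureUsage.SAMPLED,
});
queue.copyImageBitmapToTexture(
    { imageBitmap: bitmap },                   // source view
    { texture },                               // destination view
    [bitmap.width, bitmap.height, 1]);         // copy size
```

Every step in this chain that can't reuse GPU memory (creating the `ImageBitmap`, then the copy itself) potentially adds a full-frame copy.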
@shaoboyan spent a lot of time studying how to implement an efficient `copyImageBitmapToTexture` and ran into many blockers:
- `ImageBitmap` is supposed to take a snapshot of its source "for free" but this isn't always the case, and it can actually be very difficult for things like video decoder outputs.
- `ImageBitmap` might not be allocated in a way that makes it possible for the WebGPU implementation to copy from it directly (e.g. on the CPU, on a different GPU, or on the same GPU with an incompatible driver).
- The content of the `ImageBitmap` is immutable but it can be hard to ensure it is readable by multiple users of it at once (for example the DOM and `copyImageBitmapToTexture`).
Allowing more than `ImageBitmap`
WebGL can do an operation similar to `copyImageBitmapToTexture` by passing various types of objects to the `gl.texImage2D` family of operations. It uses the concept of a `TexImageSource` that's defined like this:
```webidl
typedef (ImageBitmap or
         ImageData or
         HTMLImageElement or
         HTMLCanvasElement or
         HTMLVideoElement or
         OffscreenCanvas) TexImageSource;
```
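For comparison, a WebGL application typically re-uploads such a source every frame like this (standard WebGL calls; `gl`, `texture`, and `video` are assumed to already be set up):

```js
// Upload the current HTMLVideoElement frame into a WebGL texture.
gl.bindTexture(gl.TEXTURE_2D, texture);
gl.pixelStorei(gl.UNPACK_FLIP_Y_WEBGL, true);
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);
```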
As discussed above, forcing data to go through `ImageBitmap` can often add an extra copy, so we could add direct copies from other texture sources. WebGPU likely doesn't need to support the full breadth of these types (for example `ImageBitmap`, `ImageData`, and `HTMLImageElement` could just be `ImageData`), but it seems important to have direct support for objects producing data every frame. This means that the prototype for `copyImageBitmapToTexture` becomes:
```webidl
typedef (ImageBitmap or
         HTMLCanvasElement or
         HTMLVideoElement or
         OffscreenCanvas) GPUTextureSource;

dictionary GPUTextureSourceView {
    required GPUTextureSource source;
    GPUOrigin2D origin = {};
    // Other arguments like Y-flip, un/premultiply alpha?
};

partial interface GPUQueue {
    void copyFromTextureSource(
        GPUTextureSourceView source,
        GPUTextureCopyView destination,
        GPUExtent3D copySize);
};
```
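A per-frame use of this entry point could look like the following sketch (this is the proposed API above, not a shipped one; `queue`, `video`, and a pre-allocated `videoTexture` are assumed to exist):

```js
function frame() {
    // Copy the current video frame into a pre-allocated GPUTexture
    // using the proposed copyFromTextureSource (hypothetical API).
    queue.copyFromTextureSource(
        { source: video },                           // GPUTextureSourceView
        { texture: videoTexture },                   // GPUTextureCopyView
        [video.videoWidth, video.videoHeight, 1]);   // copySize
    // ... encode and submit work that samples videoTexture ...
    requestAnimationFrame(frame);
}
```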
Importing textures instead of copying them
Another problem with `copyImageBitmapToTexture` is that it always forces at least one copy, as the name implies. Often developers want to have all of the texture data from the texture source, for example when compositing a canvas / video, or when using `HTMLImageElement` to decode image assets. In the initial discussions about `copyImageBitmapToTexture` we agreed to come back and find a solution that could potentially avoid that copy.
The biggest problem with importing a `GPUTexture` directly from the texture source is that, depending on the environment (browser, OS, but also video or image file encoding), the optimal `GPUTextureDescriptor` to use isn't obvious. For example an image or a video can be using 8 bits per pixel, or more. Maybe requesting the `OUTPUT_ATTACHMENT` usage is free, but maybe it will require a copy because the texture source wasn't previously allocated with that usage, etc.
Here's a proposal for what a `GPUDevice.importTexture` could look like:
```webidl
// Options that can be used when importing a texture. It is separated
// from GPUTextureImportDescriptor so that it can be returned by
// GPUDevice.getOptimalImportOptions.
dictionary GPUTextureImportOptions {
    // SAMPLED and COPY_SRC are always guaranteed.
    GPUTextureUsage usage;
    GPUTextureFormat format;
    // Other things like Y-flip, un/premultiply alpha.
};

dictionary GPUTextureImportDescriptor : GPUTextureImportOptions {
    required GPUTextureSource source;
    GPUExtent3D size; // Can always be known for all GPUTextureSource types.
};

partial interface GPUDevice {
    // Might need to be asynchronous?
    GPUTextureImportOptions getOptimalImportOptions(GPUTextureSource source);
    // Will work even when `desc` doesn't match the result of
    // `getOptimalImportOptions` by performing implicit conversions etc.
    GPUTexture importTexture(GPUTextureImportDescriptor desc);
};
```
Using it to import canvas data during a frame of a WebGPU application could look like this:
```js
function frame() {
    drawUI(canvas);

    const texture = device.importTexture({
        source: canvas,
        size: [canvas.width, canvas.height],
        usage: GPUTextureUsage.SAMPLED,
        format: 'rgba8unorm',
    });
    compositeUI(texture);
    texture.destroy();
}
```
If the developer wants to ensure a more optimal path portably they could do the following:
```js
function frame() {
    drawUI(canvas);

    const bestOptions = device.getOptimalImportOptions(canvas);
    if (!canSupportCanvasFormat()) {
        bestOptions.format = 'rgba8unorm';
    }
    assert(bestOptions.usage & GPUTextureUsage.SAMPLED);

    const texture = device.importTexture({
        source: canvas,
        size: [canvas.width, canvas.height],
        usage: GPUTextureUsage.SAMPLED,
        format: bestOptions.format,
    });
    compositeUI(texture);
    texture.destroy();
}
```
There are many open questions:
- What happens for mutable images once they are imported (like `HTMLCanvasElement`)? Are they torn from their canvas and replaced with an empty canvas image, or are they tagged for copy-on-write?
- Video decoders can have a fixed number of output buffers; what happens if WebGPU has references on all of them? Is the video decoder blocked from making progress? Maybe WebGPU is only allowed a single frame and the previous frame gets detached (like `GPUTexture.destroy()`) when the next one is queried?
- Can we guarantee textures will have `textureComponentType: 'float'`, or should we plan for other types of textures to come through, like depth textures?
- How long are the results of `getOptimalImportOptions` valid? Can they stay valid forever for a canvas as long as you don't touch it? What about a video that might change codecs/encodings in the middle?
Hints for image producers
Another problem discussed above is that texture sources can produce data that can't efficiently be used by WebGPU. For copies that happen once, like `HTMLImageElement`, an extra copy to make the data visible to WebGPU might be okay, but for copies happening every frame, like for `HTMLCanvasElement` or `HTMLVideoElement`, there is a large performance and power-consumption cost.
There should be a mechanism, either visible to developers or as an implementation detail, that helps texture sources produce data directly visible to WebGPU. Several potential solutions have been discussed in the past (a sketch of the first two follows the list):
- Adding a `forGPUDevice` attribute in the `ImageBitmap` descriptor that hints it will be used for a specific `GPUDevice`.
- Similarly, adding a property on `HTMLCanvasElement`, `HTMLVideoElement`, and other sources to tell them they will be used on a `GPUDevice`.
- Having a feedback mechanism in browsers that tells a texture source to change how it produces textures after it has been used to import on a `GPUDevice` multiple times.
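As a sketch, the first two options might surface to developers roughly like this (the `forGPUDevice` option and `preferredGPUDevice` property are hypothetical names, not part of any spec):

```js
// Option 1: hint at ImageBitmap creation time (hypothetical option).
const bitmap = await createImageBitmap(video, { forGPUDevice: device });

// Option 2: hint directly on the source element (hypothetical property).
canvas.preferredGPUDevice = device;
video.preferredGPUDevice = device;
```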
Future: exporting WebGPU textures?
This proposal only discusses how to efficiently import textures into WebGPU, but in the future we could imagine that some API will want to import WebGPU textures efficiently. How would that work?
An idea could be to add an `exportable` boolean to `GPUTextureDescriptor` that adds extra restrictions but allows calling a `GPUTexture.export()` method returning an `ImageBitmap`. From the point of view of the WebGPU API this is the same as `GPUTexture.destroy()`, but it produces an `ImageBitmap` with the content of that texture.
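A sketch of what that could look like, assuming the hypothetical `exportable` flag and `export()` method:

```js
// Create a texture that can later be exported (hypothetical flag).
const texture = device.createTexture({
    size: [width, height, 1],
    format: 'rgba8unorm',
    usage: GPUTextureUsage.OUTPUT_ATTACHMENT,
    exportable: true,                 // hypothetical
});
renderScene(texture);

// export() behaves like destroy() for WebGPU but hands the texture's
// content to the rest of the platform as an ImageBitmap (hypothetical).
const bitmap = texture.export();
context2d.drawImage(bitmap, 0, 0);
```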