This is based on #700, additional insights that @shaoboyan provided offline, and conversations with many other participants in the group (@jdashg, @kenrussell, @kvark, @austinEng, etc.). It is also related to #625.
Rationale
It is important that WebGPU is able to interoperate nicely with the rest of the Web platform, and in many cases that means sharing image data with other Web platform objects. Copying image data is expensive, especially if done every frame, so we should have an efficient way to get image data in (and maybe out) of WebGPU. There are many use cases:
- Compositing a video stream from an `HTMLVideoElement` (or WebRTC?) with additional effects using WebGPU.
- Running compute-shader based machine learning pipelines on a video stream.
- Rendering the UI of your WebGPU application using 2D canvas and compositing it with the rest of the 3D scene.
- Importing data from `HTMLImageElement` to load the textures for your application (although that's not done every frame).
- Interoperating between WebGL and WebGPU inside the same application by sharing images (via `HTMLCanvasElement` or a new mechanism to be invented).
A long time ago the group agreed that a path forward could be `GPUQueue.copyImageBitmapToTexture`, which is like a `copyTextureToTexture` except that one of the textures is actually an `ImageBitmap`. Part of the idea was that `ImageBitmap` would eventually represent resources that are most probably resident on the GPU. This isn't really the case in Firefox, which doesn't have any `ImageBitmap` backed by GPU memory at this time, nor in Chromium, where a lot of fallbacks happen that cause the `ImageBitmap` to be CPU-backed (and when on the GPU it might be on different devices or on different APIs). I don't know how `ImageBitmaps` work in Safari.
So if `ImageBitmaps` were always backed by GPU memory that can be used for an efficient texture-to-texture copy to a `GPUTexture` (on its specific GPU device in multi-GPU systems), and if producing `ImageBitmaps` didn't involve copies (because the source was already on the GPU, so it would just take a reference), then `copyImageBitmapToTexture` would always be a one-copy path, which can sometimes copy through a shader to take care of y-flip or un/premultiplying the alpha channel.
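For illustration, this is roughly what that path looks like per frame from JavaScript (a sketch only; `queue` is assumed to be the device's `GPUQueue` and the dictionary shapes are approximate):

```js
// Sketch: bring a video frame into WebGPU through the ImageBitmap path.
// (Inside an async function.)
const bitmap = await createImageBitmap(video);
const texture = device.createTexture({
    size: [bitmap.width, bitmap.height, 1],
    format: 'rgba8unorm',
    usage: GPUTextureUsage.COPY_DST | GPUTextureUsage.SAMPLED,
});
queue.copyImageBitmapToTexture(
    { imageBitmap: bitmap },                   // source view
    { texture },                               // destination view
    [bitmap.width, bitmap.height, 1]);         // copy size
```

Every step in this chain that can't reuse GPU memory (creating the `ImageBitmap`, then the copy itself) potentially adds a full-frame copy.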
@shaoboyan spent a lot of time studying how to implement an efficient `copyImageBitmapToTexture` and ran into many blockers:
- `ImageBitmap` is supposed to take a snapshot of its source "for free" but this isn't always the case, and it can actually be very difficult for things like video decoder outputs.
- `ImageBitmap` might not be allocated in a way that makes it possible for the WebGPU implementation to copy from it directly (e.g. on the CPU, on a different GPU, or on the same GPU with an incompatible driver).
- The content of the `ImageBitmap` is immutable but it can be hard to ensure it is readable by multiple users of it at once (for example the DOM and `copyImageBitmapToTexture`).
Allowing more than `ImageBitmap`
WebGL can do an operation similar to `copyImageBitmapToTexture` by passing various types of objects to the `gl.texImage2D` family of operations. It uses the concept of a `TexImageSource` that's defined like this:
```webidl
typedef (ImageBitmap or
         ImageData or
         HTMLImageElement or
         HTMLCanvasElement or
         HTMLVideoElement or
         OffscreenCanvas) TexImageSource;
```
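For comparison, a WebGL application typically re-uploads such a source every frame like this (standard WebGL calls; `gl`, `texture`, and `video` are assumed to already be set up):

```js
// Upload the current HTMLVideoElement frame into a WebGL texture.
gl.bindTexture(gl.TEXTURE_2D, texture);
gl.pixelStorei(gl.UNPACK_FLIP_Y_WEBGL, true);
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);
```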
As discussed above, forcing data to go through `ImageBitmap` can often add an extra copy, so we could add direct copies from other texture sources. WebGPU likely doesn't need to support the full breadth of these types (for example `ImageBitmap`, `ImageData`, and `HTMLImageElement` could just be `ImageData`), but it seems important to have direct support for objects producing data every frame. This means that the prototype for `copyImageBitmapToTexture` becomes:
```webidl
typedef (ImageBitmap or
         HTMLCanvasElement or
         HTMLVideoElement or
         OffscreenCanvas) GPUTextureSource;

dictionary GPUTextureSourceView {
    required GPUTextureSource source;
    GPUOrigin2D origin = {};
    // Other arguments like Y-flip, un/premultiply alpha?
};

partial interface GPUQueue {
    void copyFromTextureSource(
        GPUTextureSourceView source,
        GPUTextureCopyView destination,
        GPUExtent3D copySize);
};
```
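A per-frame use of this entry point could look like the following sketch (this is the proposed API above, not a shipped one; `queue`, `video`, and a pre-allocated `videoTexture` are assumed to exist):

```js
function frame() {
    // Copy the current video frame into a pre-allocated GPUTexture
    // using the proposed copyFromTextureSource (hypothetical API).
    queue.copyFromTextureSource(
        { source: video },                           // GPUTextureSourceView
        { texture: videoTexture },                   // GPUTextureCopyView
        [video.videoWidth, video.videoHeight, 1]);   // copySize
    // ... encode and submit work that samples videoTexture ...
    requestAnimationFrame(frame);
}
```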
Importing textures instead of copying them
Another problem with `copyImageBitmapToTexture` is that it always forces at least one copy, as the name implies. Often developers want to have all of the texture data from the texture source, for example when compositing a canvas / video, or when using `HTMLImageElement` to decode image assets. In the initial discussions about `copyImageBitmapToTexture` we agreed to come back and find a solution that could potentially avoid that copy.
The biggest problem with importing a `GPUTexture` directly from the texture source is that, depending on the environment (browser, OS, but also video or image file encoding), the optimal `GPUTextureDescriptor` to use isn't obvious. For example an image or a video can be using 8 bits per pixel, or more. Maybe requesting the `OUTPUT_ATTACHMENT` usage is free, but maybe it will require a copy because the texture source wasn't previously allocated with that usage, etc.
Here's a proposal for what a `GPUDevice.importTexture` could look like:
```webidl
// Options that can be used when importing a texture. It is separated
// from GPUTextureImportDescriptor so that it can be returned by
// GPUDevice.getOptimalImportOptions.
dictionary GPUTextureImportOptions {
    // SAMPLED and COPY_SRC are always guaranteed.
    GPUTextureUsage usage;
    GPUTextureFormat format;
    // Other things like Y-flip, un/premultiply alpha.
};

dictionary GPUTextureImportDescriptor : GPUTextureImportOptions {
    required GPUTextureSource source;
    GPUExtent3D size; // Can always be known for all GPUTextureSource types.
};

partial interface GPUDevice {
    // Might need to be asynchronous?
    GPUTextureImportOptions getOptimalImportOptions(GPUTextureSource source);
    // Will work even when `desc` doesn't match the result of
    // `getOptimalImportOptions` by performing implicit conversions etc.
    GPUTexture importTexture(GPUTextureImportDescriptor desc);
};
```
Using it to import canvas data during a frame of a WebGPU application could look like this:
```js
function frame() {
    drawUI(canvas);

    const texture = device.importTexture({
        source: canvas,
        size: [canvas.width, canvas.height],
        usage: GPUTextureUsage.SAMPLED,
        format: 'rgba8unorm',
    });
    compositeUI(texture);
    texture.destroy();
}
```
If the developer wants to ensure a more optimal path portably they could do the following:
```js
function frame() {
    drawUI(canvas);

    const bestOptions = device.getOptimalImportOptions(canvas);
    if (!canSupportCanvasFormat()) {
        bestOptions.format = 'rgba8unorm';
    }
    assert(bestOptions.usage & GPUTextureUsage.SAMPLED);

    const texture = device.importTexture({
        source: canvas,
        size: [canvas.width, canvas.height],
        usage: GPUTextureUsage.SAMPLED,
        format: bestOptions.format,
    });
    compositeUI(texture);
    texture.destroy();
}
```
There are many open questions:
- What happens for mutable images once they are imported (like `HTMLCanvasElement`)? Are they torn from their canvas and replaced with an empty canvas image, or are they tagged for copy-on-write?
- Video decoders can have a fixed number of output buffers; what happens if WebGPU has references on all of them? Is the video decoder blocked from making progress? Maybe WebGPU is only allowed a single frame and the previous frame gets detached (like `GPUTexture.destroy()`) when the next one is queried?
- Can we guarantee textures will have `textureComponentType: 'float'`, or should we plan for other types of textures to come through, like depth textures?
- How long are the results of `getOptimalImportOptions` valid? Can they stay valid forever for a canvas as long as you don't touch it? What about a video that might change codecs/encodings in the middle?
Hints for image producers
Another problem discussed above is that texture sources can produce data that can't efficiently be used by WebGPU. For copies that happen once, like `HTMLImageElement`, an extra copy to make the data visible to WebGPU might be okay, but for copies happening every frame, like for `HTMLCanvasElement` or `HTMLVideoElement`, there is a large performance and power-consumption cost.
There should be a mechanism, either visible to developers or as an implementation detail, that helps texture sources produce data directly visible to WebGPU. Several potential solutions have been discussed in the past (a sketch of the first two follows the list):
- Adding a `forGPUDevice` attribute in the `ImageBitmap` descriptor that hints it will be used for a specific `GPUDevice`.
- Similarly, adding a property on `HTMLCanvasElement`, `HTMLVideoElement`, and other sources to tell them they will be used on a `GPUDevice`.
- Having a feedback mechanism in browsers that tells a texture source to change how it produces textures after it has been used to import on a `GPUDevice` multiple times.
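As a sketch, the first two options might surface to developers roughly like this (the `forGPUDevice` option and `preferredGPUDevice` property are hypothetical names, not part of any spec):

```js
// Option 1: hint at ImageBitmap creation time (hypothetical option).
const bitmap = await createImageBitmap(video, { forGPUDevice: device });

// Option 2: hint directly on the source element (hypothetical property).
canvas.preferredGPUDevice = device;
video.preferredGPUDevice = device;
```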
Future: exporting WebGPU textures?
This proposal only discusses how to efficiently import textures into WebGPU, but in the future we could imagine that some API will want to import WebGPU textures efficiently. How would that work?
An idea could be to add an `exportable` boolean to `GPUTextureDescriptor` that adds extra restrictions but allows calling a `GPUTexture.export()` method returning an `ImageBitmap`. From the point of view of the WebGPU API this is the same as `GPUTexture.destroy()`, but it produces an `ImageBitmap` with the content of that texture.
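A sketch of what that could look like, assuming the hypothetical `exportable` flag and `export()` method:

```js
// Create a texture that can later be exported (hypothetical flag).
const texture = device.createTexture({
    size: [width, height, 1],
    format: 'rgba8unorm',
    usage: GPUTextureUsage.OUTPUT_ATTACHMENT,
    exportable: true,                 // hypothetical
});
renderScene(texture);

// export() behaves like destroy() for WebGPU but hands the texture's
// content to the rest of the platform as an ImageBitmap (hypothetical).
const bitmap = texture.export();
context2d.drawImage(bitmap, 0, 0);
```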