Copies investigation (+ proposals) #69

@Kangz

Copies between buffers

This one is fairly easy.

D3D12

Buffer copies go through the CopyBufferRegion call, which doesn't seem to have any particular alignment constraints.

// On ID3D12GraphicsCommandList
void CopyBufferRegion(
  ID3D12Resource *pDstBuffer,
  UINT64         DstOffset,
  ID3D12Resource *pSrcBuffer,
  UINT64         SrcOffset,
  UINT64         NumBytes
);

Metal

// On MTLBlitCommandEncoder
func copy(
    from: MTLBuffer,
    sourceOffset: Int,
    to: MTLBuffer,
    destinationOffset: Int,
    size: Int)

On macOS, the offsets and the size must be aligned to 4 bytes.

Vulkan

void vkCmdCopyBuffer(
    VkCommandBuffer                             commandBuffer,
    VkBuffer                                    srcBuffer,
    VkBuffer                                    dstBuffer,
    uint32_t                                    regionCount,
    const VkBufferCopy*                         pRegions);

typedef struct VkBufferCopy {
    VkDeviceSize    srcOffset;
    VkDeviceSize    dstOffset;
    VkDeviceSize    size;
} VkBufferCopy;

The spec doesn't have alignment constraints. VkPhysicalDeviceLimits::optimalBufferCopyOffsetAlignment doesn't apply to vkCmdCopyBuffer (it concerns copies between buffers and images).

Proposal

Because of Metal, we need a 4-byte alignment constraint on the offsets and the size. Otherwise all APIs are the same, except that Vulkan can take multiple copies in one call. What happens if multiple copies write to the same location is unclear in Vulkan, so we should probably have single copies in WebGPU. The sketch.webidl looks like this, which seems OK (except that the u32 might need to be replaced with a WebGPUDeviceSize):

    void copyBufferToBuffer(WebGPUBuffer src,
                            u32 srcOffset,
                            WebGPUBuffer dst,
                            u32 dstOffset,
                            u32 size);

Validation is that:

  • src must have the transferSrc usage and dst must have the transferDst usage
  • The source and destination regions are in bounds of the resources
  • The source and destination regions don't overlap

These validation rules are also required for all other copies below.
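As a rough illustration, the checks above plus the 4-byte alignment from Metal could be sketched as follows. The `BufferInfo` type, the usage bit values and the function name are hypothetical stand-ins for this sketch, not part of any proposed API.

```typescript
// Hypothetical usage bits; the values are made up for this sketch.
const TRANSFER_SRC = 1 << 0;
const TRANSFER_DST = 1 << 1;

// Minimal stand-in for WebGPUBuffer: an identity, a byte size and usage bits.
interface BufferInfo {
  id: number;
  size: number;
  usage: number;
}

// Returns an error string, or null if the copy passes every check.
function validateCopyBufferToBuffer(
  src: BufferInfo, srcOffset: number,
  dst: BufferInfo, dstOffset: number,
  size: number,
): string | null {
  if ((src.usage & TRANSFER_SRC) === 0) return "src lacks transferSrc usage";
  if ((dst.usage & TRANSFER_DST) === 0) return "dst lacks transferDst usage";
  // Metal on macOS needs 4-byte alignment for both offsets and the size.
  if (srcOffset % 4 !== 0 || dstOffset % 4 !== 0 || size % 4 !== 0) {
    return "offsets and size must be multiples of 4";
  }
  if (srcOffset + size > src.size) return "source region out of bounds";
  if (dstOffset + size > dst.size) return "destination region out of bounds";
  // Regions can only overlap when both refer to the same buffer.
  if (src.id === dst.id &&
      srcOffset < dstOffset + size && dstOffset < srcOffset + size) {
    return "source and destination regions overlap";
  }
  return null;
}
```

Copying between two halves of the same buffer passes, while a copy whose regions intersect within one buffer is rejected.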

Copies between buffers and textures

This section covers copies from buffers to textures and assumes that copies from textures to buffers are symmetric.

D3D12

// This is a "view" of a resource as a texture copy location.
//  - Textures use SubresourceIndex to select a slice.
//  - Buffers use PlacedFootprint to declare how to interpret their content as a linear texture.
typedef struct D3D12_TEXTURE_COPY_LOCATION {
  ID3D12Resource          *pResource;
  D3D12_TEXTURE_COPY_TYPE Type;
  union {
    D3D12_PLACED_SUBRESOURCE_FOOTPRINT PlacedFootprint;
    UINT                               SubresourceIndex;
  };
};

typedef struct D3D12_PLACED_SUBRESOURCE_FOOTPRINT {
  UINT64                      Offset;
  D3D12_SUBRESOURCE_FOOTPRINT Footprint;
};

typedef struct D3D12_SUBRESOURCE_FOOTPRINT {
  DXGI_FORMAT Format;
  UINT        Width;
  UINT        Height;
  UINT        Depth;
  UINT        RowPitch;
};

// On ID3D12GraphicsCommandList
// Does a copy between two textures, some of which can be buffer memory viewed as a linear texture.
void CopyTextureRegion(
  const D3D12_TEXTURE_COPY_LOCATION *pDst,
  UINT                              DstX,
  UINT                              DstY,
  UINT                              DstZ,
  const D3D12_TEXTURE_COPY_LOCATION *pSrc,
  const D3D12_BOX                   *pSrcBox
);

RowPitch must be aligned to 256 (and must be large enough to contain a full row), and our experimentation shows that Offset must be aligned to 256 too. The formats of the two "textures" must be compatible (same number of channels, same size). Depth-stencil and multisample resources must have whole subresources copied.
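To make the RowPitch rule concrete, here is a small sketch of how a caller might pick the smallest valid pitch for an uncompressed format. The helper names and the constant name are made up for this example, and it assumes a fixed number of bytes per pixel.

```typescript
// Row pitch alignment quoted in the text above (name chosen for this sketch).
const ROW_PITCH_ALIGNMENT = 256;

// Rounds value up to the next multiple of alignment (alignment must be > 0).
function alignUp(value: number, alignment: number): number {
  return Math.ceil(value / alignment) * alignment;
}

// Smallest RowPitch that holds a full row and is a multiple of 256.
function d3d12RowPitch(width: number, bytesPerPixel: number): number {
  return alignUp(width * bytesPerPixel, ROW_PITCH_ALIGNMENT);
}
```

For instance, a 100-pixel-wide row of 4-byte pixels occupies 400 bytes, so the pitch has to be padded to 512.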

Metal

// On MTLBlitCommandEncoder
func copy(
    from: MTLBuffer,
    sourceOffset: Int,
    sourceBytesPerRow: Int,
    sourceBytesPerImage: Int,
    sourceSize: MTLSize,
    to: MTLTexture,
    destinationSlice: Int,
    destinationLevel: Int,
    destinationOrigin: MTLOrigin)

Constraints are that sourceOffset, sourceBytesPerRow and sourceBytesPerImage must be aligned to the pixel format size in bytes. sourceBytesPerRow must also be less than 32767 times the pixel size in bytes (wat).

There is a second version of the function with an extra argument to copy just the stencil, or just the depth of a packed depth-stencil format.

Vulkan

typedef struct VkImageSubresourceLayers {
    VkImageAspectFlags    aspectMask;
    uint32_t              mipLevel;
    uint32_t              baseArrayLayer;
    uint32_t              layerCount;
} VkImageSubresourceLayers;

typedef struct VkBufferImageCopy {
    VkDeviceSize                bufferOffset;
    uint32_t                    bufferRowLength;
    uint32_t                    bufferImageHeight;
    VkImageSubresourceLayers    imageSubresource;
    VkOffset3D                  imageOffset;
    VkExtent3D                  imageExtent;
} VkBufferImageCopy;

void vkCmdCopyBufferToImage(
    VkCommandBuffer                             commandBuffer,
    VkBuffer                                    srcBuffer,
    VkImage                                     dstImage,
    VkImageLayout                               dstImageLayout,
    uint32_t                                    regionCount,
    const VkBufferImageCopy*                    pRegions);

The only differences from Metal are that multiple copies can be done at the same time and that multiple array slices can be covered by one copy. It also supports multi-planar formats, unlike the other APIs. Multisampled textures cannot be copied.

On transfer-only queues (no-compute, no-render), there is a minImageTransferGranularity requirement that can be arbitrary.

bufferOffset must be aligned to 4. For compressed formats, width and height must be multiples of the block size. aspectMask must contain a single bit (no combined depth-stencil copies).
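One detail that matters when mapping a byte-based row pitch onto Vulkan is that bufferRowLength is expressed in texels, not bytes, and must be either 0 or at least imageExtent.width. A minimal conversion sketch for uncompressed formats, with a hypothetical helper name:

```typescript
// Converts a byte row pitch into a value for VkBufferImageCopy::bufferRowLength,
// which is measured in texels. Returns null when the pitch is not a whole number
// of texels or is too short to hold one row of the copy.
// Assumes an uncompressed format with a fixed bytesPerTexel.
function rowPitchToBufferRowLength(
  rowPitchBytes: number,
  bytesPerTexel: number,
  copyWidth: number,
): number | null {
  if (rowPitchBytes % bytesPerTexel !== 0) return null;
  const texels = rowPitchBytes / bytesPerTexel;
  return texels >= copyWidth ? texels : null;
}
```

A 512-byte pitch with 4-byte texels maps to a bufferRowLength of 128, which is enough for a 100-texel-wide copy.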

Copies between textures

D3D12

Same as buffer to texture.

Metal

// On MTLBlitCommandEncoder
func copy(
    from: MTLTexture,
    sourceSlice: Int,
    sourceLevel: Int,
    sourceOrigin: MTLOrigin,
    sourceSize: MTLSize,
    to: MTLTexture,
    destinationSlice: Int,
    destinationLevel: Int,
    destinationOrigin: MTLOrigin)

Vulkan

typedef struct VkImageCopy {
    VkImageSubresourceLayers    srcSubresource;
    VkOffset3D                  srcOffset;
    VkImageSubresourceLayers    dstSubresource;
    VkOffset3D                  dstOffset;
    VkExtent3D                  extent;
} VkImageCopy;

void vkCmdCopyImage(
    VkCommandBuffer                             commandBuffer,
    VkImage                                     srcImage,
    VkImageLayout                               srcImageLayout,
    VkImage                                     dstImage,
    VkImageLayout                               dstImageLayout,
    uint32_t                                    regionCount,
    const VkImageCopy*                          pRegions);

Proposal for copies involving textures

We can see that all APIs do copies between two views of GPU memory as textures, each view being either a texture or a buffer viewed as a linear texture. This means we could phrase WebGPU this way too:

dictionary WebGPUOrigin3D {
    u32 x;
    u32 y;
    u32 z;
};

dictionary WebGPUExtent3D {
    u32 width;
    u32 height;
    u32 depth;
};

dictionary WebGPUBufferCopyView {
    WebGPUBuffer buffer;
    u32 offset;
    u32 rowPitch;
    u32 imageHeight;
};

dictionary WebGPUTextureCopyView {
    WebGPUTexture texture;
    u32 level;
    u32 slice;
    WebGPUOrigin3D origin;
    WebGPUTextureAspect aspect;
};

partial interface WebGPUCopyCommandRecordingThingy {
    void copyBufferToTexture(
        WebGPUBufferCopyView source,
        WebGPUTextureCopyView destination,
        WebGPUExtent3D copySize);

    void copyTextureToBuffer(
        WebGPUTextureCopyView source,
        WebGPUBufferCopyView destination,
        WebGPUExtent3D copySize);

    void copyTextureToTexture(
        WebGPUTextureCopyView source,
        WebGPUTextureCopyView destination,
        WebGPUExtent3D copySize);
};

Validation rules would include things like alignment for compressed formats, alignment of the buffer copy view members (to the pixel format size, and to 256 for rowPitch because of D3D12), aspects matching where needed, etc.
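A rough sketch of what the buffer copy view part of that validation might look like for uncompressed formats is below. It assumes imageHeight counts rows (mirroring Vulkan's bufferImageHeight) and replaces the buffer object with a plain bufferSize field; both choices, and all names, are assumptions for illustration only.

```typescript
interface Extent3D { width: number; height: number; depth: number; }

// Stand-in for WebGPUBufferCopyView, with the buffer reduced to its byte size.
interface BufferCopyViewInfo {
  bufferSize: number;
  offset: number;
  rowPitch: number;     // bytes between the starts of consecutive rows
  imageHeight: number;  // rows between consecutive depth slices (assumed)
}

// Bytes touched from view.offset onwards: whole images for all but the last
// slice, whole rows for all but the last row, then one tightly packed row.
function requiredBytes(view: BufferCopyViewInfo, size: Extent3D, bytesPerPixel: number): number {
  if (size.width === 0 || size.height === 0 || size.depth === 0) return 0;
  const bytesPerImage = view.rowPitch * view.imageHeight;
  return bytesPerImage * (size.depth - 1)
       + view.rowPitch * (size.height - 1)
       + size.width * bytesPerPixel;
}

// Returns an error string, or null if the view is usable for this copy.
function validateBufferCopyView(view: BufferCopyViewInfo, size: Extent3D, bytesPerPixel: number): string | null {
  if (view.offset % bytesPerPixel !== 0) return "offset must be a multiple of the pixel size";
  if (view.rowPitch % 256 !== 0) return "rowPitch must be a multiple of 256";
  if (view.rowPitch < size.width * bytesPerPixel) return "rowPitch too small for one row";
  if (view.imageHeight < size.height) return "imageHeight smaller than the copy height";
  if (view.offset + requiredBytes(view, size, bytesPerPixel) > view.bufferSize) {
    return "copy exceeds the buffer bounds";
  }
  return null;
}
```

Note that the last row only needs width times the pixel size, so a buffer does not have to be padded out to a full rowPitch at its end.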
