"Either" attachment load and store ops #97

@Kangz

Context

Every API with render targets has at least the following load ops:

  • Load, where the initial value of each pixel in the render pass is loaded from memory
  • Clear, where each pixel is set to a constant value
  • Don't care, where pixels can start with any value because the app will overwrite them anyway

Likewise every API has at least the following store ops:

  • Store, where the value is written to memory
  • Discard, where the stored value is undefined (and can just not be written)
  • Resolve, for multisample attachments (although it is part of the subpass description in Vulkan)

"Clear", "Don't care" and "Discard" are extremely important on tiler GPUs because they allow skipping memory operations at the beginning and end of render passes, which saves a lot of power on mobile platforms.

(side note: Vulkan resolve attachments are part of pipeline compatibility, this will be mildly annoying)

Proposal 1

"Don't care" yields an undefined value, which we don't want, so WebGPU could expose only "Clear", which on mobile GPUs usually happens inside the tile memory and is super cheap.

"Discard" yields an undefined value too, but "Store" is really expensive. Instead we could have a "StoreZero" which acts as if it stores zeroes in memory, but actually discards and lazily clears the texture the next time it is used. If the next use of the texture is with loadOp "Clear" with zeroes, then that counts as the lazy clear and is cheap!

So the way to make transient attachments efficient would be to loadOp "Clear" with zeroes and "StoreZero" at the end.
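The lazy clear behind "StoreZero" can be sketched as a small piece of per-texture state tracking. This is only an illustration under stated assumptions: `TrackedTexture`, `pendingLazyZero` and `memoryClears` are hypothetical names, not part of any WebGPU API, and the model only covers the two load ops and two store ops of this proposal.

```typescript
// Hypothetical sketch; none of these names exist in WebGPU.
type LoadOp = "load" | "clearZero";
type StoreOp = "store" | "storeZero";

class TrackedTexture {
  // True when the texture logically holds zeroes that were never written to memory.
  pendingLazyZero = false;
  // Counts full-texture clears that actually had to touch memory.
  memoryClears = 0;

  beginPass(load: LoadOp): void {
    if (this.pendingLazyZero && load === "load") {
      // Memory is stale, so the deferred clear must be materialized before loading.
      this.memoryClears += 1;
    }
    // With "clearZero" the tile-memory clear doubles as the lazy clear, at no memory cost.
    this.pendingLazyZero = false;
  }

  endPass(store: StoreOp): void {
    // "storeZero" discards the tile contents and only records that zeroes are owed.
    this.pendingLazyZero = store === "storeZero";
  }
}
```

In this model a transient attachment that always uses "clearZero" and "storeZero" never triggers a memory clear; only a later "load" forces one.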

The issue with proposal 1

Proposal 1 would work great on tilers that can clear in the tile cache, but what about immediate-mode GPUs or tiler GPUs without an explicit tile cache? For these, "Clear" with zeroes and "StoreZero" could cause the following to happen:

  • The texture is cleared to zero outside the render pass
  • The render pass starts, loading pixel values (no tile-cache clearing)
  • The render pass ends, storing pixel values (no discard)

This would be even worse than "Load" and "Store" because an additional clear happens.
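To make the comparison concrete, here is a deliberately crude cost model that counts full-attachment memory operations per render pass. It is a sketch that only encodes the bullets above: the `Hardware` categories, the function name and the unit costs are assumptions for illustration, not measurements of any real GPU.

```typescript
// Hypothetical cost model; names and unit costs are illustrative assumptions.
type Hardware = "tilerWithTileCache" | "noTileCache";
type LoadOp = "load" | "clearZero";
type StoreOp = "store" | "storeZero";

function fullAttachmentMemoryOps(hw: Hardware, load: LoadOp, store: StoreOp): number {
  if (hw === "tilerWithTileCache") {
    // Clearing happens in tile memory and "storeZero" discards, so both are free here.
    return (load === "load" ? 1 : 0) + (store === "store" ? 1 : 0);
  }
  // Without a tile cache the pass always loads and stores pixel values.
  // The store op does not change the count: the zeroes owed by "storeZero"
  // are paid for by the clear charged to the next "clearZero".
  let ops = 2;
  if (load === "clearZero") {
    // The clear runs as a separate operation outside the render pass.
    ops += 1;
  }
  return ops;
}
```

Under these assumptions, "clearZero" with "storeZero" costs 0 operations on a tiler with a tile cache but 3 on hardware without one, compared with 2 for plain "load" and "store".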

Proposal 2

Have the following load ops:

  • Load
  • Clear
  • LoadOrClearZero, where either all pixel values start at zero or all pixel values start at the previously stored value

Have the following store ops:

  • Store
  • StoreZero
  • StoreOrStoreZero, where either all pixels get zero stored (lazily) or all pixels get their final value stored

Using LoadOrClearZero and StoreOrStoreZero would allow the WebGPU implementation to choose the optimal load and store ops for the underlying hardware without exposing that hardware, at the cost of a few observable differences for the application.
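One way an implementation might resolve the "either" ops is sketched below. The op names come from this proposal; the `hasTileCache` flag and the two resolver functions are hypothetical, and a real implementation could use a different heuristic.

```typescript
// Hypothetical resolver; only the op names come from the proposal.
type LoadOp = "Load" | "Clear" | "LoadOrClearZero";
type StoreOp = "Store" | "StoreZero" | "StoreOrStoreZero";

function resolveLoadOp(op: LoadOp, hasTileCache: boolean): "Load" | "Clear" {
  if (op === "LoadOrClearZero") {
    // A tile-cache clear (to zero) is cheap; otherwise avoid the extra clear and just load.
    return hasTileCache ? "Clear" : "Load";
  }
  return op;
}

function resolveStoreOp(op: StoreOp, hasTileCache: boolean): "Store" | "StoreZero" {
  if (op === "StoreOrStoreZero") {
    // Discarding with a lazy zero clear is cheap on tilers; otherwise a plain store is cheaper.
    return hasTileCache ? "StoreZero" : "Store";
  }
  return op;
}
```

The application sees either zeroes or the previously stored values, depending on the hardware, which is the behavioral difference the proposal accepts.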
