Picking a good value for threads-per-threadgroup #275

@litherum

Different hardware running the same algorithm may need dramatically different threads-per-threadgroup values to get good performance. A small device may need a small value so the register file doesn't spill to main memory, but a big device may need a large value to take advantage of all its multiprocessing lanes. However, an author has no way of knowing which value is a good one for the device the code ends up running on.

This problem is not specific to WebGPU, but we do have a goal of portable performance.

It's doubly bad because the current design of WebGPU bakes these values into the shader as literals. Therefore, even if the application could divine good values at runtime, it would have to rewrite its shader to use them.
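
To make that cost concrete, here is a minimal sketch of what such rewriting might look like on the application side. Everything in it is an assumption for illustration: the `{{THREADS}}` placeholder, the `specializeShader` and `threadgroupCount` helpers, and the shader text itself are hypothetical and not part of any WebGPU proposal. The shader is treated as an opaque string.

```typescript
// Hypothetical placeholder token the application puts in its shader template.
const THREADS_PLACEHOLDER = "{{THREADS}}";

// Returns a copy of the template with every placeholder replaced by a literal.
function specializeShader(template: string, threadsPerThreadgroup: number): string {
  if (!Number.isInteger(threadsPerThreadgroup) || threadsPerThreadgroup <= 0) {
    throw new RangeError("threadsPerThreadgroup must be a positive integer");
  }
  if (template.indexOf(THREADS_PLACEHOLDER) === -1) {
    throw new Error("template has no threads placeholder");
  }
  return template.split(THREADS_PLACEHOLDER).join(String(threadsPerThreadgroup));
}

// Number of threadgroups needed to cover `elementCount` items (rounds up).
function threadgroupCount(elementCount: number, threadsPerThreadgroup: number): number {
  return Math.ceil(elementCount / threadsPerThreadgroup);
}

// Illustrative shader text only; not a real shading language.
const shaderTemplate =
  "threads_per_threadgroup({{THREADS}}) kernel void main() { /* ... */ }";
const specializedSource = specializeShader(shaderTemplate, 64);
```

Note that the dispatch size has to change together with the literal, so the value ends up living in two places: the shader source and the host code that computes the threadgroup count.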

Similarly, each implementation/device has its own thresholds of acceptable values, and these thresholds are specific to each shader. Therefore, compiling a compute shader may succeed on one platform but fail on another because it runs up against an implementation-specific limit.
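
One workaround an application might attempt is to try candidate values from largest to smallest and keep the first one that compiles. The sketch below assumes a hypothetical `tryCompile` callback that reports whether the shader was accepted with a given value; no such callback exists in WebGPU, and `pickLargestAccepted` is an illustrative name. Compilation success also says nothing about whether the chosen value performs well.

```typescript
// Hypothetical: returns true if the shader compiles with this value
// on the current implementation/device.
type TryCompile = (threadsPerThreadgroup: number) => boolean;

// Tries candidates in descending order and returns the first accepted one,
// or undefined if the implementation rejects all of them.
function pickLargestAccepted(
  candidates: number[],
  tryCompile: TryCompile
): number | undefined {
  const descending = candidates.slice().sort((a, b) => b - a);
  for (const value of descending) {
    if (tryCompile(value)) {
      return value;
    }
  }
  return undefined;
}

// Stand-in for a device whose limit for this particular shader is 256.
const fakeShaderLimit = 256;
const chosenThreads = pickLargestAccepted(
  [64, 1024, 256, 512],
  (n) => n <= fakeShaderLimit
);
```

With the stand-in limit above, the candidates 1024 and 512 are rejected and 256 is selected. A real application would pay a full shader compile for every rejected candidate, which is part of why probing is an unsatisfying answer.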
