Multi-Queue Investigation #1065

@kvark

Intro

(edits: list of queues specified at creation)

Using multiple queues in a low-level API is a good way to keep the compute units busy with useful work. The most popular use case is "async compute", where, in addition to the main queue, one or two compute-only queues crunch through data, some of which may be needed on the main queue.

Some links:

Using multiple queues is not mandatory to get the job done; it is purely an optimization that allows more efficient use of the hardware. However, it's important for WebGPU to get this right, as it may affect the synchronization design in general. At the very least, we want to be sure that multi-queue support can be added later without changing the API.

Therefore, this investigation is focused on the synchronization aspect, and not the surrounding logic of queue discovery and utilization. Related to #478

Vulkan

The available queues are discovered via the physical device, and need to be requested at logical device creation. Vulkan exposes multiple families of queues, each having a set of capabilities (like the ability to do compute, graphics, or transfer operations), and one or more logical queues.

If a resource can be used by a queue family, it can be used by any queue in that family, with no explicit synchronization beyond regular semaphores.

For using a resource across different queue families, Vulkan has a sharing mode, which has to be specified at resource creation.

Exclusive:
Only one queue family can access that resource at any given time.

In order to use it on a different queue family, a "transfer" operation needs to be encoded in command streams on both queues:

  • the old queue needs a "release" pipeline barrier, but only if the contents of the resource need to be preserved. If the resource is cleared right away on the new queue, this barrier can be omitted.
  • the new queue needs an "acquire" pipeline barrier
  • submissions for these commands have to be synchronized by a semaphore
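The three steps above can be sketched as a toy state machine. This is only a model of the rules as described here, not Vulkan code: `ExclusiveResource`, `OwnershipError`, the method names, and the family indices are all made up for illustration, and the `semaphore_waited` flag stands in for real semaphore synchronization between the two submissions.

```python
class OwnershipError(Exception):
    """Raised when the toy model sees an access that breaks the rules."""


class ExclusiveResource:
    """Toy model of a resource created with the exclusive sharing mode."""

    def __init__(self, owner):
        self.owner = owner            # queue family that currently owns it
        self.released_to = None       # pending "release" target, if any
        self.contents_defined = True  # False once a transfer skipped "release"

    def release(self, src, dst):
        # "Release" barrier, recorded on the old queue family.
        if src != self.owner:
            raise OwnershipError("only the current owner can release")
        self.released_to = dst

    def acquire(self, src, dst, semaphore_waited):
        # "Acquire" barrier, recorded on the new queue family.
        if src != self.owner:
            raise OwnershipError("source family does not own the resource")
        if not semaphore_waited:
            raise OwnershipError("submissions must be ordered by a semaphore")
        # Skipping the release is allowed, but the old contents are lost.
        self.contents_defined = self.contents_defined and self.released_to == dst
        self.owner = dst
        self.released_to = None

    def clear(self, family):
        # Overwriting everything makes the contents defined again.
        self.access(family)
        self.contents_defined = True

    def access(self, family):
        if family != self.owner:
            raise OwnershipError("family %d does not own the resource" % family)


GRAPHICS_FAMILY, COMPUTE_FAMILY = 0, 1  # hypothetical family indices

# Full transfer: contents survive the move to the compute family.
tex = ExclusiveResource(owner=GRAPHICS_FAMILY)
tex.release(src=GRAPHICS_FAMILY, dst=COMPUTE_FAMILY)
tex.acquire(src=GRAPHICS_FAMILY, dst=COMPUTE_FAMILY, semaphore_waited=True)
tex.access(COMPUTE_FAMILY)

# Transfer without "release": fine as long as the new owner clears first.
scratch = ExclusiveResource(owner=GRAPHICS_FAMILY)
scratch.acquire(src=GRAPHICS_FAMILY, dst=COMPUTE_FAMILY, semaphore_waited=True)
scratch.clear(COMPUTE_FAMILY)
```

The point of the model is that ownership is per resource and per queue family, so an implementation that hides this from the user would have to track something like `owner` for every resource.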

Concurrent:
Any queue family can access the resource. A resource has to specify, at creation, the list of queue families that will be able to access it.

In addition to making the "transfer" semantics implicit, it also unlocks a case where a resource is used (for reading) simultaneously on multiple queue families.

The concurrent sharing mode comes with performance implications: for example, drivers may have to disable color compression for textures.

D3D12

A device can spawn queues, as many as needed. Each resource can be either mutably accessed on a single queue, or simultaneously accessed for reading on multiple queues, at a given time. Queues can be synchronized with each other with fences (which are analogous to Vulkan semaphores, but more powerful). This, so far, looks like the "concurrent" mode of Vulkan.
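The access rule above (one queue writing, or several queues reading, at any given time) can be written down as a small check. This is a sketch of the rule as stated in this issue, not of anything D3D12 exposes; `access_is_valid` and the queue names are hypothetical.

```python
def access_is_valid(accesses):
    """Check one moment in time.

    `accesses` is a list of (queue_name, mode) pairs, mode being "read" or
    "write", describing every queue touching the resource at that moment.
    """
    writers = {queue for queue, mode in accesses if mode == "write"}
    readers = {queue for queue, mode in accesses if mode == "read"}
    if not writers:
        return True  # any number of queues may read simultaneously
    # Mutable access: exactly one queue, and no other queue reading.
    return len(writers) == 1 and readers <= writers


# Two queues sampling the same texture: allowed.
ok_shared_read = access_is_valid([("graphics", "read"), ("compute", "read")])
# Compute writes while graphics reads: not allowed without a fence in between.
bad_read_write = access_is_valid([("compute", "write"), ("graphics", "read")])
```

A validation layer for a multi-queue WebGPU would need to evaluate something like this per resource between synchronization points.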

Copy "engines" (D3D12's other name for queues) are defined as a separate "class". So resource states COPY_DEST and COPY_SOURCE aren't observed by all queues, but are instead considered separate by the copy and non-copy queues. We can see it as a need to do the "ownership transition" (like with Vulkan's exclusive sharing mode). However, in D3D12 it's not necessary to do a "release" transition, given the implicit state decay rules (if I understand correctly), thus it's simpler to implement (but not optional, like in Vulkan).

Metal

(I know least about this one, section is to be edited!)

In Metal-1, it was possible to create many queues, but there was no way to synchronize access between them. Different queues were meant to do work that is totally independent.

In later Metal (citation needed), MTLEvent was added, and it can synchronize between queues of the same device (just like VkSemaphore or ID3D12Fence).
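To make the cross-queue synchronization concrete, here is a toy scheduler, assuming an event that carries a monotonically increasing integer value which one queue signals and another waits on (the general shape of MTLEvent and D3D12 fences). Everything in it (`Event`, `run`, the command tuples, the queue names) is invented for illustration and is not an API of any of these libraries.

```python
class Event:
    """Toy event: a counter that only ever grows."""

    def __init__(self):
        self.value = 0


def run(queues):
    """Execute toy command lists and return the order in which work ran.

    `queues` maps a queue name to a list of commands:
      ("work", label), ("signal", event, value) or ("wait", event, value).
    A wait blocks its queue until the event value reaches `value`.
    """
    cursors = {name: 0 for name in queues}
    order = []
    progressed = True
    while progressed:
        progressed = False
        for name, commands in queues.items():
            while cursors[name] < len(commands):
                cmd = commands[cursors[name]]
                if cmd[0] == "wait" and cmd[1].value < cmd[2]:
                    break  # this queue is blocked, try the next one
                if cmd[0] == "signal":
                    cmd[1].value = max(cmd[1].value, cmd[2])
                elif cmd[0] == "work":
                    order.append((name, cmd[1]))
                cursors[name] += 1
                progressed = True
    if any(cursors[name] < len(cmds) for name, cmds in queues.items()):
        raise RuntimeError("deadlock: a wait was never satisfied")
    return order


particles_ready = Event()
order = run({
    "graphics": [("wait", particles_ready, 1), ("work", "draw particles")],
    "compute": [("work", "simulate particles"), ("signal", particles_ready, 1)],
})
```

Here the graphics queue is listed first but its draw still runs after the compute work, because the wait holds it back until the event reaches 1.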

I wasn't able to find concrete information on whether it's valid to use the same resource on multiple queues simultaneously, and under which conditions.
