Investigation: Sparse Resources #455

@litherum

Sparse resources are a way of using the virtual memory system on the GPU. It’s possible to create a resource that appears to be larger than physical memory, but only has a small portion of that resource actually backed by physical memory. These memory mappings are done at the tile level, where a single miplevel of a texture can be split into a collection of tiles.
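To make the tile arithmetic concrete, here is a minimal sketch. It assumes 64 KiB tiles with a 128x128 texel tile shape for a 4-byte-per-texel 2D texture. Those numbers, and the names `tile_grid` and `resident_bytes`, are illustrative assumptions rather than values queried from any API, and the sketch ignores how small miplevels may be packed together.

```python
import math

TILE_BYTES = 64 * 1024        # assumed tile size in bytes
TILE_W, TILE_H = 128, 128     # assumed tile shape for 4-byte texels: 128 * 128 * 4 = 64 KiB


def tile_grid(width, height, mip):
    """Return (columns, rows) of tiles needed to cover one miplevel."""
    mip_w = max(1, width >> mip)
    mip_h = max(1, height >> mip)
    return math.ceil(mip_w / TILE_W), math.ceil(mip_h / TILE_H)


def resident_bytes(mapped_tiles):
    """Physical memory used by a set of mapped (mip, x, y) tiles."""
    return len(mapped_tiles) * TILE_BYTES


cols, rows = tile_grid(4096, 4096, 0)
full_mip0_bytes = cols * rows * TILE_BYTES

# Map only a 4x4 block of tiles in mip 0, e.g. the one atlas entry in use.
mapped = {(0, x, y) for x in range(4) for y in range(4)}

print(cols, rows)                                # 32 32
print(full_mip0_bytes // (1024 * 1024))          # 64 (MiB if mip 0 were fully resident)
print(resident_bytes(mapped) // (1024 * 1024))   # 1 (MiB actually backed)
```

Under these assumptions, a 4096x4096 miplevel would need 64 MiB if fully resident, while mapping only the 16 tiles in use costs 1 MiB.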

Motivation

The benefit is that regions of textures that are unused don’t actually have to be mapped to physical memory, which means they don’t count toward texture memory budgets. This means that memory use is decreased for rendering at a particular quality level, or alternatively, quality is increased when rendering at a particular memory budget.

The benefit occurs when textures are large and include multiple objects (e.g. a texture atlas) but not all of those objects are rendered at the same miplevel. Placing each object into its own texture is unfortunate because resource tracking is performed for every resource, so the overhead associated with many tiny resources can be quite high. On the other hand, having a few giant resources without sparse resources leads to wasted memory, where entire miplevels of huge textures must all be resident. Sparse textures provide a nice middle ground, where resources can be large in virtual memory, but small in physical memory.

In order to characterize this, I used the Modern Rendering with Metal sample code from Apple. There is a simple switch to enable / disable sparse textures. Measuring the memory use shows:

[Screenshot: texture memory usage of the sample with sparse textures enabled and disabled]

On this particular sample, using sparse textures results in a 15% reduction of texture memory usage. This indicates that this feature is worth pursuing.

D3D12

There are 4 tiled resource tiers on D3D12. Tier 2 and above require that unmapped reads return 0 and unmapped writes do nothing. Applications can call ID3D12Device::CreateReservedResource() to create an unmapped resource. To actually make an allocation, an application can call ID3D12Device::CreateHeap().

The way an application associates a resource with memory from a heap is to call ID3D12CommandQueue::UpdateTileMappings(). This call lets an author specify that any particular tile region in the resource should be mapped to any particular region of a particular heap, or unmapped. It’s possible to have a single resource with mappings that come from distinct heaps, and it’s possible to have a single page in a heap that is backing multiple resources.
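As a rough mental model of these semantics, here is a toy page table. It is not D3D12 code: `Heap`, `SparseResource`, and `update_tile_mapping` are hypothetical names, and each heap page holds a single integer instead of 64 KiB of texels. It assumes the Tier 2 behavior where unmapped reads return 0 and unmapped writes are dropped.

```python
class Heap:
    """A block of physical pages; each page holds one value for simplicity."""

    def __init__(self, num_pages):
        self.pages = [0] * num_pages


class SparseResource:
    """A virtual resource whose tiles may or may not be backed by a heap page."""

    def __init__(self, num_tiles):
        self.mapping = [None] * num_tiles   # tile index -> (heap, page) or None

    def update_tile_mapping(self, tile, heap=None, page=None):
        # Passing no heap unmaps the tile.
        self.mapping[tile] = None if heap is None else (heap, page)

    def read(self, tile):
        backing = self.mapping[tile]
        if backing is None:
            return 0                        # unmapped reads return 0
        heap, page = backing
        return heap.pages[page]

    def write(self, tile, value):
        backing = self.mapping[tile]
        if backing is None:
            return                          # unmapped writes do nothing
        heap, page = backing
        heap.pages[page] = value


heap_a, heap_b = Heap(4), Heap(4)
tex0, tex1 = SparseResource(8), SparseResource(8)

tex0.update_tile_mapping(0, heap_a, 2)
tex0.update_tile_mapping(1, heap_b, 0)   # same resource, second heap
tex1.update_tile_mapping(5, heap_a, 2)   # same heap page backs two resources

tex0.write(0, 42)
tex0.write(3, 7)                         # tile 3 is unmapped, so this is dropped
```

Because tile 0 of `tex0` and tile 5 of `tex1` share a page, the write through `tex0` is visible through `tex1`.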

The application can ask for information like the tile size and the total number of tiles in a resource by calling ID3D12Device::GetResourceTiling(). An application can also determine how much space it should allocate by calling a collection of functions like ID3D12Device::GetCopyableFootprints(), ID3D12Device::GetResourceAllocationInfo(), and IDXGIAdapter3::QueryVideoMemoryInfo().

There are additional functions in Shader Model 5 to assist in determining which tiles are necessary to render a scene. Sampling returns an additional optional output value which can be fed to CheckAccessFullyMapped() to determine if any of the samples in that sample operation were unmapped. There’s also a LOD clamp parameter so an application can restrict its sampling to occur from a LOD that is guaranteed to be mapped. The application can then perform its own accounting to determine which unmapped tiles have the most samples and are most important to load.
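The application-side accounting could look roughly like the sketch below. It assumes the shader's "was this sample fully mapped" results have already been read back to the CPU as (tile, fully_mapped) pairs. That readback path, the `ResidencyFeedback` class, and the (mip, x, y) tile identifiers are all hypothetical.

```python
from collections import Counter


class ResidencyFeedback:
    """Counts samples that touched unmapped tiles, per tile."""

    def __init__(self):
        self.misses = Counter()

    def record(self, tile, fully_mapped):
        # Only samples that hit unmapped memory matter for streaming decisions.
        if not fully_mapped:
            self.misses[tile] += 1

    def most_wanted(self, count):
        """Return up to `count` unmapped tiles, most-sampled first."""
        return [tile for tile, _ in self.misses.most_common(count)]


feedback = ResidencyFeedback()
for _ in range(3):
    feedback.record((2, 1, 1), fully_mapped=False)
feedback.record((2, 0, 0), fully_mapped=False)
feedback.record((3, 0, 0), fully_mapped=True)   # resident, nothing to load

print(feedback.most_wanted(1))   # [(2, 1, 1)]
```

The tiles returned by `most_wanted()` are the ones the application would map and upload first.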

Vulkan

The VkPhysicalDeviceFeatures struct advertises support for 9 different sparse feature flags. To make a resource that can be sparse, applications can set the VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT flag (together with VK_IMAGE_CREATE_SPARSE_BINDING_BIT) in the VkImageCreateInfo passed to vkCreateImage(). To make a memory allocation, applications can call vkAllocateMemory() (just like every other allocation).

The way an application associates a resource with an allocation is to call vkQueueBindSparse(). This call lets an author specify that any particular tile region in the resource should be mapped to any particular region of a memory allocation.

There are tons of different restrictions and configuration parameters an application needs to abide by when using sparse resources, exposed through things like VkSparseImageFormatProperties, VkPhysicalDeviceSparseProperties, vkGetPhysicalDeviceSparseImageFormatProperties(), vkGetImageSparseMemoryRequirements(), and vkGetImageMemoryRequirements(). It’s all very complicated and I couldn’t really understand all the complexity.

There isn’t really a good story regarding how an application knows how much memory to allocate. vkGetImageSparseMemoryRequirements() and vkGetImageMemoryRequirements() get you part of the way there, but they don’t tell you the current memory pressure. VK_EXT_memory_budget adds a query which tells you this, but the extension is only present on 30% of Windows, 32% of Linux, and 0% of Android.
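Where a budget and a current-usage figure are available (VK_EXT_memory_budget reports both per memory heap), one plausible policy is to cap the number of resident tiles so that usage stays under some fraction of the budget. The sketch below is only an illustration of that arithmetic: the function name, the 90% headroom, and the 64 KiB tile size are assumptions, not anything the APIs prescribe.

```python
def tiles_that_fit(budget_bytes, usage_bytes, tile_bytes, headroom_percent=90):
    """How many more tiles can be mapped while staying under the headroom."""
    # Integer arithmetic keeps the result exact for large byte counts.
    allowed = budget_bytes * headroom_percent // 100
    available = allowed - usage_bytes
    return max(0, available // tile_bytes)


GIB = 1 << 30
TILE_BYTES = 64 * 1024   # assumed tile size

print(tiles_that_fit(GIB, GIB // 2, TILE_BYTES))   # 6553
print(tiles_that_fit(GIB, GIB, TILE_BYTES))        # 0: already at or over the allowed share
```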

For SPIR-V, there is a collection of OpImageSparse* instructions, which return a residency code in addition to the results of the operation. This residency code can be fed to OpImageSparseTexelsResident to determine whether or not the texels are all resident. These instructions require the SparseResidency capability. There’s also a LOD clamp parameter so an application can restrict its sampling to occur from a LOD that is guaranteed to be mapped. The application can then perform its own accounting to determine which unmapped tiles have the most samples and are most important to load.

Metal

Metal is simpler than the other two APIs. To detect whether sparse textures are available, applications can call MTLDevice.supportsFamily(MTLGPUFamily.apple6). Applications first make an allocation by creating an MTLHeap with the type set to .sparse. Then, they can create a sparse resource associated with that heap by calling MTLHeap.makeTexture() on that heap. Multiple textures can be associated with a single heap.

Metal added a new type of encoder which governs the mapping between physical and virtual memory: MTLResourceStateCommandEncoder. This encoder only lets you map and unmap a particular region of a resource. You can’t specify which part of the heap is supposed to back the texture. You can’t specify that the same physical page gets mapped to two distinct textures. You can’t specify that a resource should be backed by memory from two different heaps.

Similar to the other APIs, there is a collection of restrictions which indicate to the application how it is expected to use the mapping: MTLDevice.sparseTileSizeInBytes, MTLDevice.sparseTileSize(), MTLDevice.convertSparsePixelRegions(), MTLDevice.convertSparseTileRegions(), and MTLTexture.firstMipmapInTail.

An application knows how big to make its allocation by querying the MTLDevice: MTLDevice.heapTextureSizeAndAlign(), MTLDevice.recommendedMaxWorkingSetSize, and MTLDevice.currentAllocatedSize.

Metal has a great feature called Texture Access Counters which automatically count how often tiles from textures are accessed. An application can get access to these counters by calling MTLBlitCommandEncoder.getTextureAccessCounters(). This means applications don’t have to do the bookkeeping themselves, as they would in the other two APIs.

Metal Shading Language includes new functions like sparse_sample() which return additional information to let you know whether the samples were mapped. There’s also a LOD clamp parameter so an application can restrict its sampling to occur from a LOD that is guaranteed to be mapped.

OpenGL (just for fun)

It’s governed by two extensions: GL_ARB_sparse_texture (40%) and GL_ARB_sparse_buffer (45%). In ES, it's governed by GL_EXT_sparse_texture (1%).

Conclusion

There’s a lot of complexity here. Because Vulkan’s feature support is so complicated, we would have to figure out which subset we can support that is a good balance of ubiquity and usefulness.

2 of the 3 APIs build sparse textures on top of heaps, but WebGPU has no concept of heaps. We would probably have to add support for heaps in order to get sparse resource functionality.

Metal has two requirements: the mappings for a resource must only come from a single heap, and a single tile can’t be mapped to two resources. These requirements will have to be incorporated into whatever we do here.
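If an API allowed explicit mappings, these two requirements could be expressed as a validation pass along the following lines. This is only a sketch of the rule, not proposed API: the (resource, tile, heap, page) tuple format and the function name are made up for illustration.

```python
def check_metal_style_constraints(mappings):
    """Validate (resource, tile, heap, page) mappings against both requirements.

    Returns a list of human-readable violations; an empty list means valid.
    """
    heap_of_resource = {}   # resource -> the first heap seen backing it
    owner_of_page = {}      # (heap, page) -> the first resource mapped to it
    errors = []
    for resource, tile, heap, page in mappings:
        first_heap = heap_of_resource.setdefault(resource, heap)
        if first_heap != heap:
            errors.append(f"{resource} is backed by both {first_heap} and {heap}")
        owner = owner_of_page.setdefault((heap, page), resource)
        if owner != resource:
            errors.append(f"{heap} page {page} backs both {owner} and {resource}")
    return errors


valid = [("tex0", 0, "heapA", 0), ("tex0", 1, "heapA", 1), ("tex1", 0, "heapB", 0)]
invalid = [("tex0", 0, "heapA", 0), ("tex0", 1, "heapB", 0), ("tex1", 0, "heapA", 0)]

print(check_metal_style_constraints(valid))     # []
for problem in check_metal_style_constraints(invalid):
    print(problem)
```

The second list breaks both rules: `tex0` draws from two heaps, and page 0 of `heapA` backs two textures.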

We’ll also have to figure out how it interacts with compressed textures, if at all.
