Investigation: Sparse Resources

Sparse resources are a way of using the virtual memory system on the GPU. It’s possible to create a resource that appears to be larger than physical memory, but only has a small portion of that resource actually backed by physical memory. These memory mappings are done at the tile level, where a single miplevel of a texture can be split into a collection of tiles.
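To get a feel for the numbers, here is a rough sketch of how many tiles each miplevel of a texture splits into. The 128x128 tile shape (64 KiB at 4 bytes per texel) is an assumption for illustration, and `tiles_per_mip` is a made-up helper, not part of any API. Real implementations pack the smallest miplevels together into a shared "mip tail", which this toy ignores.

```python
def ceil_div(a, b):
    """Integer division rounding up."""
    return (a + b - 1) // b

def tiles_per_mip(width, height, tile_w=128, tile_h=128):
    """Return (mip_level, mip_width, mip_height, tile_count) for every miplevel.

    Assumes a 2D texture and a fixed tile shape; ignores mip tail packing.
    """
    levels = []
    level = 0
    while True:
        tiles = ceil_div(width, tile_w) * ceil_div(height, tile_h)
        levels.append((level, width, height, tiles))
        if width == 1 and height == 1:
            break
        width, height = max(1, width // 2), max(1, height // 2)
        level += 1
    return levels

TILE_BYTES = 64 * 1024  # assumed tile size in bytes

levels = tiles_per_mip(4096, 4096)
total_tiles = sum(tiles for _level, _w, _h, tiles in levels)
mip0_bytes = levels[0][3] * TILE_BYTES
```

Under these assumptions, a 4096x4096 texture needs 1024 tiles for mip 0 alone, out of 1372 in total, so leaving the unused parts of the top miplevel unmapped is where most of the savings would come from.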

# Motivation

The benefit is that regions of textures that are unused don’t actually have to be mapped to physical memory, which means they don’t count toward texture memory budgets. This means that memory use is decreased for rendering at a particular quality level, or alternatively, quality is increased when rendering at a particular memory budget.

The benefit occurs when textures are large and include multiple objects (e.g. a texture atlas) but not all of those objects are rendered at the same miplevel. Placing each object into its own texture is unfortunate because resource tracking is performed for every resource, so the overhead associated with many tiny resources can be quite high. On the other hand, having a few giant resources without sparse resources leads to wasted memory, where entire miplevels of huge textures must all be resident. Sparse textures provide a nice middle ground, where resources can be large in virtual memory, but small in physical memory.

In order to characterize this, I used the Modern Rendering with Metal [sample code](https://developer.apple.com/documentation/metal/modern_rendering_with_metal?language=objc) from Apple. There is a simple switch to enable / disable sparse textures. Measuring the memory use shows:

![Screen Shot 2019-10-01 at 10 09 47 PM](https://user-images.githubusercontent.com/918903/66057577-d1d09280-e4ed-11e9-863d-1844fdc7c3b8.png)

On this particular sample, using sparse textures results in a 15% reduction of texture memory usage. This indicates that this feature is worth pursuing.

# D3D12

There are 4 tiled resource tiers on D3D12. Tier 2 and above require that reads from unmapped tiles return 0 and writes to unmapped tiles do nothing. Applications can call `ID3D12Device::CreateReservedResource()` to create a resource with no memory mapped to it. To actually allocate the backing memory, an application can call `ID3D12Device::CreateHeap()`.

The way an application associates a resource with memory from a heap is to call `ID3D12CommandQueue::UpdateTileMappings()`. This call lets an author specify that any particular tile region in the resource should be mapped to any particular region of a particular heap, or unmapped. It’s possible to have a single resource with mappings that come from distinct heaps, and it’s possible to have a single page in a heap that is backing multiple resources.
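The mapping model described above can be pictured as a page table keyed by (resource, tile). The following is only a toy model in Python to make the semantics concrete; `TileMappings` and its methods are hypothetical names, and none of this is the D3D12 API itself.

```python
class TileMappings:
    """Toy page table: (resource, tile_index) -> (heap, heap_tile_offset)."""

    def __init__(self):
        self.table = {}

    def update(self, resource, first_tile, count, heap, heap_offset=0):
        """Map `count` consecutive tiles to consecutive heap tiles.

        Passing heap=None unmaps the tiles instead.
        """
        for i in range(count):
            key = (resource, first_tile + i)
            if heap is None:
                self.table.pop(key, None)
            else:
                self.table[key] = (heap, heap_offset + i)

    def lookup(self, resource, tile):
        """Return (heap, heap_tile_offset), or None when the tile is unmapped."""
        return self.table.get((resource, tile))

    def resident_tiles(self, resource):
        """Count how many tiles of a resource are currently mapped."""
        return sum(1 for (res, _tile) in self.table if res == resource)


m = TileMappings()
m.update("atlas", 0, 4, "heapA", 0)    # tiles 0..3 backed by heapA
m.update("atlas", 4, 2, "heapB", 10)   # same resource, a second heap
m.update("decals", 0, 1, "heapA", 0)   # same heap page backs two resources
m.update("atlas", 1, 1, None)          # unmap one tile again
```

The last three calls are exactly the cases that matter when comparing APIs: one resource spanning two heaps, one heap page shared between resources, and unmapping.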

The application can ask for information like the tile size and the total number of tiles in a resource by calling `ID3D12Device::GetResourceTiling()`. An application also can determine how much space it should allocate by calling a collection of functions like `ID3D12Device::GetCopyableFootprints()`, `ID3D12Device::GetResourceAllocationInfo()`, and `IDXGIAdapter3::QueryVideoMemoryInfo()`.

There are additional functions in Shader Model 5 to assist in determining which tiles are necessary to render a scene. Sampling returns an additional optional output value which can be fed to `CheckAccessFullyMapped()` to determine if any of the samples in that sample operation were unmapped. There’s also a LOD clamp parameter so an application can restrict its sampling to occur from a LOD that is guaranteed to be mapped. The application can then perform its own accounting to determine which unmapped tiles have the most samples and are most important to load.
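The application-side accounting could look something like the sketch below. It assumes the shader has somehow reported which tiles it touched (for example by writing tile IDs into a buffer whenever the mapped check fails); that feedback path, the `(mip, tile_index)` tile IDs, and the `rank_missing_tiles` helper are all assumptions for illustration.

```python
from collections import Counter

def rank_missing_tiles(sampled_tiles, resident):
    """Return unmapped tiles, most frequently sampled first.

    sampled_tiles: iterable of tile IDs, one entry per sample that touched the tile.
    resident: set of tile IDs that are currently mapped.
    """
    misses = Counter(tile for tile in sampled_tiles if tile not in resident)
    return [tile for tile, _count in misses.most_common()]


# Hypothetical feedback from one frame, as (mip, tile_index) pairs.
feedback = [(0, 3), (0, 3), (1, 1), (0, 3), (1, 1), (2, 0)]
resident = {(2, 0)}
to_load = rank_missing_tiles(feedback, resident)
```

The head of `to_load` is the tile that would be most valuable to map next, given whatever budget is available.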

# Vulkan

The `VkPhysicalDeviceFeatures` structure advertises support for 9 different sparse binding and residency features. To make an image that can be sparse, applications can set the `VK_IMAGE_CREATE_SPARSE_BINDING_BIT` and `VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT` flags in the `VkImageCreateInfo` passed to `vkCreateImage()`. To make a memory allocation, applications can call `vkAllocateMemory()` (just like every other allocation).

The way an application associates a resource with an allocation is to call `vkQueueBindSparse()`. This call lets an author specify that any particular tile region in the resource should be mapped to any particular region of a memory allocation.

There are tons of different restrictions and configuration parameters an application needs to abide by when using sparse resources. These are exposed through things like `VkSparseImageFormatProperties`, `VkPhysicalDeviceSparseProperties`, `vkGetPhysicalDeviceSparseImageFormatProperties()`, `vkGetImageSparseMemoryRequirements()`, and `vkGetImageMemoryRequirements()`. It’s all very complicated and I couldn’t really understand all the complexity.

There isn’t really a good story regarding how an application knows how much memory to allocate. `vkGetImageSparseMemoryRequirements()` and `vkGetImageMemoryRequirements()` get you part of the way there, but they don’t tell you the current memory pressure. VK_EXT_memory_budget adds a query which tells you this, but the extension is only available on 30% of Windows, 32% of Linux, and 0% of Android devices.
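Given a budget and a current usage figure (VK_EXT_memory_budget reports both per memory heap), turning that into "how many more tiles can I map" is simple arithmetic. The sketch below is one possible policy, not anything the spec prescribes; the 64 KiB tile size, the 10% safety reserve, and the `tiles_affordable` name are all assumptions.

```python
def tiles_affordable(budget_bytes, usage_bytes, tile_bytes=64 * 1024, reserve_percent=10):
    """Return how many additional tiles fit while keeping a safety reserve.

    Keeps `reserve_percent` of the budget free so other allocations are not
    starved; returns 0 when the application is already over that limit.
    """
    usable = budget_bytes * (100 - reserve_percent) // 100
    headroom = usable - usage_bytes
    return max(0, headroom // tile_bytes)


TILE = 64 * 1024
extra = tiles_affordable(budget_bytes=1000 * TILE, usage_bytes=400 * TILE)
```

When no budget query is available, the application would have to fall back to a fixed, conservative tile pool size.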

For SPIR-V, there is a collection of `OpImageSparse*` instructions, which return a residency code in addition to the results of the operation. This residency code can be fed to `OpImageSparseTexelsResident` to determine whether or not the texels are all resident. These instructions require the `SparseResidency` capability. There’s also a LOD clamp parameter so an application can restrict its sampling to occur from a LOD that is guaranteed to be mapped. The application can then perform its own accounting to determine which unmapped tiles have the most samples and are most important to load.

# Metal

Metal is simpler than the other two APIs. To detect whether sparse textures are available, applications can call `MTLDevice.supportsFamily(MTLGPUFamily.apple6)`. Applications first make an allocation by creating a `MTLHeap` with the type set to `.sparse`. Then, they can create a sparse resource associated with that heap by calling `MTLHeap.makeTexture()` on that heap. Multiple textures can be associated with a single heap.

Metal added a new type of encoder which governs the mapping between physical and virtual memory: `MTLResourceStateCommandEncoder`. This encoder only lets you map and unmap a particular region of a resource. You can’t specify which part of the heap is supposed to back the texture. You can’t specify that the same physical page gets mapped to two distinct textures. You can’t specify that a resource should be backed by memory from two different heaps.

Similar to the other APIs, there are a collection of restrictions which indicate to the application how they are expected to use the mapping: `MTLDevice.sparseTileSizeInBytes`, `MTLDevice.sparseTileSize()`, `MTLDevice.convertSparsePixelRegions()`, `MTLDevice.convertSparseTileRegions()`, and `MTLTexture.firstMipmapInTail`.
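The pixel-to-tile conversion is the kind of arithmetic these helpers take care of. As a rough illustration of the idea (not Metal's actual implementation), the sketch below rounds a pixel rectangle either outward, to every tile it touches, or inward, to only the tiles it fully covers. The 128x128 tile shape and the `pixel_region_to_tiles` name are assumptions.

```python
def ceil_div(a, b):
    """Integer division rounding up."""
    return (a + b - 1) // b

def pixel_region_to_tiles(x, y, w, h, tile_w=128, tile_h=128, outward=True):
    """Convert a pixel rectangle to a tile rectangle (tx, ty, tiles_w, tiles_h).

    outward=True  -> every tile the rectangle touches.
    outward=False -> only tiles that lie entirely inside the rectangle.
    """
    if outward:
        x0, y0 = x // tile_w, y // tile_h
        x1, y1 = ceil_div(x + w, tile_w), ceil_div(y + h, tile_h)
    else:
        x0, y0 = ceil_div(x, tile_w), ceil_div(y, tile_h)
        x1, y1 = (x + w) // tile_w, (y + h) // tile_h
    return (x0, y0, max(0, x1 - x0), max(0, y1 - y0))


touched = pixel_region_to_tiles(100, 100, 200, 200, outward=True)
covered = pixel_region_to_tiles(100, 100, 200, 200, outward=False)
```

Outward rounding is what you would want before mapping tiles for a region you are about to sample; inward rounding is what you would want before unmapping, so that partially used tiles stay resident.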

An application knows how big to make their allocation by calling functions on the `MTLDevice`: `MTLDevice.heapTextureSizeAndAlign()`, `MTLDevice.recommendedMaxWorkingSetSize`, `MTLDevice.currentAllocatedSize`.

Metal has a great feature called Texture Access Counters which automatically count how often tiles from textures are accessed. An application can get access to these counters by calling `MTLBlitCommandEncoder.getTextureAccessCounters()`. This spares applications the bookkeeping they would have to do themselves in the other two APIs.

Metal Shading Language includes new functions like `sparse_sample()` which return additional information to let you know whether the samples were mapped. There’s also a LOD clamp parameter so an application can restrict its sampling to occur from a LOD that is guaranteed to be mapped.
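All three APIs have this LOD clamp, so it's worth sketching how an application might pick the clamp value. The version below tracks residency per whole miplevel, which is a simplification (a real application would more likely track it per region), and `min_safe_lod` is a hypothetical helper. Since filtering between miplevels can read from the clamp level and coarser ones, it looks for the finest level from which everything coarser is also resident.

```python
def min_safe_lod(resident_by_mip):
    """Return the finest miplevel L such that levels L..last are all resident.

    resident_by_mip[i] is True when every tile of miplevel i is mapped
    (level 0 is the largest). Returns None when even the coarsest level
    is not resident.
    """
    safe = None
    # Walk from the coarsest level toward the finest, stopping at the first gap.
    for level in range(len(resident_by_mip) - 1, -1, -1):
        if not resident_by_mip[level]:
            break
        safe = level
    return safe


clamp = min_safe_lod([False, False, True, True, True])
```

Here `clamp` is the value the application would pass as the LOD clamp until finer miplevels finish loading.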

# OpenGL (just for fun)

Sparse support in OpenGL is governed by two extensions: GL_ARB_sparse_texture (40%) and GL_ARB_sparse_buffer (45%). In ES, it’s governed by GL_EXT_sparse_texture (1%).

# Conclusion

There’s a lot of complexity here. Because Vulkan’s feature support is so complicated, we would have to figure out which subset of it we can expose that strikes a good balance between ubiquity and usefulness.

2 of the 3 APIs build sparse textures on top of heaps, but WebGPU has no concept of heaps. We would probably have to add support for heaps in order to get sparse resource functionality.

Metal has two requirements: the mappings for a resource must only come from a single heap, and a single tile can’t be mapped to two resources. These requirements will have to be incorporated into whatever we do here.
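To make those two requirements concrete, here is a small sketch of a check that a set of D3D12/Vulkan-style mappings would also be expressible on Metal. The `(resource, tile, heap, heap_tile)` tuple layout and the `metal_rule_violations` name are made up for illustration; this is not a proposal for the actual validation rules.

```python
def metal_rule_violations(mappings):
    """Return a list of human-readable problems; empty means the mappings are OK.

    mappings: iterable of (resource, tile, heap, heap_tile) tuples.
    Rule 1: all mappings for a resource must come from a single heap.
    Rule 2: a heap tile must not back tiles of two different resources.
    """
    problems = []
    heap_of_resource = {}
    owner_of_heap_tile = {}
    for resource, _tile, heap, heap_tile in mappings:
        first_heap = heap_of_resource.setdefault(resource, heap)
        if first_heap != heap:
            problems.append(f"{resource} is backed by both {first_heap} and {heap}")
        owner = owner_of_heap_tile.setdefault((heap, heap_tile), resource)
        if owner != resource:
            problems.append(f"{heap}[{heap_tile}] backs both {owner} and {resource}")
    return problems


ok = [("atlas", 0, "A", 0), ("atlas", 1, "A", 1), ("decals", 0, "B", 0)]
bad = [("atlas", 0, "A", 0), ("atlas", 1, "B", 0), ("decals", 0, "A", 0)]
ok_problems = metal_rule_violations(ok)
bad_problems = metal_rule_violations(bad)
```

The `bad` list breaks both rules at once: `atlas` draws from two heaps, and heap tile `A[0]` backs both `atlas` and `decals`.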

We’ll also have to figure out how it interacts with compressed textures, if at all.
