
GraphicsMemory

Chuck Walbourn edited this page Sep 15, 2022 · 42 revisions
DirectXTK

This class is used to manage video memory allocations for constants, dynamic vertex buffers, dynamic index buffers, uploading data to the GPU, etc.

Related tutorial: Adding the DirectX Tool Kit

Header

#include <GraphicsMemory.h>

Initialization

The graphics memory helper is a singleton. It needs explicit initialization because it requires the device.

std::unique_ptr<GraphicsMemory> graphicsMemory;
graphicsMemory = std::make_unique<GraphicsMemory>(device);

For exception safety, it is recommended you make use of the C++ RAII pattern and use a std::unique_ptr.

Present

The graphics memory helper manages memory allocation for 'in-flight' data shared between the CPU and GPU. After each frame is rendered, the application needs to call Present and then Commit to let the manager know that a frame's worth of video memory has been sent to the GPU. This allows the manager to check to see if a previous frame's data is complete and can be released.

swapChain->Present(...);

graphicsMemory->Commit(m_deviceResources->GetCommandQueue());
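To make the Commit bookkeeping concrete, here is a minimal C++ sketch of fence-tagged frame tracking with no Direct3D dependency. It is a toy model, not the DirectX Tool Kit implementation: FrameTracker and its Allocate, Commit, Collect, and FramesInFlight members are hypothetical names, and the GPU fence is reduced to a plain counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Toy model (hypothetical, not the library's code): every Commit closes one
// frame's worth of allocations and tags it with the next fence value.
class FrameTracker
{
public:
    // Record memory handed out during the current frame.
    void Allocate(std::size_t bytes) { m_pendingBytes += bytes; }

    // Call once per frame after Present. Returns the fence value the GPU is
    // assumed to signal when it has finished consuming this frame's memory.
    std::uint64_t Commit()
    {
        m_inFlight.push_back({ ++m_nextFence, m_pendingBytes });
        m_pendingBytes = 0;
        return m_nextFence;
    }

    // Release every frame whose fence value the GPU has reached and
    // return the number of bytes that became reusable.
    std::size_t Collect(std::uint64_t completedFence)
    {
        assert(completedFence <= m_nextFence); // cannot complete an unsignalled fence
        std::size_t freed = 0;
        while (!m_inFlight.empty() && m_inFlight.front().fence <= completedFence)
        {
            freed += m_inFlight.front().bytes;
            m_inFlight.pop_front();
        }
        return freed;
    }

    std::size_t FramesInFlight() const { return m_inFlight.size(); }

private:
    struct Frame
    {
        std::uint64_t fence;
        std::size_t bytes;
    };

    std::deque<Frame> m_inFlight;
    std::size_t m_pendingBytes = 0;
    std::uint64_t m_nextFence = 0;
};
```

In this model, after two committed frames, Collect(1) frees only the first frame's bytes; the second frame stays in flight until the completed fence value reaches 2.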

Usage

GraphicsMemory is used by the other components in the library, but it can also be used directly to allocate video memory via Allocate and AllocateConstant.

The GraphicsResource class is a smart-pointer with std::unique_ptr semantics that manages the GPU address, CPU address, and fencing for a particular memory allocation. The SharedGraphicsResource class is a similar smart-pointer with std::shared_ptr semantics that's typically used for shared 'static' vertex buffers / index buffers.

Constant Buffer

Here is an example of using GraphicsMemory to allocate and render with a constant buffer. It uses the template version of AllocateConstant, which takes a structure, uses its size for the allocation, and copies its contents into the newly allocated graphics memory.

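// align(16) pads the 12-byte XMFLOAT3 so sizeof(ConstantBufferParams) is a multiple of 16
__declspec(align(16)) struct ConstantBufferParams
{
    XMFLOAT3 lightDir; // XMFLOAT3 so that XMStoreFloat3 below has a matching destination type
};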

static_assert((sizeof(ConstantBufferParams) % 16) == 0, "CB size not padded correctly");

...

ConstantBufferParams cbData;
XMStoreFloat3(&cbData.lightDir, lightDir);
GraphicsResource myCB = graphicsMemory->AllocateConstant(cbData);

...

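// Rendering uses the graphics root signature; use SetComputeRootConstantBufferView only for compute work
commandList->SetGraphicsRootConstantBufferView(ROOT_SIGNATURE_CB_INDEX, myCB.GpuAddress());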

Note that constant buffers in particular usually change every frame, so you generally allocate a new one each frame and keep several in flight at once, depending on the number of back buffers in your swap chain. GraphicsMemory uses reference counting and GPU fences to track the lifetime of these buffers and reclaims them once all uses have completed.
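The size rule enforced by the static_assert in the example can be checked in isolation. The sketch below is plain C++ with no DirectXMath dependency; Float3, PaddedParams, UnpaddedParams, and IsValidConstantBufferSize are hypothetical names, with Float3 standing in for a 12-byte payload such as three floats.

```cpp
#include <cstddef>

// Stand-in for a 12-byte, three-float payload (hypothetical type for illustration).
struct Float3
{
    float x, y, z;
};

// alignas(16) raises the alignment to 16, which also rounds sizeof up to 16.
struct alignas(16) PaddedParams
{
    Float3 lightDir;
};

// Without the alignment request the struct stays at 12 bytes.
struct UnpaddedParams
{
    Float3 lightDir;
};

// Same condition as the static_assert in the example above.
constexpr bool IsValidConstantBufferSize(std::size_t bytes)
{
    return bytes != 0 && (bytes % 16) == 0;
}

static_assert(sizeof(Float3) == 12, "three packed floats");
static_assert(IsValidConstantBufferSize(sizeof(PaddedParams)), "padded to 16 bytes");
static_assert(!IsValidConstantBufferSize(sizeof(UnpaddedParams)), "12 bytes is not a multiple of 16");
```

The standard alignas specifier is used here instead of __declspec(align(16)) only so the sketch compiles on non-Microsoft compilers.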

Vertex/Index Buffers

You can use GraphicsMemory to allocate vertex and index buffers (VB/IB) for direct rendering, but the memory lives in an 'upload' heap, so expect performance comparable to Direct3D 11's D3D11_USAGE_DYNAMIC.

GraphicsResource vertexBuffer;

...

vertexBuffer = graphicsMemory->Allocate(sizeof(Vertex) * c_number_of_verts);
memcpy(vertexBuffer.Memory(), s_vertices, sizeof(Vertex) * c_number_of_verts);

...

D3D12_VERTEX_BUFFER_VIEW vbv;
vbv.BufferLocation = vertexBuffer.GpuAddress();
vbv.StrideInBytes = sizeof(Vertex);
vbv.SizeInBytes = static_cast<UINT>(vertexBuffer.Size());
commandList->IASetVertexBuffers(0, 1, &vbv);
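One detail in the view setup above: SizeInBytes is a 32-bit UINT while the allocation size is a size_t, so the static_cast would silently truncate a buffer of 4 GB or more. The following sketch shows one way to guard that conversion. CheckedViewSize is a hypothetical helper, not part of DirectX Tool Kit, and std::uint32_t stands in for UINT.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: computes stride * count for a buffer view and refuses
// results that do not fit in the 32-bit size field.
inline std::uint32_t CheckedViewSize(std::size_t strideInBytes, std::size_t vertexCount)
{
    assert(strideInBytes != 0); // a zero stride is a programming error

    constexpr std::size_t maxSize = std::numeric_limits<std::uint32_t>::max();

    // Divide instead of multiplying so the check itself cannot overflow.
    if (vertexCount > maxSize / strideInBytes)
    {
        throw std::overflow_error("vertex buffer does not fit in a 32-bit view size");
    }

    return static_cast<std::uint32_t>(strideInBytes * vertexCount);
}
```

For typical dynamic geometry the check never fires; it only documents the limit at the point where the narrowing happens.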

Alternatively, you can use GraphicsMemory to allocate and fill in the upload data, then use ResourceUploadBatch to schedule a copy into a default heap (D3D12_HEAP_TYPE_DEFAULT, the counterpart of Direct3D 11's D3D11_USAGE_DEFAULT) to get the performance benefits of 'static' VBs/IBs.

SharedGraphicsResource vertexBuffer;
ComPtr<ID3D12Resource> staticVertexBuffer;

...

vertexBuffer = graphicsMemory->Allocate(sizeof(Vertex) * c_number_of_verts);
memcpy(vertexBuffer.Memory(), s_vertices, sizeof(Vertex) * c_number_of_verts);

CD3DX12_HEAP_PROPERTIES heapProperties(D3D12_HEAP_TYPE_DEFAULT);
auto desc = CD3DX12_RESOURCE_DESC::Buffer(vertexBuffer.Size());
DX::ThrowIfFailed(device->CreateCommittedResource(
    &heapProperties,
    D3D12_HEAP_FLAG_NONE,
    &desc,
    D3D12_RESOURCE_STATE_COPY_DEST,
    nullptr,
    IID_PPV_ARGS(staticVertexBuffer.GetAddressOf())
    ));

resourceUploadBatch.Upload(staticVertexBuffer.Get(), vertexBuffer);

resourceUploadBatch.Transition(staticVertexBuffer.Get(),
    D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER);

// Release our reference; the resource upload batch keeps the upload memory alive until the copy completes
vertexBuffer.Reset();

...

D3D12_VERTEX_BUFFER_VIEW vbv;
vbv.BufferLocation = staticVertexBuffer->GetGPUVirtualAddress();
vbv.StrideInBytes = sizeof(Vertex);
vbv.SizeInBytes = static_cast<UINT>(sizeof(Vertex) * c_number_of_verts);
commandList->IASetVertexBuffers(0, 1, &vbv);

GeometricPrimitive and Model both allocate VBs/IBs from GraphicsMemory as upload heap content, and when you call the optional LoadStaticBuffers method they upload the data to static buffers and use those for more efficient rendering.

Garbage collection

The lifetime of an allocated buffer is determined either by the single GraphicsResource smart-pointer that references it, or by a reference count shared among one or more SharedGraphicsResource smart-pointers. You can clear a reference by assigning a different value to the smart-pointer or by calling Reset.
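The two ownership models can be illustrated with the standard smart pointers they are modelled on. This sketch uses std::unique_ptr and std::shared_ptr directly instead of GraphicsResource and SharedGraphicsResource; ToyAllocation and both functions are hypothetical, and the sketch ignores the GPU fencing that the real classes add on top.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a block of graphics memory. It bumps a counter
// while alive so the effect of each release can be observed.
class ToyAllocation
{
public:
    explicit ToyAllocation(int* liveCount) : m_liveCount(liveCount)
    {
        assert(liveCount != nullptr);
        ++*m_liveCount;
    }

    ~ToyAllocation() { --*m_liveCount; }

    ToyAllocation(const ToyAllocation&) = delete;
    ToyAllocation& operator=(const ToyAllocation&) = delete;

private:
    int* m_liveCount;
};

// unique_ptr-style ownership: assigning a new value releases the old block.
// Returns the number of live blocks after the reassignment.
inline int LiveAfterUniqueReassign()
{
    int live = 0;
    auto block = std::make_unique<ToyAllocation>(&live);
    block = std::make_unique<ToyAllocation>(&live); // first block is released here
    return live;
}

// shared_ptr-style ownership: the block survives until the last reference is reset.
// Returns the live count after the first reset and after the second.
inline std::pair<int, int> LiveAfterSharedResets()
{
    int live = 0;
    auto first = std::make_shared<ToyAllocation>(&live);
    auto second = first; // two references, one block
    first.reset();
    const int afterFirstReset = live;
    second.reset();
    return { afterFirstReset, live };
}
```

In the real classes, dropping the last reference only makes the memory eligible for reuse; whether it is reclaimed immediately depends on the GPU having finished with it.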

Memory is generally reclaimed on subsequent calls to Commit (i.e. once per frame), but an application can also call GarbageCollect explicitly to force a cleanup, typically when switching between game levels. If you idle the GPU first, the maximum amount of memory will be reclaimed.

m_deviceResources->WaitForGpu();

graphicsMemory->GarbageCollect();

Since GraphicsMemory is a singleton, you can make use of the static method Get if desired: GraphicsMemory::Get().GarbageCollect()

Multi-GPU

With DirectX 12, multi-GPU is implemented explicitly through multiple devices. See Microsoft Docs. DirectX Tool Kit supports creating one GraphicsMemory singleton per device for mGPU scenarios. Each instance hosts GPU-accessible memory on a specific device.

Statistics

Real-time data about the memory usage of this object is provided by GetStatistics.

auto stats = graphicsMemory->GetStatistics();

wchar_t buff[256] = {};
swprintf_s(buff, L"GraphicsMemory: committed %zu KB, total %zu KB (%zu pages)\n"
                  L"                peak committed %zu KB, peak total %zu KB (%zu pages)\n",
        stats.committedMemory / 1024,
        stats.totalMemory / 1024,
        stats.totalPages,
        stats.peakCommitedMemory / 1024,
        stats.peakTotalMemory / 1024,
        stats.peakTotalPages);

The 'peak' values are updated each time you call GetStatistics so you want to call this once per frame, typically after the Commit call as part of your presentation loop, if you want accurate statistics. You can clear the peak values using ResetStatistics.
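The reason for sampling every frame follows from the peak being refreshed only when the statistics are queried. The sketch below models that behavior in plain C++. ToyStatsSource and ToyStatistics are hypothetical types written for this illustration and say nothing about how the library computes its numbers internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical snapshot type for this sketch.
struct ToyStatistics
{
    std::size_t committedMemory;
    std::size_t peakCommittedMemory;
};

// Toy model of a statistics source whose peak is updated on query.
class ToyStatsSource
{
public:
    void Allocate(std::size_t bytes) { m_committed += bytes; }

    void Release(std::size_t bytes)
    {
        assert(bytes <= m_committed);
        m_committed -= bytes;
    }

    // The peak is refreshed only here, so a spike that rises and falls
    // between two calls is never recorded.
    ToyStatistics GetStatistics()
    {
        m_peak = std::max(m_peak, m_committed);
        return { m_committed, m_peak };
    }

    void ResetStatistics() { m_peak = 0; }

private:
    std::size_t m_committed = 0;
    std::size_t m_peak = 0;
};
```

In this model, allocating 100 bytes and releasing 60 before the first query reports a peak of 40, not 100, which is why infrequent sampling understates the peak.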

Threading model

Allocation of memory resources is fully asynchronous, but should be sync'd once-per-frame to ensure the proper Commit semantics.

Remark

The memory managed by GraphicsMemory is a D3D12_HEAP_TYPE_UPLOAD heap, and is allocated with D3D12_HEAP_FLAG_NONE. Each allocation is created with D3D12_RESOURCE_FLAG_NONE. This means it works well for many resource types, but not for Render Targets, Depth/Stencil buffers, or Unordered Access resources.

The Allocate method takes an optional alignment parameter that defaults to 16-byte (paragraph) alignment. Graphics memory allocations should always use 4-byte (DWORD) alignment or greater.

AllocateConstant will always use an alignment of D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT.
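As a rough illustration of what these alignment values mean for allocation sizes and offsets, here is the usual power-of-two round-up arithmetic. AlignUp, IsPowerOfTwo, and the two constants are hypothetical names for this sketch; 256 is the value of D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT in d3d12.h, and the assumption that alignments are powers of two is mine, not stated above.

```cpp
#include <cassert>
#include <cstddef>

// Default alignment described for Allocate.
constexpr std::size_t c_defaultAlignment = 16;

// Value of D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT in d3d12.h.
constexpr std::size_t c_constantBufferAlignment = 256;

constexpr bool IsPowerOfTwo(std::size_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

// Rounds size up to the next multiple of alignment.
// Assumes alignment is a power of two, which makes the mask trick valid.
inline std::size_t AlignUp(std::size_t size, std::size_t alignment)
{
    assert(IsPowerOfTwo(alignment));
    return (size + alignment - 1) & ~(alignment - 1);
}
```

Under these assumptions a 16-byte constant buffer structure still occupies a 256-byte slot, so many tiny constant buffers cost more memory than their combined payload suggests.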

Further reading

Memory Management

Memory Management in Direct3D 12

For Use

  • Universal Windows Platform apps
  • Windows desktop apps
  • Windows 11
  • Windows 10
  • Xbox One
  • Xbox Series X|S

Architecture

  • x86
  • x64
  • ARM64

For Development

  • Visual Studio 2022
  • Visual Studio 2019 (16.11)
  • clang/LLVM v12 - v18
  • MinGW 12.2, 13.2
  • CMake 3.20

Related Projects

DirectX Tool Kit for DirectX 11

DirectXMesh

DirectXTex

DirectXMath

Tools

Test Suite

Model Viewer

Content Exporter

DxCapsViewer

See also

DirectX Landing Page
