GraphicsMemory
This class is used to manage video memory allocations for constants, dynamic vertex buffers, dynamic index buffers, uploading data to the GPU, etc.
Related tutorial: Adding the DirectX Tool Kit
#include <GraphicsMemory.h>
The graphics memory helper is a singleton. It needs explicit initialization because it requires the device.
std::unique_ptr<GraphicsMemory> graphicsMemory;
graphicsMemory = std::make_unique<GraphicsMemory>(device);
For exception safety, it is recommended you make use of the C++ RAII pattern and use a std::unique_ptr.
The graphics memory helper manages memory allocation for 'in-flight' data shared between the CPU and GPU. After each frame is rendered, the application needs to call Present and then Commit to let the manager know that a frame's worth of video memory has been sent to the GPU. This allows the manager to check whether a previous frame's data is complete and can be released.
swapChain->Present(...);
graphicsMemory->Commit(m_deviceResources->GetCommandQueue());
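To illustrate why Commit needs to be called once per frame, here is a minimal, self-contained model of fence-based page recycling. It is only a sketch of the general technique and not DirectX Tool Kit's actual implementation: FakeQueue, PagePool, and all of their members are hypothetical stand-ins for the command queue, its GPU fence, and the internal allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for a command queue plus fence. The CPU signals
// ever-increasing values; the "GPU" reports how far it has progressed.
struct FakeQueue
{
    std::uint64_t signaled = 0;   // last fence value requested by the CPU
    std::uint64_t completed = 0;  // last fence value reached by the GPU

    std::uint64_t Signal() { return ++signaled; }
    void FinishUpTo(std::uint64_t value) { if (value > completed) completed = value; }
};

// Hypothetical pool: pages used during a frame are tagged with a fence value
// at Commit, and recycled once the GPU has passed that value.
class PagePool
{
public:
    void UsePage() { ++m_inUse; }  // pretend one page was handed out this frame

    void Commit(FakeQueue& queue)
    {
        // Tag everything used this frame with a new fence value.
        if (m_inUse > 0)
        {
            m_pending.push_back(Retired{ queue.Signal(), m_inUse });
            m_inUse = 0;
        }

        // Recycle earlier frames whose fence the GPU has already reached.
        while (!m_pending.empty() && m_pending.front().fence <= queue.completed)
        {
            m_free += m_pending.front().pages;
            m_pending.pop_front();
        }
    }

    std::size_t FreePages() const { return m_free; }
    std::size_t PendingFrames() const { return m_pending.size(); }

private:
    struct Retired { std::uint64_t fence; std::size_t pages; };

    std::deque<Retired> m_pending;
    std::size_t m_inUse = 0;
    std::size_t m_free = 0;
};
```

In this model, pages only become reusable on a later Commit, after the fake GPU has reached the fence value recorded for the frame that used them. Skipping Commit would leave those pages pending indefinitely.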
GraphicsMemory is used by the other components in the library, but it can also be used directly for allocating video memory via Allocate and AllocateConstant.
The GraphicsResource class is a smart-pointer with std::unique_ptr semantics that manages the GPU address, CPU address, and fencing for a particular memory allocation. The SharedGraphicsResource class is a similar smart-pointer with std::shared_ptr semantics that's typically used for shared 'static' vertex buffers / index buffers.
Here is an example of using GraphicsMemory to allocate and render with a constant buffer. It uses the template version of AllocateConstant, which takes a structure and uses it both for the allocation size and as the source data to copy into the newly allocated graphics memory.
__declspec(align(16)) struct ConstantBufferParams
{
    XMVECTOR lightDir;
};
static_assert((sizeof(ConstantBufferParams) % 16) == 0, "CB size not padded correctly");
...
ConstantBufferParams cbData;
cbData.lightDir = lightDir; // the member is an XMVECTOR, so assign directly (XMStoreFloat3 requires an XMFLOAT3 destination)
GraphicsResource myCB = graphicsMemory->AllocateConstant(cbData);
...
commandList->SetGraphicsRootConstantBufferView(ROOT_SIGNATURE_CB_INDEX, myCB.GpuAddress());
Note that for constant buffers in particular, you generally allocate one buffer per frame: their contents change every frame, and several must stay in flight at once, depending on the number of back buffers in your swap chain. GraphicsMemory uses reference counting and GPU fences to track the lifetime of these buffers and cleans them up once all uses have completed.
You can use GraphicsMemory to allocate VB/IB for direct rendering, but the memory is in an 'upload' heap, so you'll get performance comparable to DirectX 11's D3D11_USAGE_DYNAMIC.
GraphicsResource vertexBuffer;
...
vertexBuffer = graphicsMemory->Allocate(sizeof(Vertex) * c_number_of_verts);
memcpy(vertexBuffer.Memory(), s_vertices, sizeof(Vertex) * c_number_of_verts);
...
D3D12_VERTEX_BUFFER_VIEW vbv;
vbv.BufferLocation = vertexBuffer.GpuAddress();
vbv.StrideInBytes = sizeof(Vertex);
vbv.SizeInBytes = static_cast<UINT>(vertexBuffer.Size());
commandList->IASetVertexBuffers(0, 1, &vbv);
Alternatively, you can use GraphicsMemory to allocate and fill in the upload data, then use ResourceUploadBatch to schedule the upload to USAGE_DEFAULT memory (a D3D12_HEAP_TYPE_DEFAULT heap) to get the performance benefits of 'static' VBs/IBs.
SharedGraphicsResource vertexBuffer;
ComPtr<ID3D12Resource> staticVertexBuffer;
...
vertexBuffer = graphicsMemory->Allocate(sizeof(Vertex) * c_number_of_verts);
memcpy(vertexBuffer.Memory(), s_vertices, sizeof(Vertex) * c_number_of_verts);
CD3DX12_HEAP_PROPERTIES heapProperties(D3D12_HEAP_TYPE_DEFAULT);
auto desc = CD3DX12_RESOURCE_DESC::Buffer(vertexBuffer.Size());
DX::ThrowIfFailed(device->CreateCommittedResource(
    &heapProperties,
    D3D12_HEAP_FLAG_NONE,
    &desc,
    D3D12_RESOURCE_STATE_COPY_DEST,
    nullptr,
    IID_PPV_ARGS(staticVertexBuffer.GetAddressOf())
    ));
resourceUploadBatch.Upload(staticVertexBuffer.Get(), vertexBuffer);
resourceUploadBatch.Transition(staticVertexBuffer.Get(),
    D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER);
// release the upload reference which is kept alive by the resource upload batch until complete
vertexBuffer.Reset();
...
D3D12_VERTEX_BUFFER_VIEW vbv;
vbv.BufferLocation = staticVertexBuffer->GetGPUVirtualAddress();
vbv.StrideInBytes = sizeof(Vertex);
vbv.SizeInBytes = static_cast<UINT>(sizeof(Vertex) * c_number_of_verts);
commandList->IASetVertexBuffers(0, 1, &vbv);
GeometricPrimitive and Model both allocate VBs/IBs from GraphicsMemory as upload heap content, and when you call the optional LoadStaticBuffers method they upload the data to static buffers and use those for more efficient rendering.
Allocated buffers have a lifetime determined by the GraphicsResource smart-pointer referencing them, or by a reference count managed by one or more SharedGraphicsResource smart-pointers. You can clear a reference either by assigning a different value to the same smart-pointer or by calling Reset.
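The difference between the two ownership models can be sketched with the standard smart pointers they are modeled on. This is an analogy only: Block, LifetimeTrace, and TraceLifetimes are hypothetical names, and std::unique_ptr / std::shared_ptr stand in for GraphicsResource / SharedGraphicsResource rather than reproducing their implementation.

```cpp
#include <cassert>
#include <memory>

// Hypothetical allocation that bumps a counter while it is alive.
struct Block
{
    explicit Block(int& liveCount) : m_live(liveCount) { ++m_live; }
    ~Block() { --m_live; }

    Block(const Block&) = delete;
    Block& operator=(const Block&) = delete;

private:
    int& m_live;
};

// Number of live blocks recorded after each step of the walkthrough.
struct LifetimeTrace
{
    int afterAlloc;
    int afterUniqueReset;
    int afterFirstSharedReset;
    int afterLastSharedReset;
};

inline LifetimeTrace TraceLifetimes()
{
    int live = 0;
    LifetimeTrace trace{};

    // Single owner, analogous to GraphicsResource.
    std::unique_ptr<Block> single(new Block(live));

    // Two references to one block, analogous to SharedGraphicsResource.
    std::shared_ptr<Block> first = std::make_shared<Block>(live);
    std::shared_ptr<Block> second = first;
    trace.afterAlloc = live;

    single.reset();                      // sole owner gone: block released
    trace.afterUniqueReset = live;

    first.reset();                       // one shared reference remains
    trace.afterFirstSharedReset = live;

    second.reset();                      // last reference gone: block released
    trace.afterLastSharedReset = live;

    return trace;
}
```

Unlike this analogy, dropping the last reference to a graphics memory allocation does not free it immediately: the memory is reclaimed later, once the GPU is known to be finished with it.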
Generally memory is reclaimed on subsequent calls to Commit (i.e. once per frame), but an application can also call GarbageCollect explicitly to force a cleanup (typically when switching between game levels). If you idle the GPU first, the maximum amount of memory will be reclaimed.
m_deviceResources->WaitForGpu();
graphicsMemory->GarbageCollect();
Since GraphicsMemory is a singleton, you can make use of the static method Get if desired:
GraphicsMemory::Get().GarbageCollect();
With DirectX 12, multi-GPU is implemented explicitly through multiple devices. See Microsoft Docs. DirectX Tool Kit supports creating one GraphicsMemory singleton per device for mGPU scenarios. Each instance hosts GPU-accessible memory on a specific device.
Real-time data about the memory usage of this object is provided by GetStatistics.
auto stats = graphicsMemory->GetStatistics();
wchar_t buff[256] = {};
swprintf_s(buff, L"GraphicsMemory: committed %zu KB, total %zu KB (%zu pages)\n"
    " peak committed %zu KB, peak total %zu KB (%zu pages)\n",
    stats.committedMemory / 1024,
    stats.totalMemory / 1024,
    stats.totalPages,
    stats.peakCommitedMemory / 1024,
    stats.peakTotalMemory / 1024,
    stats.peakTotalPages);
The 'peak' values are updated each time you call GetStatistics, so if you want accurate statistics, call it once per frame, typically after the Commit call as part of your presentation loop. You can clear the peak values using ResetStatistics.
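Because the peaks are only refreshed when the statistics are queried, infrequent sampling can miss short spikes. The following self-contained sketch models that behavior under stated assumptions: UsageTracker and its members are hypothetical and are not part of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical tracker whose peak is refreshed only when it is sampled,
// mirroring the "updated each time you call GetStatistics" behavior.
struct UsageTracker
{
    std::size_t current = 0;
    std::size_t peak = 0;

    void SetCurrent(std::size_t bytes) { current = bytes; }

    // Refresh and return the peak; usage between samples is never seen.
    std::size_t SamplePeak()
    {
        peak = std::max(peak, current);
        return peak;
    }

    // Analogous in spirit to ResetStatistics.
    void ResetPeak() { peak = 0; }
};
```

If usage rises to 100 and falls back to 10 between two samples, this tracker reports a peak of 10, which is why sampling every frame gives a more faithful picture.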
Allocation of memory resources is fully asynchronous, but should be synchronized once per frame to ensure the proper Commit semantics.
The memory managed by GraphicsMemory is a D3D12_HEAP_TYPE_UPLOAD heap, and is allocated with D3D12_HEAP_FLAG_NONE. Each allocation is created with D3D12_RESOURCE_FLAG_NONE. This means it works well for many resource types, but not for Render Targets, Depth/Stencil buffers, or Unordered Access resources.
The Allocate method takes an optional alignment parameter that defaults to 16-byte (paragraph) alignment. Graphics memory allocations should always use 4-byte (DWORD) alignment or greater.
AllocateConstant will always use an alignment of D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT.
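The effect of these alignment rules can be sketched with a small linear (bump) allocator over a single page. This is an illustrative model under stated assumptions, not the library's allocator: AlignUp, LinearPage, and the c_ constants are hypothetical names. The value 256 mirrors D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT, and alignments are assumed to be powers of two.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t c_defaultAlignment = 16;          // default described for Allocate
constexpr std::size_t c_constantBufferAlignment = 256;  // D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT
constexpr std::size_t c_allocationFailed = static_cast<std::size_t>(-1);

// Round value up to the next multiple of alignment (power of two assumed).
constexpr std::size_t AlignUp(std::size_t value, std::size_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical page that hands out aligned offsets front to back.
class LinearPage
{
public:
    explicit LinearPage(std::size_t pageSize) : m_size(pageSize), m_offset(0) {}

    // Returns the aligned offset of the new block, or c_allocationFailed
    // when the page cannot fit the request.
    std::size_t Allocate(std::size_t bytes, std::size_t alignment = c_defaultAlignment)
    {
        const std::size_t start = AlignUp(m_offset, alignment);
        if (start > m_size || bytes > m_size - start)
            return c_allocationFailed;

        m_offset = start + bytes;
        return start;
    }

private:
    std::size_t m_size;
    std::size_t m_offset;
};
```

In this model a 20-byte request followed by a 4-byte request lands at offsets 0 and 32, and a later request using c_constantBufferAlignment is pushed out to offset 256, which shows how constant-buffer alignment can leave gaps inside a page.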
All content and source code for this package are subject to the terms of the MIT License.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
- Universal Windows Platform apps
- Windows desktop apps
- Windows 11
- Windows 10
- Xbox One
- Xbox Series X|S
- x86
- x64
- ARM64
- Visual Studio 2022
- Visual Studio 2019 (16.11)
- clang/LLVM v12 - v18
- MinGW 12.2, 13.2
- CMake 3.20