In this tutorial, we'll render a simple colored triangle without WSI (Window System Integration).
Important
For simpler setup, we'll use the VK_KHR_dynamic_rendering
device extension to skip the complex render pass and framebuffer preparation. Make sure your Vulkan driver supports this extension.
I'll explain each step using Vulkan-Hpp RAII bindings first, then show which parts can be replaced with what vku provides. Note that this example has no WSI support yet; that will be covered in the next tutorial.
Here's the base code for the tutorial:
CMakeLists.txt
cmake_minimum_required(VERSION 3.30)
# Set experimental flag to enable `import std` support from CMake.
set(CMAKE_EXPERIMENTAL_CXX_IMPORT_STD "0e5b6991-d74f-4b3d-a41c-cf096e0b2508")
project(vku-tutorial LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 23)
set(CMAKE_CXX_MODULE_STD 1)
# ----------------
# External dependencies.
# ----------------
find_package(vku CONFIG REQUIRED)
# ----------------
# Project executables.
# ----------------
add_executable(vku-tutorial main.cpp)
target_link_libraries(vku-tutorial PRIVATE vku::vku)
main.cpp
import std;
import vku;
int main() {
// Code starts from here...
}
vcpkg.json
{
"dependencies": [
"vku"
]
}
To use the Vulkan API, an instance has to be created. Also, for Vulkan-Hpp's RAII binding, a vk::raii::Context
has to be created before the instance creation. We'll use Vulkan 1.0 to start from scratch.
int main() {
const vk::raii::Context context;
constexpr vk::ApplicationInfo appInfo {
"Hello Triangle", // pApplicationName
0, // applicationVersion
nullptr, // pEngineName
0, // engineVersion
vk::makeApiVersion(0, 1, 0, 0),
};
constexpr std::array instanceLayers {
"VK_LAYER_KHRONOS_validation",
};
const vk::raii::Instance instance { context, vk::InstanceCreateInfo {
{}, // flags
&appInfo,
instanceLayers,
} };
}
We created a Vulkan instance with the application name Hello Triangle. I won't explain the detailed meaning of the structure fields, as they're already explained in the Vulkan specification. If you're not familiar with these, I suggest using the raw Vulkan API first; this library assumes you're already familiar with Vulkan.
If you're on macOS and running Vulkan through MoltenVK, you will likely get an error like this:
libc++abi: terminating due to uncaught exception of type vk::IncompatibleDriverError: Context::createInstance: ErrorIncompatibleDriver
Process finished with exit code 134 (interrupted by signal 6:SIGABRT)
Since MoltenVK is a portable implementation of Vulkan built on Apple Metal, you have to specify some additional parameters. Here's the modified code:
int main() {
const vk::raii::Context context;
constexpr vk::ApplicationInfo appInfo {
"Hello Triangle", 0,
nullptr, 0,
vk::makeApiVersion(0, 1, 0, 0),
};
constexpr std::array instanceLayers {
"VK_LAYER_KHRONOS_validation",
};
#if __APPLE__
constexpr std::array instanceExtensions {
vk::KHRGetPhysicalDeviceProperties2ExtensionName,
vk::KHRPortabilityEnumerationExtensionName,
};
#endif
const vk::raii::Instance instance { context, vk::InstanceCreateInfo {
#if __APPLE__
vk::InstanceCreateFlagBits::eEnumeratePortabilityKHR, // flags
#else
{},
#endif
&appInfo,
instanceLayers, // pEnabledLayerNames
#if __APPLE__
instanceExtensions, // pEnabledExtensionNames
#endif
} };
}
Now running your application will create a Vulkan instance and return 0.
Let's talk about a refactoring opportunity in the current code. You may notice that you cannot "inline" the parameters into vk::InstanceCreateInfo
directly, like
const vk::raii::Instance { context, vk::InstanceCreateInfo {
{},
&vk::ApplicationInfo { ... }, // <- Can't take the address of a rvalue.
{ ... }, // <- vk::ArrayProxyNoTemporaries doesn't allow the construction from a rvalue.
{ ... }, // same as above.
} };
This is because Vulkan-Hpp disallows vk::ArrayProxyNoTemporaries
construction from any rvalue. Since a Vulkan structure is constructed to be used as a parameter of a future function call, the data it references must stay alive until that call ends. The following code shows why allowing rvalues would be dangerous:
const vk::InstanceCreateInfo ci {
...,
{ extension1, extension2, ... },
}; // The std::initializer_list of extensions is destroyed here.
const vk::raii::Instance instance { context, ci }; // ci now refers to dangling data.
However, going back to the first code, it is perfectly valid from the object-lifetime perspective. Since the vk::ApplicationInfo
struct and the instance extensions initializer_list
are destroyed only at the end of the full expression, they stay alive during the function call.
Anyway, Vulkan-Hpp chooses the safe side, and due to this decision we get several disadvantages:
- We have to name some trivial structs that are only used for a single function call. Naming things is hard!
- The flow of the code is reversed from the natural reading order. If we want to know what parameters were passed to a function call, we have to find the parameter variables, which are declared before the call. If structures are nested (such as vk::GraphicsPipelineCreateInfo or vk::RenderPassCreateInfo), the code is much harder to understand.
As a solution, vku provides two functions: vku::unsafeAddress(const T&) and vku::unsafeProxy(const <contiguous-range>&).
- unsafeAddress accepts a value by const reference and returns its address. The returned pointer is valid until the end of the full expression.
- unsafeProxy accepts a contiguous range (std::array, std::vector, std::initializer_list, ...) or a single value by const reference and returns a vk::ArrayProxyNoTemporaries that can be used in Vulkan-Hpp. The returned proxy object is valid until the end of the full expression.
With these functions, you can write the code like this:
const vk::raii::Context context;
const vk::raii::Instance instance { context, vk::InstanceCreateInfo {
#if __APPLE__
vk::InstanceCreateFlagBits::eEnumeratePortabilityKHR,
#else
{},
#endif
vku::unsafeAddress(vk::ApplicationInfo {
"Hello Triangle", 0,
nullptr, 0,
vk::makeApiVersion(0, 1, 0, 0),
}),
vku::unsafeProxy<const char*>("VK_LAYER_KHRONOS_validation"), // Specify the template type; otherwise the argument would be deduced as a char array.
#if __APPLE__
vku::unsafeProxy({
vk::KHRGetPhysicalDeviceProperties2ExtensionName,
vk::KHRPortabilityEnumerationExtensionName,
}),
#endif
} };
This is much clearer, and the unnecessary variable naming is gone.
For this step, we'll get the physical device from the instance and create a device with a graphics queue. First, we have to check whether there is a physical device capable of graphics operations.
int main() {
...
const vk::raii::Instance instance { ... };
std::uint32_t graphicsQueueFamilyIndex;
const vk::raii::PhysicalDevice physicalDevice = [&] {
for (const vk::raii::PhysicalDevice &physicalDevice : instance.enumeratePhysicalDevices()) {
for (std::uint32_t queueFamilyIndex = 0; vk::QueueFamilyProperties properties : physicalDevice.getQueueFamilyProperties()) {
if (properties.queueFlags & vk::QueueFlagBits::eGraphics) {
graphicsQueueFamilyIndex = queueFamilyIndex;
return physicalDevice;
}
++queueFamilyIndex;
}
}
throw std::runtime_error { "No graphics operation supporting GPU." };
}();
}
For this tutorial, we only queried for graphics queue family support for brevity, but in practice this process varies widely with the application's purpose, for example:
- Need dedicated compute/transfer queue.
- Need to check the surface support for WSI.
- Need to check some GPU specific feature support.
- Selecting the "best" GPU with the above requirements.
I'll cover how these goals can be achieved with vku in the next tutorials.
Next, creating a device with the graphics queue:
int main() {
...
const vk::raii::PhysicalDevice physicalDevice { ... };
const vk::raii::Device device { physicalDevice, vk::DeviceCreateInfo {
{}, // flags
vku::unsafeProxy(vk::DeviceQueueCreateInfo {
{}, // flags
graphicsQueueFamilyIndex,
vku::unsafeProxy(1.f), // queuePriorities
}), // queueCreateInfos
{}, // pEnabledLayerNames
#if __APPLE__
vku::unsafeProxy(vk::KHRPortabilitySubsetExtensionName), // pEnabledExtensionNames
#endif
} };
const vk::Queue graphicsQueue = (*device).getQueue(graphicsQueueFamilyIndex, /* queueIndex */ 0);
}
We created the graphics-capable queue with priority 1.0 (full priority).
Wow, we've done a lot! The amount of code is not unbearable, but it's quite verbose and tedious. Let's see how vku can help with this.
vku provides the Gpu<QueueFamilies, Queues>
class for convenient and structured physical device selection and device creation. Here's code with the same functionality using vku:
struct QueueFamilies {
std::uint32_t graphics;
explicit QueueFamilies(vk::PhysicalDevice physicalDevice)
: graphics { vku::getGraphicsQueueFamily(physicalDevice.getQueueFamilyProperties()).value() } { }
};
struct Queues {
vk::Queue graphics;
Queues(vk::Device device, const QueueFamilies &queueFamilies)
: graphics { device.getQueue(queueFamilies.graphics, 0) } { }
[[nodiscard]] static auto getCreateInfos(vk::PhysicalDevice, const QueueFamilies &queueFamilies) noexcept {
return vku::RefHolder {
[=](const float &priority) {
return vk::DeviceQueueCreateInfo {
{},
queueFamilies.graphics,
vk::ArrayProxyNoTemporaries(priority),
};
},
1.f,
};
}
};
int main() {
...
const vk::raii::Instance instance { ... };
// Note: vku::Gpu<QueueFamilies, Queues>::Config can be omitted. I'll leave it for the demonstration.
const vku::Gpu<QueueFamilies, Queues> gpu { instance, vku::Gpu<QueueFamilies, Queues>::Config {
#if __APPLE__
.deviceExtensions = {
vk::KHRPortabilitySubsetExtensionName,
},
#endif
} };
// gpu.physicalDevice (vk::raii::PhysicalDevice)
// gpu.queueFamilies (QueueFamilies)
// gpu.device (vk::raii::Device)
// gpu.queues (Queues)
// gpu.allocator (vma::Allocator)
}
All that code is gone! Parameters for physical device selection and device creation live in the vku::Gpu<QueueFamilies, Queues>::Config
struct, and are initialized for typical usage. If you want, the settings can be modified via the constructor. This struct has many parameters for sophisticated, fine-grained physical device selection, but I'll leave those details for later.
Here's the explanation of the code:
- The process runs in the same order as our previous Vulkan-Hpp based code. It enumerates the physical devices and checks their suitability against the following criteria:
  - Queue family requirements: for each physical device, vku tries to construct the given QueueFamilies struct with it. If an exception is thrown, the device is removed from the candidates.
  - Device extension requirements: check whether the required extensions are supported. If not, the device is removed from the candidates.
  - Physical device feature requirements: check whether the required physical device features (vk::PhysicalDeviceFeatures) are supported. If not, the device is removed from the candidates.
  - The remaining candidates are rated by their properties, using the same rule as vulkan-tutorial. The best-scored physical device is selected.
- After the physical device selection, the QueueFamilies struct is constructed with the selected physical device. vku::getGraphicsQueueFamily is a helper function that accepts a physical device's queue family properties and returns the index of a queue family that supports graphics operations. If there's no such queue family, it returns std::nullopt.
- After that, the device is created with Config's device extensions and the queue create infos provided by Queues::getCreateInfos. This function MUST be implemented by the user and meet the following requirements:
  - Its signature must be getCreateInfos(vk::PhysicalDevice, const QueueFamilies&) and it must be static.
  - It must return a vku::RefHolder of either a single vk::DeviceQueueCreateInfo struct or a contiguous vk::DeviceQueueCreateInfo container. vku::RefHolder<T, Args...> is a utility class: a container that can be contextually converted to a reference of type T, and that holds the objects of Args... referenced by T. In this code, since the returned vk::DeviceQueueCreateInfo needs a reference to the queue priority value, it is wrapped with vku::RefHolder. (A sketch of the contiguous-container variant follows this list.)
- After the device creation, the queues are obtained by constructing Queues with the device and queue families. This is implemented on the user side.
- Finally, it creates the vma::Allocator, which will be used as the resource allocator for buffers and images.
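As a sketch of the contiguous-container variant mentioned above, here's what a hypothetical QueueFamilies/Queues pair with a separate compute queue might look like. This is not code from vku or this tutorial: the compute-family lookup, the assumption that it differs from the graphics family, and the two-priority RefHolder usage are all illustrative.
struct QueueFamilies {
    std::uint32_t graphics;
    std::uint32_t compute; // hypothetical dedicated compute family

    explicit QueueFamilies(vk::PhysicalDevice physicalDevice)
        : graphics { vku::getGraphicsQueueFamily(physicalDevice.getQueueFamilyProperties()).value() } {
        // Naive lookup of any compute-capable family; assumes it differs from the graphics family,
        // since two vk::DeviceQueueCreateInfos must not use the same family index.
        const std::vector properties = physicalDevice.getQueueFamilyProperties();
        for (std::uint32_t i = 0; i < properties.size(); ++i) {
            if ((properties[i].queueFlags & vk::QueueFlagBits::eCompute) && i != graphics) {
                compute = i;
                return;
            }
        }
        throw std::runtime_error { "No dedicated compute queue family." };
    }
};
struct Queues {
    vk::Queue graphics;
    vk::Queue compute;

    Queues(vk::Device device, const QueueFamilies &queueFamilies)
        : graphics { device.getQueue(queueFamilies.graphics, 0) }
        , compute { device.getQueue(queueFamilies.compute, 0) } { }

    [[nodiscard]] static auto getCreateInfos(vk::PhysicalDevice, const QueueFamilies &queueFamilies) noexcept {
        // Return a contiguous container of create infos; RefHolder keeps both priorities alive.
        return vku::RefHolder {
            [=](const float &graphicsPriority, const float &computePriority) {
                return std::array {
                    vk::DeviceQueueCreateInfo { {}, queueFamilies.graphics, vk::ArrayProxyNoTemporaries(graphicsPriority) },
                    vk::DeviceQueueCreateInfo { {}, queueFamilies.compute, vk::ArrayProxyNoTemporaries(computePriority) },
                };
            },
            1.f, 1.f,
        };
    }
};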
If you set the verbose boolean field to true in the Config passed to the Gpu constructor, the physical device selection process (including rejection reasons and the score of the accepted physical device) is printed to std::cerr.
Example
For example, the following (more sophisticated) GPU selection code:
vku::Gpu<QueueFamilies, Queues> gpu { instance, vku::Gpu<QueueFamilies, Queues>::Config {
.verbose = true,
.deviceExtensions = {
#if __APPLE__
vk::KHRPortabilitySubsetExtensionName,
#endif
vk::KHRMaintenance4ExtensionName,
vk::EXTDescriptorBufferExtensionName,
vk::KHRDynamicRenderingLocalReadExtensionName,
},
.physicalDeviceFeatures = vk::PhysicalDeviceFeatures{}
.setGeometryShader(true)
.setIndependentBlend(true),
.devicePNexts = std::tuple {
vk::PhysicalDeviceDescriptorBufferFeaturesEXT{}
.setDescriptorBuffer(true)
.setDescriptorBufferCaptureReplay(true),
vk::PhysicalDeviceDynamicRenderingLocalReadFeaturesKHR{}
.setDynamicRenderingLocalRead(true),
},
.apiVersion = vk::makeApiVersion(0, 1, 3, 0),
} };
The possible output is:
Physical device "Intel UHD Graphics 4000" rejected because it lacks the following device extensions: [VK_KHR_dynamic_rendering_local_read, VK_KHR_maintenance4]
Physical device "NVIDIA MX 150" rejected because it lacks the following physical device features: [geometryShader]
Physical device "NVIDIA RTX 2080" accepted (score=16384).
Like many other vku objects, vku::Gpu
follows the RAII convention, and it automatically destroys the created device and allocator when it is destroyed.
Now, we're going to create the 512x512 image that will be used as the rendering attachment. For this step, I'll just briefly describe what the conventional Vulkan-Hpp code would do, because it is too complex to include in this tutorial. It would be:
- Create a vk::raii::Image object (let it be image) with gpu.device and a vk::ImageCreateInfo struct.
- Create a vk::raii::DeviceMemory object with gpu.device and image's allocation size, memory alignment, and memory properties for color attachment usage.
- Bind image to the device memory.
which involves a lot of tedious steps.
As explained above, vku uses Vulkan Memory Allocator (VMA) for its primary memory management. VMA is a library that provides a simple and efficient way to manage Vulkan memory allocation, and it's widely used in the Vulkan industry. vku uses the Vulkan-Hpp version of VMA, VulkanMemoryAllocator-Hpp, by YaaZ.
The main allocator object, vma::Allocator
, is inside vku::Gpu
class as a public field. With this, your image creation gets much simpler:
int main() {
...
const vku::Gpu<QueueFamilies, Queues> gpu { ... };
const vku::AllocatedImage image { gpu.allocator, vk::ImageCreateInfo {
{},
vk::ImageType::e2D,
vk::Format::eR8G8B8A8Unorm,
vk::Extent3D { 512, 512, 1 },
1, // mipLevels
1, // arrayLayers
vk::SampleCountFlagBits::e1, // samples
vk::ImageTiling::eOptimal,
vk::ImageUsageFlagBits::eColorAttachment,
} };
}
vku::AllocatedImage
is an object that owns vk::Image
and vma::Allocation
objects. It follows the RAII convention, and automatically destroys them when it is destroyed. It can be constructed from:
- a vma::Allocator (used for GPU resource allocation),
- a vk::ImageCreateInfo (for image creation),
- and an optional vma::AllocationCreateInfo (allocation creation properties, explained in detail in the VMA documentation), which defaults to no flags and device-preferred memory usage (vma::AllocationCreateInfo { {}, vma::MemoryUsage::eAutoPreferDevice }).
, which is a non-owning object that have vk::Image
handle and its properties (extent, format, mip levels and array layers) that were used at the initialization.
Due to this inheritance, you gain three advantages:
- vku::Image objects can be contextually converted into vk::Image, and so can vku::AllocatedImage. Therefore, you can use them as vk::Image objects without any explicit conversion.
- You can access some frequently used image properties (listed above) after image creation.
- vku::Image can be constructed from a raw vk::Image with a designated initializer, which makes adapting your existing codebase to vku easier.
After the image creation, we have to create the image view for it.
int main() {
const vku::AllocatedImage image { ... };
const vk::raii::ImageView imageView { gpu.device, vk::ImageViewCreateInfo {
{}, // flags
image,
vk::ImageViewType::e2D,
vk::Format::eR8G8B8A8Unorm,
{}, // componentMapping
vk::ImageSubresourceRange {
vk::ImageAspectFlagBits::eColor,
0, // baseMipLevel
1, // levelCount
0, // baseArrayLayer
1, // layerCount
}, // subresourceRange
} };
}
You can see image is implicitly converted to vk::Image.
Since vku::Image
has much more information than vk::Image
(in this case, it knows its format), it has a getViewCreateInfo
method, which uses the stored format and the aspect flags inferred from that format. With this, the code becomes:
int main() {
const vku::AllocatedImage image { ... };
const vk::raii::ImageView imageView { gpu.device, image.getViewCreateInfo() };
}
The getViewCreateInfo
method has two overloads:
- getViewCreateInfo(vk::ImageViewType type = vk::ImageViewType::e2D) -> vk::ImageViewCreateInfo
  It creates the vk::ImageViewCreateInfo with the given image view type (default is vk::ImageViewType::e2D), with the full subresource range (all mip levels and array layers) and inferred aspect flags.
- getViewCreateInfo(const vk::ImageSubresourceRange &subresourceRange, vk::ImageViewType type = vk::ImageViewType::e2D) -> vk::ImageViewCreateInfo
  It creates the vk::ImageViewCreateInfo with the given image view type and subresource range. If you need a specific subresource range, use this overload.
Both overloads use identity component mapping.
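For instance, if you only wanted a view over the first mip level and array layer (purely illustrative here, since our image has a single level and layer anyway), the second overload could be used like this:
const vk::raii::ImageView firstMipView { gpu.device, image.getViewCreateInfo(
    vk::ImageSubresourceRange {
        vk::ImageAspectFlagBits::eColor,
        0, 1, // baseMipLevel, levelCount
        0, 1, // baseArrayLayer, layerCount
    }) };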
To create a graphics pipeline, we have to:
- Create a pipeline layout.
- Create shader modules from SPIR-V binaries.
- Create a graphics pipeline using pipeline layout and shader modules.
Pipeline layout creation is easy: we don't have any descriptor set layouts or push constants yet, therefore just passing a default vk::PipelineLayoutCreateInfo
is enough.
int main() {
...
const vk::raii::ImageView imageView { ... };
const vk::raii::PipelineLayout pipelineLayout { gpu.device, vk::PipelineLayoutCreateInfo{} };
}
For this step, we have to write the shaders. Here's the code:
shaders/triangle.vert
#version 460
const vec2 positions[] = {
{ -0.5, 0.5 },
{ 0.5, 0.5 },
{ 0.0, -0.5 },
};
const vec3 colors[] = {
{ 1.0, 0.0, 0.0 },
{ 0.0, 1.0, 0.0 },
{ 0.0, 0.0, 1.0 },
};
layout (location = 0) out vec3 fragColor;
void main(){
fragColor = colors[gl_VertexIndex];
gl_Position = vec4(positions[gl_VertexIndex], 0.0, 1.0);
}
Note
We used GLSL's initializer-list-based initialization, which has been supported since GLSL 4.60.
shaders/triangle.frag
#version 450
layout (location = 0) in vec3 fragColor;
layout (location = 0) out vec4 outColor;
void main(){
outColor = vec4(fragColor, 1.0);
}
After the shader writing, you can compile them into SPIR-V binary by executing the following command:
glslc triangle.vert -o triangle.vert.spv
glslc triangle.frag -o triangle.frag.spv
Tip
You don't have to specify the shader stage, since the file extensions are vert and frag, which are known to be the vertex and fragment shader stages. glslc can automatically infer the shader stage from the file extension.
Back to the C++ side, you have to load the SPIR-V binary and create vk::raii::ShaderModule
for each stage. I'll first define the macro COMPILED_SHADER_DIR
for the shader directory path to avoid relative path issues.
CMakeLists.txt
...
target_link_libraries(vku-tutorial PRIVATE vku::vku)
target_compile_definitions(vku-tutorial PRIVATE
COMPILED_SHADER_DIR="${CMAKE_CURRENT_SOURCE_DIR}/shaders"
)
main.cpp
[[nodiscard]] auto loadFileAsBinary(const std::filesystem::path &path) -> std::vector<std::uint32_t> {
std::ifstream file { path, std::ios::binary };
if (!file) {
throw std::runtime_error { "Failed to open file!" };
}
file.seekg(0, std::ios::end);
const auto fileSize = file.tellg();
file.seekg(0, std::ios::beg);
std::vector<std::uint32_t> result(fileSize / sizeof(std::uint32_t));
file.read(reinterpret_cast<char*>(result.data()), fileSize);
return result;
}
int main() {
...
const vk::raii::PipelineLayout pipelineLayout { ... };
const vk::raii::ShaderModule vertexShaderModule { gpu.device, vk::ShaderModuleCreateInfo {
{}, // flags
vku::unsafeProxy(loadFileAsBinary(COMPILED_SHADER_DIR "/triangle.vert.spv")),
} };
const vk::raii::ShaderModule fragmentShaderModule { gpu.device, vk::ShaderModuleCreateInfo {
{},
vku::unsafeProxy(loadFileAsBinary(COMPILED_SHADER_DIR "/triangle.frag.spv")),
} };
}
Finally, the most cumbersome part of the Vulkan API: creating the graphics pipeline. Here's the code using Vulkan-Hpp naively:
int main() {
const vk::raii::ShaderModule vertexShaderModule { ... };
const vk::raii::ShaderModule fragmentShaderModule { ... };
const vk::raii::Pipeline pipeline { gpu.device, nullptr, vk::StructureChain {
vk::GraphicsPipelineCreateInfo {
{}, // flags
vku::unsafeProxy({
vk::PipelineShaderStageCreateInfo {
{}, // flags
vk::ShaderStageFlagBits::eVertex,
*vertexShaderModule,
"main", // pName
},
vk::PipelineShaderStageCreateInfo {
{}, // flags
vk::ShaderStageFlagBits::eFragment,
*fragmentShaderModule,
"main", // pName
},
}), // stages
vku::unsafeAddress(vk::PipelineVertexInputStateCreateInfo{}), // pVertexInputState
vku::unsafeAddress(vk::PipelineInputAssemblyStateCreateInfo {
{}, // flags
vk::PrimitiveTopology::eTriangleList,
}), // pInputAssemblyState
{}, // pTessellationState
vku::unsafeAddress(vk::PipelineViewportStateCreateInfo {
{}, // flags
vku::unsafeProxy(vku::toViewport(vku::toExtent2D(image.extent))), // viewports
vku::unsafeProxy(vk::Rect2D { { 0, 0 }, vku::toExtent2D(image.extent) }), // scissors
}), // pViewportState
vku::unsafeAddress(vk::PipelineRasterizationStateCreateInfo {
{}, // flags
false, // depthClampEnable
false, // rasterizerDiscardEnable
vk::PolygonMode::eFill,
vk::CullModeFlagBits::eNone,
{}, // frontFace
{}, // depthBiasEnable
{}, // depthBiasConstantFactor
{}, // depthBiasClamp
{}, // depthBiasSlopeFactor
1.f, // lineWidth
}), // pRasterizationState
vku::unsafeAddress(vk::PipelineMultisampleStateCreateInfo {
{}, // flags
vk::SampleCountFlagBits::e1, // rasterizationSamples
}), // pMultisampleState
{}, // pDepthStencilState
vku::unsafeAddress(vk::PipelineColorBlendStateCreateInfo {
{}, // flags
false, // logicOpEnable
{}, // logicOp
vku::unsafeProxy(vk::PipelineColorBlendAttachmentState {
false, // blendEnable
{}, {}, {}, // srcColorBlendFactor, dstColorBlendFactor, colorBlendOp
{}, {}, {}, // srcAlphaBlendFactor, dstAlphaBlendFactor, alphaBlendOp
vk::ColorComponentFlagBits::eR | vk::ColorComponentFlagBits::eG | vk::ColorComponentFlagBits::eB | vk::ColorComponentFlagBits::eA, // colorWriteMask
}), // attachments
}),
vku::unsafeAddress(vk::PipelineDynamicStateCreateInfo{}), // pDynamicState
*pipelineLayout,
},
vk::PipelineRenderingCreateInfo {
{}, // viewMask
image.format, // colorAttachmentFormats
},
}.get() };
}
Since the extent of the rendering attachment is known (512x512) and fixed, we embedded the viewport and scissor into the pipeline creation. (For most cases, you'll use dynamic state for them!) Also, we used the Vulkan-Hpp provided vk::StructureChain
, which links the Vulkan structures into a pNext chain.
I think it is worth showing what the code looks like if we don't use `vku::unsafeProxy` and `vku::unsafeAddress`:
int main() {
const std::array stages {
vk::PipelineShaderStageCreateInfo {
{},
vk::ShaderStageFlagBits::eVertex,
*vertexShaderModule,
"main",
},
vk::PipelineShaderStageCreateInfo {
{},
vk::ShaderStageFlagBits::eFragment,
*fragmentShaderModule,
"main",
},
};
constexpr vk::PipelineVertexInputStateCreateInfo vertexInputState{};
constexpr vk::PipelineInputAssemblyStateCreateInfo inputAssemblyState {
{},
vk::PrimitiveTopology::eTriangleList,
};
const vk::Viewport viewport {
0.f,
0.f,
static_cast<float>(image.extent.width),
static_cast<float>(image.extent.height),
0.f,
1.f,
};
const vk::Rect2D scissor {
{ 0, 0 },
{ image.extent.width, image.extent.height },
};
const vk::PipelineViewportStateCreateInfo viewportState {
{},
viewport,
scissor,
};
constexpr vk::PipelineRasterizationStateCreateInfo rasterizationState {
{},
false,
false,
vk::PolygonMode::eFill,
vk::CullModeFlagBits::eNone,
{},
{},
{},
{},
{},
1.f,
};
constexpr vk::PipelineMultisampleStateCreateInfo multisampleState {
{},
vk::SampleCountFlagBits::e1,
};
constexpr vk::PipelineColorBlendAttachmentState colorBlendAttachmentState {
false,
{}, {}, {},
{}, {}, {},
vk::ColorComponentFlagBits::eR | vk::ColorComponentFlagBits::eG | vk::ColorComponentFlagBits::eB | vk::ColorComponentFlagBits::eA,
};
const vk::PipelineColorBlendStateCreateInfo colorBlendState {
{},
false,
{},
colorBlendAttachmentState,
};
constexpr vk::PipelineDynamicStateCreateInfo dynamicState{};
const vk::raii::Pipeline pipeline { gpu.device, nullptr, vk::StructureChain {
vk::GraphicsPipelineCreateInfo {
{},
stages,
&vertexInputState,
&inputAssemblyState,
nullptr,
&viewportState,
&rasterizationState,
&multisampleState,
nullptr,
&colorBlendState,
&dynamicState,
*pipelineLayout,
},
vk::PipelineRenderingCreateInfo {
{},
image.format,
},
}.get() };
}
Since the fields cannot be inlined, the logical flow no longer reads top-down, and the code is less readable.
Such hard work! In a real application, you would create hundreds or thousands of pipelines with different shader combinations; repeating this would be maddening. Let's see how vku handles this.
vku combines the shader module loading and creation with vk::GraphicsPipelineCreateInfo
generation into a single function.
int main() {
const vk::raii::PipelineLayout pipelineLayout { ... };
const vk::raii::Pipeline pipeline { gpu.device, nullptr, vk::StructureChain {
vku::getDefaultGraphicsPipelineCreateInfo(
vku::createPipelineStages(
gpu.device,
vku::Shader::fromSpirvFile(COMPILED_SHADER_DIR "/triangle.vert.spv" /* path */, vk::ShaderStageFlagBits::eVertex),
vku::Shader::fromSpirvFile(COMPILED_SHADER_DIR "/triangle.frag.spv" /* path */, vk::ShaderStageFlagBits::eFragment)).get(),
*pipelineLayout, 1 /* colorAttachmentCount */) // stages
.setPRasterizationState(vku::unsafeAddress(vk::PipelineRasterizationStateCreateInfo {
{},
false, false,
vk::PolygonMode::eFill,
vk::CullModeFlagBits::eNone, {},
{}, {}, {}, {},
1.f,
}))
.setPViewportState(vku::unsafeAddress(vk::PipelineViewportStateCreateInfo {
{},
vku::unsafeProxy(vku::toViewport(vku::toExtent2D(image.extent))),
vku::unsafeProxy(vk::Rect2D { { 0, 0 }, vku::toExtent2D(image.extent) }),
}))
.setPDynamicState(vku::unsafeAddress(vk::PipelineDynamicStateCreateInfo{})),
vk::PipelineRenderingCreateInfo {
{},
image.format,
},
}.get() };
}
What happened? Let's see the explanation:
- vku::getDefaultGraphicsPipelineCreateInfo is a function that returns a "standard" vk::GraphicsPipelineCreateInfo. The term "standard" means the most commonly used settings, which are:
  - No vertex input state.
  - Input assembly state with triangle list topology and no primitive restart.
  - Single viewport and scissor with unspecified values, i.e. set by dynamic state.
  - Rasterization state with fill mode and backface culling, line width 1.0.
  - No depth stencil state.
  - No color blending for any color attachment.
  - Dynamic state enabled for viewport and scissor.
  Its function signature is
  getDefaultGraphicsPipelineCreateInfo(
      vk::ArrayProxyNoTemporaries<const vk::PipelineShaderStageCreateInfo> stages,
      vk::PipelineLayout layout,
      std::uint32_t colorAttachmentCount = 0,
      bool hasDepthStencilAttachment = false,
      vk::SampleCountFlagBits multisample = vk::SampleCountFlagBits::e1
  ) -> vk::GraphicsPipelineCreateInfo
  Each parameter represents:
  - stages: Array of vk::PipelineShaderStageCreateInfos. Since shaders are mandatory for Vulkan pipeline creation (both graphics and compute), this parameter is required.
  - layout: Pipeline layout object, also mandatory.
  - colorAttachmentCount: The count of the color attachments. It is used for the size of the color blend attachment state array. For now, we only have one color attachment, so it is 1. It defaults to 0 (no color attachment).
  - hasDepthStencilAttachment: Boolean flag for whether the pipeline has a depth stencil attachment. For now, we don't have one, so it is false (and omitted since false is the default).
  - multisample: The sample count of the rasterization. Since we're not using MSAA for now, it is vk::SampleCountFlagBits::e1 (and omitted since it is the default).
  Most of your use cases won't deviate much from this standard pipeline setting, but some will. For such cases, you can modify the returned vk::GraphicsPipelineCreateInfo manually using Vulkan-Hpp's setter methods in builder style. In the code above, we modified the rasterization/viewport/dynamic state using the setPRasterizationState, setPViewportState, and setPDynamicState methods.
methods. -
vku::createPipelineStages
is the function that creates theRefHolder
ofvk::PipelineShaderStageCreateInfo
array from the shader path/code/raw GLSL string. It accepts the device and an arbitrary number ofvku::Shader
s that have the shader code and stage flag. The function signature iscreatePipelineStages( const vk::raii::Device &device, const Shaders &...shaders ) -> RefHolder<std::array<vk::PipelineShaderStageCreateInfo, sizeof...(Shaders)>, std::array<vk::raii::ShaderModule, sizeof...(Shaders)>>
Each parameter represents:
device
: The Vulkan device which creates the shader modules.shaders
: Thevku::Shader
objects that represent the shader codes and their infos, such like shader stage (vk::ShaderStageFlagBits
), specialization constants (nullptr
by default) and entry point ("main"
by default). You can construct the struct by existing SPIR-V binary (any contiguous range ofstd::uint32_t
s that can be convertible tostd::span<const std::uint32_t>
), or you may use a convenient functions likevku::Shader::fromSpirvFile
(load SPIR-V compiled binary from the file directly) or
And since the result is
vku::RefHolder
of thevk::PipelineShaderStageCreateInfo
array, you can finally get the lvalue reference of inner stage create infos by calling theget()
method.Thanks to the shaderc, you may just load the raw GLSL string and compile it to the SPIR-V code at runtime. Here's the example:
vcpkg.json
{ "dependencies": [ { "name": "vku", "features": [ "shaderc" ] } ] }
main.cpp
#include <shaderc/shaderc.hpp> const shaderc::Compiler compiler; vku::getGraphicsPipelineCreateInfo( vku::createPipelineStages( gpu.device, vku::Shader::fromGLSLFile(compiler, COMPILED_SHADER_DIR "/triangle.vert" /* path */, {} /* compile option */, vk::ShaderStageFlagBits::eVertex), vku::Shader::fromGLSLFile(compiler, COMPILED_SHADER_DIR "/triangle.frag" /* path */, {} /* compile option */, vk::ShaderStageFlagBits::eFragment) ).get(), *pipelineLayout, 1) ... // Other pipeline settings
- Some of vku's utility functions are used:
  - vku::toExtent2D(const vk::Extent3D &extent): drops the depth field of the extent and returns it. Useful when you're working with a 2D image (whose depth is 1).
  - vku::toViewport(const vk::Extent2D &extent, bool negativeViewport = false): creates a vk::Viewport with bounds [0, 0]x[extent.width, extent.height] and depth [0, 1]. If negativeViewport is true, the viewport's height is flipped (i.e. [extent.height, 0]).
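Roughly, for our 512x512 image the helpers behave like this (the commented values restate the descriptions above; treat this as a sketch rather than vku's literal implementation):
const vk::Extent2D extent = vku::toExtent2D(image.extent);  // { 512, 512 }: the depth field is dropped.
const vk::Viewport viewport = vku::toViewport(extent);      // Covers [0, 0]x[512, 512] with depth range [0, 1].
const vk::Viewport flipped = vku::toViewport(extent, true); // Same bounds, but the height is flipped (negative viewport).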
Warning
These are common pitfalls of getDefaultGraphicsPipelineCreateInfo usage.
- The default cull mode is backface culling. If you want to disable culling, you have to modify the returned struct manually.
- Even if you set the hasDepthStencilAttachment parameter to true, depth testing/writing is not enabled. For this, you have to modify the returned struct manually (see the sketch after this list).
- vku stores these standard graphics pipeline properties in static storage, therefore you don't have to worry about the lifetime validity of the returned vk::GraphicsPipelineCreateInfo. Due to this approach, the maximum number of color attachments is limited to 8. I don't think this is a big issue, but you can modify MAX_COLOR_ATTACHMENT_COUNT if you want to increase the limit.
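For example, if you had a depth attachment, enabling depth testing on top of the returned struct could look like the following sketch (the D32Sfloat depth format and the compare op are illustrative assumptions, not part of this tutorial's setup):
const vk::raii::Pipeline depthTestedPipeline { gpu.device, nullptr, vk::StructureChain {
    vku::getDefaultGraphicsPipelineCreateInfo(
        vku::createPipelineStages(
            gpu.device,
            vku::Shader::fromSpirvFile(COMPILED_SHADER_DIR "/triangle.vert.spv", vk::ShaderStageFlagBits::eVertex),
            vku::Shader::fromSpirvFile(COMPILED_SHADER_DIR "/triangle.frag.spv", vk::ShaderStageFlagBits::eFragment)).get(),
        *pipelineLayout, 1, true /* hasDepthStencilAttachment */)
        .setPDepthStencilState(vku::unsafeAddress(vk::PipelineDepthStencilStateCreateInfo {
            {},                   // flags
            true,                 // depthTestEnable
            true,                 // depthWriteEnable
            vk::CompareOp::eLess, // depthCompareOp
        })),
    vk::PipelineRenderingCreateInfo {
        {},
        image.format,           // colorAttachmentFormats
        vk::Format::eD32Sfloat, // depthAttachmentFormat (assumed)
    },
}.get() };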
Note
vku::Shader
neither represents the vk::ShaderModule
, nor constructs the shader module. It only holds the SPIR-V shader code and its info. The shader modules are created by the vku::createPipelineStages
function.
Note
Runtime GLSL compilation feature is not enabled by default. You have to manually set the CMake variable VKU_USE_SHADERC
to ON
, or specify the port feature shaderc
if you're using vcpkg.
If you launch your application now, it will emit the validation error:
Show errors
VUID-VkGraphicsPipelineCreateInfo-pNext-pNext(ERROR / SPEC): msgNum: 395890785 - Validation Error: [ VUID-VkGraphicsPipelineCreateInfo-pNext-pNext ] Object 0: handle = 0x121008200, type = VK_OBJECT_TYPE_INSTANCE; | MessageID = 0x1798d061 | vkCreateGraphicsPipelines(): pCreateInfos[0].pNext includes a pointer to a VkStructureType (VK_STRUCTURE_TYPE_PIPELINE_RENDERING_CREATE_INFO), but its parent extension VK_KHR_dynamic_rendering has not been enabled. The Vulkan spec states: Each pNext member of any structure (including this one) in the pNext chain must be either NULL or a pointer to a valid instance of VkAttachmentSampleCountInfoAMD, VkExternalFormatANDROID, VkGraphicsPipelineLibraryCreateInfoEXT, VkGraphicsPipelineShaderGroupsCreateInfoNV, VkMultiviewPerViewAttributesInfoNVX, VkPipelineCompilerControlCreateInfoAMD, VkPipelineCreateFlags2CreateInfoKHR, VkPipelineCreationFeedbackCreateInfo, VkPipelineDiscardRectangleStateCreateInfoEXT, VkPipelineFragmentShadingRateEnumStateCreateInfoNV, VkPipelineFragmentShadingRateStateCreateInfoKHR, VkPipelineLibraryCreateInfoKHR, VkPipelineRenderingCreateInfo, VkPipelineRepresentativeFragmentTestStateCreateInfoNV, VkPipelineRobustnessCreateInfoEXT, VkRenderingAttachmentLocationInfoKHR, or VkRenderingInputAttachmentIndexInfoKHR (https://vulkan.lunarg.com/doc/view/1.3.283.0/mac/1.3-extensions/vkspec.html#VUID-VkGraphicsPipelineCreateInfo-pNext-pNext) Objects: 1 [0] 0x121008200, type: 1, name: NULL VUID-VkGraphicsPipelineCreateInfo-dynamicRendering-06576(ERROR / SPEC): msgNum: 264898451 - Validation Error: [ VUID-VkGraphicsPipelineCreateInfo-dynamicRendering-06576 ] | MessageID = 0xfca0793 | vkCreateGraphicsPipelines(): pCreateInfos[0].renderPass is NULL, but the dynamicRendering feature was not enabled. The Vulkan spec states: If the dynamicRendering feature is not enabled and the pipeline requires pre-rasterization shader state, fragment shader state, or fragment output interface state, renderPass must not be VK_NULL_HANDLE (https://vulkan.lunarg.com/doc/view/1.3.283.0/mac/1.3-extensions/vkspec.html#VUID-VkGraphicsPipelineCreateInfo-dynamicRendering-06576) Objects: 0 Process finished with exit code 134 (interrupted by signal 6:SIGABRT)
Since the VK_KHR_dynamic_rendering
extension is only core from Vulkan 1.3, and we're using Vulkan 1.0, the extension has to be explicitly enabled at device creation. Also, you have to pass the vk::PhysicalDeviceDynamicRenderingFeatures
struct into vk::DeviceCreateInfo
's pNext
chain.
Setting device extensions has already been explained above. How can we specify vk::DeviceCreateInfo's pNext? vku provides this mechanism in a compile-time safe way: you can specify the vk::DeviceCreateInfo's pNext chain by using the vku::Gpu::Config's devicePNexts field. Here's the code:
const vku::Gpu<QueueFamilies, Queues> gpu { instance, vku::Gpu<QueueFamilies, Queues>::Config {
.verbose = true,
.deviceExtensions = {
vk::KHRMaintenance2ExtensionName, // for VK_KHR_create_renderpass2
vk::KHRCreateRenderpass2ExtensionName, // for VK_KHR_depth_stencil_resolve
vk::KHRDepthStencilResolveExtensionName, // for VK_KHR_dynamic_rendering
vk::KHRMultiviewExtensionName, // for VK_KHR_create_renderpass2
vk::KHRDynamicRenderingExtensionName,
#if __APPLE__
vk::KHRPortabilitySubsetExtensionName,
#endif
},
.devicePNexts = std::tuple {
vk::PhysicalDeviceDynamicRenderingFeatures { true },
},
} };
You can pass the tuple of Vulkan structures that are used for the pNext
chain of vk::DeviceCreateInfo
. vku will enclose the concatenated vk::DeviceCreateInfo
and this tuple into vk::StructureChain
, so you can take advantage of compile-time pNext chain validation.
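To illustrate what compile-time pNext chain validation buys you (this is a general Vulkan-Hpp property of vk::StructureChain, not something vku-specific):
// Compiles: vk::PhysicalDeviceDynamicRenderingFeatures may extend vk::DeviceCreateInfo.
vk::StructureChain<vk::DeviceCreateInfo, vk::PhysicalDeviceDynamicRenderingFeatures> deviceCreateChain;
// A structure that cannot legally appear in the device create info's pNext chain
// (e.g. vk::PipelineRenderingCreateInfo) would be rejected at compile time instead of failing at runtime.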
Note
vku::Gpu<QueueFamilies, Queues>::Config
is a templated class whose template parameters are the tuple alternative types. Therefore, depending on your compiler's C++23 implementation status, this type deduction may not work (especially with MSVC). In that case, you have to manually specify the tuple alternative types, like: vku::Gpu<QueueFamilies, Queues>::Config<vk::PhysicalDeviceDynamicRenderingFeatures>
.
After enabling the dynamic rendering feature, running your application will not emit the validation error anymore.
For this step, we have to create the command buffer, record the rendering commands, and submit the command buffer into the graphics queue. Here's the Vulkan-Hpp code:
int main() {
const vk::raii::Pipeline pipeline { ... };
const vk::raii::CommandPool graphicsCommandPool { gpu.device, vk::CommandPoolCreateInfo {
{}, // flags
gpu.queueFamilies.graphics,
} };
const vk::CommandBuffer commandBuffer = (*gpu.device).allocateCommandBuffers(vk::CommandBufferAllocateInfo {
*graphicsCommandPool,
vk::CommandBufferLevel::ePrimary,
1, // commandBufferCount
})[0];
commandBuffer.begin({ vk::CommandBufferUsageFlagBits::eOneTimeSubmit /* flags */ });
// Change the layout of image to ColorAttachmentOptimal.
commandBuffer.pipelineBarrier(
vk::PipelineStageFlagBits::eTopOfPipe, // srcStageMask
vk::PipelineStageFlagBits::eColorAttachmentOutput, // dstStageMask
{}, {}, {}, // dependencyFlags, memoryBarriers, bufferMemoryBarriers
vk::ImageMemoryBarrier {
{}, // srcAccessMask
vk::AccessFlagBits::eColorAttachmentWrite, // dstAccessMask
{}, // oldLayout
vk::ImageLayout::eColorAttachmentOptimal, // newLayout
vk::QueueFamilyIgnored, // srcQueueFamilyIndex
vk::QueueFamilyIgnored, // dstQueueFamilyIndex
image, vku::fullSubresourceRange(),
});
// Dynamic rendering.
commandBuffer.beginRenderingKHR({
{}, // flags
{ { 0, 0 }, vku::toExtent2D(image.extent) }, // renderArea
1, // layerCount
0, // viewMask
vku::unsafeProxy(vk::RenderingAttachmentInfo {
*imageView, vk::ImageLayout::eColorAttachmentOptimal,
{}, {}, {}, // resolveMode, resolveImageView, resolveImageLayout
vk::AttachmentLoadOp::eClear, vk::AttachmentStoreOp::eStore, vk::ClearColorValue { 0.f, 0.f, 0.f, 0.f },
}), // colorAttachments
}, *gpu.device.getDispatcher());
// Draw triangle.
commandBuffer.bindPipeline(vk::PipelineBindPoint::eGraphics, *pipeline);
commandBuffer.draw(3 /* vertexCount */, 1 /* instanceCount */, 0 /* firstVertex */, 0 /* firstInstance */);
commandBuffer.endRenderingKHR(*gpu.device.getDispatcher());
commandBuffer.end();
gpu.queues.graphics.submit(vk::SubmitInfo {
{}, // waitSemaphores
{}, // waitDstStageMask
commandBuffer, // commandBuffers
});
gpu.queues.graphics.waitIdle();
}
If you're not familiar with how to use the VK_KHR_dynamic_rendering
extension, see the Vulkan documentation for a more detailed explanation.
We allocated the command buffer from the graphics command pool, changed the image layout for color attachment usage, and rendered the triangle within the dynamic rendering scope. As mentioned above, since the dynamic rendering feature is not core until Vulkan 1.3, you have to pass the dispatcher object to the beginRenderingKHR
and endRenderingKHR
functions (which can be omitted if you're using Vulkan 1.3).
Tip
You can set the VKU_DEFAULT_DYNAMIC_DISPATCHER
CMake variable at configuration time, or add the dynamic-dispatcher
feature when using vcpkg, to make Vulkan-Hpp use dynamically loaded function pointers. For this feature, see the Using vku page and the Vulkan-Hpp documentation for the details. With the dynamic dispatch loader, you can omit the last parameter *device.getDispatcher()
.
vku::fullSubresourceRange(vk::ImageAspectFlags aspectFlags = vk::ImageAspectFlagBits::eColor)
is vku's utility function that generates the whole subresource range (all mip levels and array layers) with the given aspect flags.
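Based on that description, the produced range is presumably equivalent to the following (an assumption for illustration, not vku's literal implementation):
vk::ImageSubresourceRange {
    vk::ImageAspectFlagBits::eColor, // aspectMask (the default argument)
    0, vk::RemainingMipLevels,       // baseMipLevel, levelCount
    0, vk::RemainingArrayLayers,     // baseArrayLayer, layerCount
}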
Not a big deal, but vku provides a simpler mechanism for one-time command buffer allocation and submission: vku::executeSingleCommand
. Here's the vku approach:
int main() {
const vk::raii::Pipeline pipeline { ... };
const vk::raii::CommandPool graphicsCommandPool { gpu.device, vk::CommandPoolCreateInfo { {}, gpu.queueFamilies.graphics } };
vku::executeSingleCommand(*gpu.device, *graphicsCommandPool, gpu.queues.graphics, [&](vk::CommandBuffer cb) {
cb.pipelineBarrier(
vk::PipelineStageFlagBits::eTopOfPipe, vk::PipelineStageFlagBits::eColorAttachmentOutput,
{}, {}, {},
vk::ImageMemoryBarrier {
{}, vk::AccessFlagBits::eColorAttachmentWrite,
{}, vk::ImageLayout::eColorAttachmentOptimal,
vk::QueueFamilyIgnored, vk::QueueFamilyIgnored,
image, vku::fullSubresourceRange(),
});
cb.beginRenderingKHR({
{},
{ { 0, 0 }, vku::toExtent2D(image.extent) },
1,
0,
vku::unsafeProxy(vk::RenderingAttachmentInfo {
*imageView, vk::ImageLayout::eColorAttachmentOptimal,
{}, {}, {},
vk::AttachmentLoadOp::eClear, vk::AttachmentStoreOp::eStore, vk::ClearColorValue { 0.f, 0.f, 0.f, 0.f },
}),
}, *gpu.device.getDispatcher());
cb.bindPipeline(vk::PipelineBindPoint::eGraphics, *pipeline);
cb.draw(3, 1, 0, 0);
cb.endRenderingKHR(*gpu.device.getDispatcher());
});
gpu.queues.graphics.waitIdle();
}
Command buffer allocation and submission code is gone. vku::executeSingleCommand
automatically allocates a command buffer from the given command pool, begins it before and ends it after executing the given recording function, and submits it to the given queue.
As an optimization, since it knows that only a single command buffer has to be allocated (the allocation count is known at compile time), it uses vku::allocateCommandBuffers<N>
to avoid a heap allocation.
You can optionally pass a vk::Fence
handle as the last parameter. The fence is signaled when the submitted command buffer finishes execution (see the sketch below). Otherwise, vku does not handle any synchronization, therefore we have to wait for the graphics queue to become idle manually.
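Here's a sketch of the fence variant (assuming the fence is simply passed as the trailing parameter, as described above):
const vk::raii::Fence fence { gpu.device, vk::FenceCreateInfo{} };
vku::executeSingleCommand(*gpu.device, *graphicsCommandPool, gpu.queues.graphics, [&](vk::CommandBuffer cb) {
    // ... record commands as before ...
}, *fence);
// Wait on the fence instead of stalling the whole queue.
std::ignore = gpu.device.waitForFences(*fence, true, std::numeric_limits<std::uint64_t>::max());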
We reached our goal! The application will render the colored triangle into image
. However, you can't see the result because there's no mechanism to visualize the image or persist it to a file yet. We'll implement that in the next section, but as an intermediate check, you can use Vulkan debugging tools like RenderDoc, Nsight Graphics or the Xcode Metal Debugger (if you're running with MoltenVK).
Note
This section is MoltenVK specific.
To use the Metal Debugger, I'll run the application with the following environment variables, explained in the MoltenVK Runtime User Guide.
METAL_CAPTURE_ENABLED=1
MVK_CONFIG_AUTO_GPU_CAPTURE_SCOPE=1
MVK_CONFIG_AUTO_GPU_CAPTURE_OUTPUT_FILE=output.gputrace
After the application runs, an output.gputrace
file will be created in the executable folder. You can see the result image by opening it.
If you can't use a GPU debugging tool, you can de-stage the image data into a host-accessible buffer and save it to a file. With Vulkan-Hpp, the steps would be:
- Create a de-staging buffer with the proper allocation size.
- Append the copy command to the previous rendering commands (and execute them).
- Map the de-staging buffer memory into a CPU address and read it.
- Save the fetched data to a file (using a function like stbi_write_png).
Like vku::AllocatedImage
, vku also offers RAII-wrapped buffer objects. But in this case, you have two options: vku::AllocatedBuffer
and vku::MappedBuffer
. Here's the class hierarchy:
vku::Buffer
and vku::AllocatedBuffer
relate to each other in the same way as vku::Image
and vku::AllocatedImage
. vku::MappedBuffer
is a kind of vku::AllocatedBuffer
(it owns the allocation), but one thing is different: it is persistently mapped during its lifetime (mapped at construction, unmapped at destruction). Since the buffer data can be viewed from the host side, it exposes methods for viewing its data as a reference of an arbitrary type (asValue<T>
) or as a contiguous range (asRange<T>
). It is mostly used for host-device data transfer.
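For illustration only (the tutorial doesn't show the exact return types of asValue/asRange, so treat the range type below as an assumption): once the de-staging below has finished, the mapped data could be inspected like
const auto texels = destagingBuffer.asRange<const std::uint8_t>(); // contiguous view over the mapped bytes
const std::uint8_t firstRed = texels[0];                           // R channel of the first R8G8B8A8 texel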
Since we're going to use the buffer for the de-staging purpose, we'll use the vku::MappedBuffer
for this step. Here's the code:
int main() {
const vku::Gpu<QueueFamilies, Queues> gpu { ... };
const vku::AllocatedImage image { gpu.allocator, vk::ImageCreateInfo {
{},
vk::ImageType::e2D,
vk::Format::eR8G8B8A8Unorm,
vk::Extent3D { 512, 512, 1 },
1, 1,
vk::SampleCountFlagBits::e1,
vk::ImageTiling::eOptimal,
vk::ImageUsageFlagBits::eColorAttachment | vk::ImageUsageFlagBits::eTransferSrc, // <- TransferSrc added
} };
const vk::raii::ImageView imageView { ... };
...
const vk::raii::Pipeline pipeline { ... };
const vku::MappedBuffer destagingBuffer { gpu.allocator, vk::BufferCreateInfo {
{},
blockSize(image.format) * image.extent.width * image.extent.height,
vk::BufferUsageFlagBits::eTransferDst,
}, vku::allocation::hostRead }; // <- Buffer declaration.
const vk::raii::CommandPool graphicsCommandPool { gpu.device, vk::CommandPoolCreateInfo { {}, gpu.queueFamilies.graphics } };
vku::executeSingleCommand(*gpu.device, *graphicsCommandPool, gpu.queues.graphics, [&](vk::CommandBuffer cb) {
...
cb.endRenderingKHR(*gpu.device.getDispatcher());
// New code starts from here:
// Change the layout of image for transfer source.
cb.pipelineBarrier(
vk::PipelineStageFlagBits::eColorAttachmentOutput, vk::PipelineStageFlagBits::eTransfer,
{}, {}, {},
vk::ImageMemoryBarrier {
vk::AccessFlagBits::eColorAttachmentWrite, vk::AccessFlagBits::eTransferRead,
vk::ImageLayout::eColorAttachmentOptimal, vk::ImageLayout::eTransferSrcOptimal,
vk::QueueFamilyIgnored, vk::QueueFamilyIgnored,
image, vku::fullSubresourceRange(),
});
cb.copyImageToBuffer(
image, vk::ImageLayout::eTransferSrcOptimal,
destagingBuffer,
vk::BufferImageCopy {
0, 0, 0,
{ vk::ImageAspectFlagBits::eColor, 0, 0, 1 },
vk::Offset3D{}, image.extent,
});
});
gpu.queues.graphics.waitIdle();
}
We created destagingBuffer
with the required image data size (= (size of a texel) * (image width) * (image height)), and passed allocationCreateInfo
as the vku::allocation::hostRead
constant, which is predefined as
vma::AllocationCreateInfo {
vma::AllocationCreateFlagBits::eHostAccessRandom | vma::AllocationCreateFlagBits::eMapped,
vma::MemoryUsage::eAuto,
}
It indicates that the buffer will be read by the CPU later.
Warning
vku::MappedBuffer::MappedBuffer(vma::Allocator, const vk::BufferCreateInfo&, const vma::AllocationCreateInfo&)
constructor's last allocation create info parameter defaults to
vma::AllocationCreateInfo {
vma::AllocationCreateFlagBits::eHostAccessSequentialWrite | vma::AllocationCreateFlagBits::eMapped,
vma::MemoryUsage::eAuto,
}
This means omitting the last parameter when you intend to read from the host may cause significant performance degradation (especially on mobile hardware).
Caution
DO NOT pass non-mappable allocation create flags. If you're passing a non-default allocationCreateInfo
, its flags
must always contain vma::AllocationCreateFlagBits::eMapped
and either vma::AllocationCreateFlagBits::eHostAccessSequentialWrite
or vma::AllocationCreateFlagBits::eHostAccessRandom
. Otherwise, the buffer cannot be mapped, which contradicts its name.
After the buffer creation, we added command buffer recording code that changes the image layout for transfer source usage and copies the image data into the buffer. Don't forget to add vk::ImageUsageFlagBits::eTransferSrc
to your image usage flags.
After the queue becomes idle, the de-staged data is visible on the host side. Now we can persist the linearly laid-out image data to an image file, using a third-party library like stb.
vcpkg.json
{
"dependencies": [
"vku",
"stb"
]
}
CMakeLists.txt
...
# ----------------
# External dependencies.
# ----------------
find_package(Stb REQUIRED)
find_package(vku CONFIG REQUIRED)
# ----------------
# Project executables.
# ----------------
add_executable(vku-tutorial main.cpp impl.cpp) # impl.cpp file added to prevent the duplicated stb_image compilation.
target_include_directories(vku-tutorial PRIVATE ${Stb_INCLUDE_DIR})
target_link_libraries(vku-tutorial PRIVATE vku::vku)
target_compile_definitions(vku-tutorial PRIVATE
COMPILED_SHADER_DIR="${CMAKE_CURRENT_SOURCE_DIR}/shaders"
)
main.cpp
#include <stb_image_write.h>
...
int main() {
...
gpu.queues.graphics.waitIdle();
stbi_write_png("output.png", image.extent.width, image.extent.height, 4, destagingBuffer.data, blockSize(image.format) * image.extent.width);
}
impl.cpp
#define STB_IMAGE_WRITE_IMPLEMENTATION
#include <stb_image_write.h>
After the application launch, you can find the output.png
file in the executable folder.
In the next tutorial, we'll render this triangle into an actual window (using GLFW), and see how to effectively manage multiple attachment images with vku. Stay tuned!
Here's the full code of the tutorial.
CMakeLists.txt
cmake_minimum_required(VERSION 3.30)
set(CMAKE_EXPERIMENTAL_CXX_IMPORT_STD "0e5b6991-d74f-4b3d-a41c-cf096e0b2508")
project(vku-tutorial LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 23)
set(CMAKE_CXX_MODULE_STD 1)
# ----------------
# External dependencies.
# ----------------
find_package(Stb REQUIRED)
find_package(vku CONFIG REQUIRED)
# ----------------
# Project executables.
# ----------------
add_executable(vku-tutorial main.cpp impl.cpp)
target_include_directories(vku-tutorial PRIVATE ${Stb_INCLUDE_DIR})
target_link_libraries(vku-tutorial PRIVATE vku::vku)
target_compile_definitions(vku-tutorial PRIVATE
COMPILED_SHADER_DIR="${CMAKE_CURRENT_SOURCE_DIR}/shaders"
)
vcpkg.json
{
"dependencies": [
"vku",
"stb"
]
}
shaders/triangle.vert
#version 460
const vec2 positions[] = {
{ -0.5, 0.5 },
{ 0.5, 0.5 },
{ 0.0, -0.5 },
};
const vec3 colors[] = {
{ 1.0, 0.0, 0.0 },
{ 0.0, 1.0, 0.0 },
{ 0.0, 0.0, 1.0 },
};
layout (location = 0) out vec3 fragColor;
void main(){
fragColor = colors[gl_VertexIndex];
gl_Position = vec4(positions[gl_VertexIndex], 0.0, 1.0);
}
shaders/triangle.frag
#version 450
layout (location = 0) in vec3 fragColor;
layout (location = 0) out vec4 outColor;
void main(){
outColor = vec4(fragColor, 1.0);
}
main.cpp
#include <stb_image_write.h>
import std;
import vku;
struct QueueFamilies {
std::uint32_t graphics;
explicit QueueFamilies(vk::PhysicalDevice physicalDevice)
: graphics { vku::getGraphicsQueueFamily(physicalDevice.getQueueFamilyProperties()).value() } { }
};
struct Queues {
vk::Queue graphics;
Queues(vk::Device device, const QueueFamilies &queueFamilies)
: graphics { device.getQueue(queueFamilies.graphics, 0) } { }
[[nodiscard]] static auto getCreateInfos(vk::PhysicalDevice, const QueueFamilies &queueFamilies) noexcept {
return vku::RefHolder {
[=](const float &priority) {
return vk::DeviceQueueCreateInfo {
{},
queueFamilies.graphics,
vk::ArrayProxyNoTemporaries(priority),
};
},
1.f,
};
}
};
int main() {
const vk::raii::Context context;
const vk::raii::Instance instance { context, vk::InstanceCreateInfo {
#if __APPLE__
vk::InstanceCreateFlagBits::eEnumeratePortabilityKHR,
#else
{},
#endif
vku::unsafeAddress(vk::ApplicationInfo {
"Hello Triangle", 0,
nullptr, 0,
vk::makeApiVersion(0, 1, 0, 0),
}),
vku::unsafeProxy<const char*>("VK_LAYER_KHRONOS_validation"),
#if __APPLE__
vku::unsafeProxy({
vk::KHRGetPhysicalDeviceProperties2ExtensionName,
vk::KHRPortabilityEnumerationExtensionName,
}),
#endif
} };
const vku::Gpu<QueueFamilies, Queues> gpu { instance, vku::Gpu<QueueFamilies, Queues>::Config {
.verbose = true,
.deviceExtensions = {
vk::KHRMaintenance2ExtensionName,
vk::KHRCreateRenderpass2ExtensionName,
vk::KHRDepthStencilResolveExtensionName,
vk::KHRMultiviewExtensionName,
vk::KHRDynamicRenderingExtensionName,
#if __APPLE__
vk::KHRPortabilitySubsetExtensionName,
#endif
},
.devicePNexts = std::tuple {
vk::PhysicalDeviceDynamicRenderingFeatures { true },
},
} };
const vku::AllocatedImage image { gpu.allocator, vk::ImageCreateInfo {
{},
vk::ImageType::e2D,
vk::Format::eR8G8B8A8Unorm,
vk::Extent3D { 512, 512, 1 },
1, 1,
vk::SampleCountFlagBits::e1,
vk::ImageTiling::eOptimal,
vk::ImageUsageFlagBits::eColorAttachment | vk::ImageUsageFlagBits::eTransferSrc,
} };
const vk::raii::ImageView imageView { gpu.device, image.getViewCreateInfo() };
const vk::raii::PipelineLayout pipelineLayout { gpu.device, vk::PipelineLayoutCreateInfo{} };
const vk::raii::Pipeline pipeline { gpu.device, nullptr, vk::StructureChain {
vku::getDefaultGraphicsPipelineCreateInfo(
vku::createPipelineStages(
gpu.device,
vku::Shader::fromSpirvFile(COMPILED_SHADER_DIR "/triangle.vert.spv", vk::ShaderStageFlagBits::eVertex),
vku::Shader::fromSpirvFile(COMPILED_SHADER_DIR "/triangle.frag.spv", vk::ShaderStageFlagBits::eFragment)).get(),
*pipelineLayout, 1)
.setPRasterizationState(vku::unsafeAddress(vk::PipelineRasterizationStateCreateInfo {
{},
false, false,
vk::PolygonMode::eFill,
vk::CullModeFlagBits::eNone, {},
{}, {}, {}, {},
1.f,
}))
.setPViewportState(vku::unsafeAddress(vk::PipelineViewportStateCreateInfo {
{},
vku::unsafeProxy(vku::toViewport(vku::toExtent2D(image.extent))),
vku::unsafeProxy(vk::Rect2D { { 0, 0 }, vku::toExtent2D(image.extent) }),
}))
.setPDynamicState(vku::unsafeAddress(vk::PipelineDynamicStateCreateInfo{})),
vk::PipelineRenderingCreateInfo {
{},
image.format,
},
}.get() };
const vku::MappedBuffer destagingBuffer { gpu.allocator, vk::BufferCreateInfo {
{},
blockSize(image.format) * image.extent.width * image.extent.height,
vk::BufferUsageFlagBits::eTransferDst,
}, vku::allocation::hostRead };
const vk::raii::CommandPool graphicsCommandPool { gpu.device, vk::CommandPoolCreateInfo { {}, gpu.queueFamilies.graphics } };
vku::executeSingleCommand(*gpu.device, *graphicsCommandPool, gpu.queues.graphics, [&](vk::CommandBuffer cb) {
cb.pipelineBarrier(
vk::PipelineStageFlagBits::eTopOfPipe, vk::PipelineStageFlagBits::eColorAttachmentOutput,
{}, {}, {},
vk::ImageMemoryBarrier {
{}, vk::AccessFlagBits::eColorAttachmentWrite,
{}, vk::ImageLayout::eColorAttachmentOptimal,
vk::QueueFamilyIgnored, vk::QueueFamilyIgnored,
image, vku::fullSubresourceRange(),
});
cb.beginRenderingKHR({
{},
{ { 0, 0 }, vku::toExtent2D(image.extent) },
1,
0,
vku::unsafeProxy(vk::RenderingAttachmentInfo {
*imageView, vk::ImageLayout::eColorAttachmentOptimal,
{}, {}, {},
vk::AttachmentLoadOp::eClear, vk::AttachmentStoreOp::eStore, vk::ClearColorValue { 0.f, 0.f, 0.f, 0.f },
}),
}, *gpu.device.getDispatcher());
cb.bindPipeline(vk::PipelineBindPoint::eGraphics, *pipeline);
cb.draw(3, 1, 0, 0);
cb.endRenderingKHR(*gpu.device.getDispatcher());
cb.pipelineBarrier(
vk::PipelineStageFlagBits::eColorAttachmentOutput, vk::PipelineStageFlagBits::eTransfer,
{}, {}, {},
vk::ImageMemoryBarrier {
vk::AccessFlagBits::eColorAttachmentWrite, vk::AccessFlagBits::eTransferRead,
vk::ImageLayout::eColorAttachmentOptimal, vk::ImageLayout::eTransferSrcOptimal,
vk::QueueFamilyIgnored, vk::QueueFamilyIgnored,
image, vku::fullSubresourceRange(),
});
cb.copyImageToBuffer(
image, vk::ImageLayout::eTransferSrcOptimal,
destagingBuffer,
vk::BufferImageCopy {
0, 0, 0,
{ vk::ImageAspectFlagBits::eColor, 0, 0, 1 },
vk::Offset3D{}, image.extent,
});
});
gpu.queues.graphics.waitIdle();
stbi_write_png("output.png", image.extent.width, image.extent.height, 4, destagingBuffer.data, blockSize(image.format) * image.extent.width);
}
impl.cpp
#define STB_IMAGE_WRITE_IMPLEMENTATION
#include <stb_image_write.h>